diff --git a/libtommath-tex.patch b/libtommath-tex.patch deleted file mode 100644 index 712a971..0000000 --- a/libtommath-tex.patch +++ /dev/null @@ -1,1791 +0,0 @@ -diff -Naur libtommath-1.0.old/bn.tex libtommath-1.0/bn.tex ---- libtommath-1.0.old/bn.tex 2016-02-23 11:24:13.632735129 +0100 -+++ libtommath-1.0/bn.tex 2016-02-23 11:43:13.580826898 +0100 -@@ -258,7 +258,7 @@ - So you may be thinking ``should I use LibTomMath?'' and the answer is a definite maybe. Let me tabulate what I think - are the pros and cons of LibTomMath by comparing it to the math routines from GnuPG\footnote{GnuPG v1.2.3 versus LibTomMath v0.28}. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|l|c|c|l|} -@@ -300,7 +300,7 @@ - There are three possible return codes a function may return. - - \index{MP\_OKAY}\index{MP\_YES}\index{MP\_NO}\index{MP\_VAL}\index{MP\_MEM} --\begin{figure}[here!] -+\begin{figure}[h!] - \begin{center} - \begin{small} - \begin{tabular}{|l|l|} -@@ -823,7 +823,7 @@ - for any comparison. - - \index{MP\_GT} \index{MP\_EQ} \index{MP\_LT} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|c|c|} - \hline \textbf{Result Code} & \textbf{Meaning} \\ -@@ -1290,7 +1290,7 @@ - \end{alltt} - Where ``XXX'' is one of the following entries from the table \ref{fig:tuning}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|l|l|} -@@ -1739,7 +1739,7 @@ - (see fig. \ref{fig:primeopts}) which can be OR'ed together. The callback parameters are used as in - mp\_prime\_random(). - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|r|l|} -diff -Naur libtommath-1.0.old/tommath.src libtommath-1.0/tommath.src ---- libtommath-1.0.old/tommath.src 2016-02-03 19:07:27.000000000 +0100 -+++ libtommath-1.0/tommath.src 2016-02-23 11:46:48.428479620 +0100 -@@ -175,7 +175,7 @@ - typical RSA modulus would be at least greater than $10^{309}$. However, modern programming languages such as ISO C \cite{ISOC} and - Java \cite{JAVA} only provide instrinsic support for integers which are relatively small and single precision. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{center} - \begin{tabular}{|r|c|} - \hline \textbf{Data Type} & \textbf{Range} \\ -@@ -366,7 +366,7 @@ - exercises ranges from one (the easiest) to five (the hardest). The following table sumarizes the - scoring system used. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|c|l|} -@@ -573,7 +573,7 @@ - used within LibTomMath. - - \index{mp\_int} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - %\begin{verbatim} -@@ -670,7 +670,7 @@ - \textbf{int} data type with one of the following values (fig \ref{fig:errcodes}). - - \index{MP\_OKAY} \index{MP\_VAL} \index{MP\_MEM} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|l|l|} - \hline \textbf{Value} & \textbf{Meaning} \\ -@@ -719,7 +719,7 @@ - structure are set to valid values. The mp\_init algorithm will perform such an action. - - \index{mp\_init} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_init}. \\ -@@ -793,7 +793,7 @@ - When an mp\_int is no longer required by the application, the memory that has been allocated for its digits must be - returned to the application's memory pool with the mp\_clear algorithm. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_clear}. \\ -@@ -857,7 +857,7 @@ - is large enough to simply increase the \textbf{used} digit count. However, when the size of the array is too small it - must be re-sized appropriately to accomodate the result. The mp\_grow algorithm will provide this functionality. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_grow}. \\ -@@ -911,7 +911,7 @@ - of input mp\_ints to a given algorithm. The purpose of algorithm mp\_init\_size is similar to mp\_init except that it - will allocate \textit{at least} a specified number of digits. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -963,7 +963,7 @@ - The purpose of algorithm mp\_init\_multi is to initialize a variable length array of mp\_int structures in a single - statement. It is essentially a shortcut to multiple initializations. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_init\_multi}. \\ -@@ -1022,7 +1022,7 @@ - positive number which means that if the \textbf{used} count is decremented to zero, the sign must be set to - \textbf{MP\_ZPOS}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_clamp}. \\ -@@ -1090,7 +1090,7 @@ - a copy for the purposes of this text. The copy of the mp\_int will be a separate entity that represents the same - value as the mp\_int it was copied from. The mp\_copy algorithm provides this functionality. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_copy}. \\ -@@ -1205,7 +1205,7 @@ - useful within functions that need to modify an argument but do not wish to actually modify the original copy. The - mp\_init\_copy algorithm has been designed to help perform this task. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_init\_copy}. \\ -@@ -1235,7 +1235,7 @@ - Reseting an mp\_int to the default state is a common step in many algorithms. The mp\_zero algorithm will be the algorithm used to - perform this task. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_zero}. \\ -@@ -1265,7 +1265,7 @@ - With the mp\_int representation of an integer, calculating the absolute value is trivial. The mp\_abs algorithm will compute - the absolute value of an mp\_int. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_abs}. \\ -@@ -1297,7 +1297,7 @@ - With the mp\_int representation of an integer, calculating the negation is also trivial. The mp\_neg algorithm will compute - the negative of an mp\_int input. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_neg}. \\ -@@ -1334,7 +1334,7 @@ - \subsection{Setting Small Constants} - Often a mp\_int must be set to a relatively small value such as $1$ or $2$. For these cases the mp\_set algorithm is useful. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_set}. \\ -@@ -1375,7 +1375,7 @@ - To overcome the limitations of the mp\_set algorithm the mp\_set\_int algorithm is ideal. It accepts a ``long'' - data type as input and will always treat it as a 32-bit integer. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_set\_int}. \\ -@@ -1425,7 +1425,7 @@ - - To facilitate working with the results of the comparison functions three constants are required. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|r|l|} - \hline \textbf{Constant} & \textbf{Meaning} \\ -@@ -1438,7 +1438,7 @@ - \caption{Comparison Return Codes} - \end{figure} - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_cmp\_mag}. \\ -@@ -1479,7 +1479,7 @@ - Comparing with sign considerations is also fairly critical in several routines (\textit{division for example}). Based on an unsigned magnitude - comparison a trivial signed comparison algorithm can be written. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_cmp}. \\ -@@ -1560,7 +1560,7 @@ - Historically that convention stems from the MPI library where ``s\_'' stood for static functions that were hidden from the developer entirely. - - \newpage --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{center} - \begin{small} - \begin{tabular}{l} -@@ -1663,7 +1663,7 @@ - For example, the default for LibTomMath is to use a ``unsigned long'' for the mp\_digit ``type'' while $\beta = 2^{28}$. In ISO C an ``unsigned long'' - data type must be able to represent $0 \le x < 2^{32}$ meaning that in this case $\gamma \ge 32$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{center} - \begin{small} - \begin{tabular}{l} -@@ -1754,7 +1754,7 @@ - Recall from section 5.2 that an mp\_int represents an integer with an unsigned mantissa (\textit{the array of digits}) and a \textbf{sign} - flag. A high level addition is actually performed as a series of eight separate cases which can be optimized down to three unique cases. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_add}. \\ -@@ -1783,7 +1783,7 @@ - either \cite{TAOCPV2} or \cite{HAC} since they both only provide unsigned operations. The algorithm is fairly - straightforward but restricted since subtraction can only produce positive results. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|c|c|c|c|} -@@ -1833,7 +1833,7 @@ - \subsection{High Level Subtraction} - The high level signed subtraction algorithm is essentially the same as the high level signed addition algorithm. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_sub}. \\ -@@ -1865,7 +1865,7 @@ - \cite{HAC}. Also this algorithm is restricted by algorithm s\_mp\_sub. Chart \ref{fig:SubChart} lists the eight possible inputs and - the operations required. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{|c|c|c|c|c|} -@@ -1911,7 +1911,7 @@ - In a binary system where the radix is a power of two multiplication by two not only arises often in other algorithms it is a fairly efficient - operation to perform. A single precision logical shift left is sufficient to multiply a single digit by two. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -1967,7 +1967,7 @@ - \subsection{Division by Two} - A division by two can just as easily be accomplished with a logical shift right as multiplication by two can be with a logical shift left. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2024,7 +2024,7 @@ - degree. In this case $f(x) \cdot x = a_n x^{n+1} + a_{n-1} x^n + ... + a_0 x$. From a scalar basis point of view multiplying by $x$ is equivalent to - multiplying by the integer $\beta$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2080,7 +2080,7 @@ - - Division by powers of $x$ is easily achieved by shifting the digits right and removing any that will end up to the right of the zero'th digit. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2136,7 +2136,7 @@ - - \subsection{Multiplication by Power of Two} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2199,7 +2199,7 @@ - - \subsection{Division by Power of Two} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2252,7 +2252,7 @@ - The last algorithm in the series of polynomial basis power of two algorithms is calculating the remainder of division by $2^b$. This - algorithm benefits from the fact that in twos complement arithmetic $a \mbox{ (mod }2^b\mbox{)}$ is the same as $a$ AND $2^b - 1$. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2363,7 +2363,7 @@ - include $\alpha$ which shall represent the number of bits in the type \textbf{mp\_word}. This implies that $2^{\alpha} > 2 \cdot \beta^2$. The - constant $\delta = 2^{\alpha - 2lg(\beta)}$ will represent the maximal weight of any column in a product (\textit{see ~COMBA~ for more information}). - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2422,7 +2422,7 @@ - For example, consider multiplying $576$ by $241$. That is equivalent to computing $10^0(1)(576) + 10^1(4)(576) + 10^2(2)(576)$ which is best - visualized in the following table. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|c|c|c|c|c|c|l|} - \hline && & 5 & 7 & 6 & \\ -@@ -2500,7 +2500,7 @@ - Where $\vec x_n$ is the $n'th$ column of the output vector. Consider the following example which computes the vector $\vec x$ for the multiplication - of $576$ and $241$. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|c|c|c|c|c|} -@@ -2521,7 +2521,7 @@ - Now the columns must be fixed by propagating the carry upwards. The resultant vector will have one extra dimension over the input vector which is - congruent to adding a leading zero digit. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2577,7 +2577,7 @@ - the smaller input may not have more than $256$ digits if the Comba method is to be used. This is quite satisfactory for most applications since - $256$ digits would allow for numbers in the range of $0 \le x < 2^{7168}$ which, is much larger than most public key cryptographic algorithms require. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2783,7 +2783,7 @@ - of this system of equations has made Karatsuba fairly popular. In fact the cutoff point is often fairly low\footnote{With LibTomMath 0.18 it is 70 and 109 digits for the Intel P4 and AMD Athlon respectively.} - making it an ideal algorithm to speed up certain public key cryptosystems such as RSA and Diffie-Hellman. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2888,7 +2888,7 @@ - the algorithm can be faster than a baseline multiplication. However, the greater complexity of this algorithm places the cutoff point - (\textbf{TOOM\_MUL\_CUTOFF}) where Toom-Cook becomes more efficient much higher than the Karatsuba cutoff point. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2925,7 +2925,7 @@ - \caption{Algorithm mp\_toom\_mul} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2997,7 +2997,7 @@ - Now that algorithms to handle multiplications of every useful dimensions have been developed, a rather simple finishing touch is required. So far all - of the multiplication algorithms have been unsigned multiplications which leaves only a signed multiplication algorithm to be established. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3051,7 +3051,7 @@ - For any $n$-digit input, there are ${{\left (n^2 + n \right)}\over 2}$ possible unique single precision multiplications required compared to the $n^2$ - required for multiplication. The following diagram gives an example of the operations required. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{ccccc|c} - &&1&2&3&\\ -@@ -3080,7 +3080,7 @@ - The baseline squaring algorithm is meant to be a catch-all squaring algorithm. It will handle any of the input sizes that the faster routines - will not handle. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3159,7 +3159,7 @@ - mp\_word arrays. One array will hold the squares and the other array will hold the double products. With both arrays the doubling and - carry propagation can be moved to a $O(n)$ work level outside the $O(n^2)$ level. In this case, we have an even simpler solution in mind. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3249,7 +3249,7 @@ - The 100 digit halves will not be squared using Karatsuba, but instead using the faster Comba based squaring algorithm. If Karatsuba multiplication - were used instead, the 100 digit numbers would be squared with a slower Comba based multiplication. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3342,7 +3342,7 @@ - derive their own Toom-Cook squaring algorithm. - - \subsection{High Level Squaring} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3560,7 +3560,7 @@ - is considerably faster than the straightforward $3m^2$ method. - - \subsection{The Barrett Algorithm} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3634,7 +3634,7 @@ - In order to use algorithm mp\_reduce the value of $\mu$ must be calculated in advance. Ideally this value should be computed once and stored for - future use so that the Barrett algorithm can be used without delay. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3680,7 +3680,7 @@ - - From these two simple facts the following simple algorithm can be derived. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3706,7 +3706,7 @@ - final result of the Montgomery algorithm. If $k > lg(n)$ and $0 \le x < n^2$ then the final result is limited to - $0 \le r < \lfloor x/2^k \rfloor + n$. As a result at most a single subtraction is required to get the residue desired. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|l|} -@@ -3736,7 +3736,7 @@ - and $k^2$ single precision additions. At this rate the algorithm is most certainly slower than Barrett reduction and not terribly useful. - Fortunately there exists an alternative representation of the algorithm. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3758,7 +3758,7 @@ - This algorithm is equivalent since $2^tn$ is a multiple of $n$ and the lower $k$ bits of $x$ are zero by step 2. The number of single - precision shifts has now been reduced from $2k^2$ to $k^2 + k$ which is only a small improvement. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|l|r|} -@@ -3791,7 +3791,7 @@ - Instead of computing the reduction on a bit-by-bit basis it is actually much faster to compute it on digit-by-digit basis. Consider the - previous algorithm re-written to compute the Montgomery reduction in this new fashion. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3849,7 +3849,7 @@ - The baseline Montgomery reduction algorithm will produce the residue for any size input. It is designed to be a catch-all algororithm for - Montgomery reductions. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3933,7 +3933,7 @@ - With this change in place the Montgomery reduction algorithm can be performed with a Comba style multiplication loop which substantially increases - the speed of the algorithm. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4014,7 +4014,7 @@ - \subsection{Montgomery Setup} - To calculate the variable $\rho$ a relatively simple algorithm will be required. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4070,7 +4070,7 @@ - The variable $n$ reduces modulo $n - k$ to $k$. By putting $q = \lfloor x/n \rfloor$ and $r = x \mbox{ mod } n$ - into the equation the original congruence is reproduced, thus concluding the proof. The following algorithm is based on this observation. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4183,7 +4183,7 @@ - of $x$ and $q$. The resulting algorithm is very efficient and can lead to substantial improvements over Barrett and Montgomery reduction when modular - exponentiations are performed. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4249,7 +4249,7 @@ - To setup the restricted Diminished Radix algorithm the value $k = \beta - n_0$ is required. This algorithm is not really complicated but provided for - completeness. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4271,7 +4271,7 @@ - Another algorithm which will be useful is the ability to detect a restricted Diminished Radix modulus. An integer is said to be - of restricted Diminished Radix form if all of the digits are equal to $\beta - 1$ except the trailing digit which may be any value. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4304,7 +4304,7 @@ - In general the restricted Diminished Radix reduction algorithm is much faster since it has considerably lower overhead. However, this new - algorithm is much faster than either Montgomery or Barrett reduction when the moduli are of the appropriate form. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4346,7 +4346,7 @@ - \subsubsection{Unrestricted Setup} - To setup this reduction algorithm the value of $k = 2^p - n$ is required. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4385,7 +4385,7 @@ - that there will be value of $k$ that when added to the modulus causes a carry in the first digit which propagates all the way to the most - significant bit. The resulting sum will be a power of two. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4487,7 +4487,7 @@ - While this current method is a considerable speed up there are further improvements to be made. For example, the $a^{2^i}$ term does not need to - be computed in an auxilary variable. Consider the following equivalent algorithm. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4539,7 +4539,7 @@ - to be used when a small power of an input is required (\textit{e.g. $a^5$}). It is faster than simply multiplying $b - 1$ times for all values of - $b$ that are greater than three. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4590,7 +4590,7 @@ - computes the same exponentiation. A group of $k$ bits from the exponent is called a \textit{window}. That is it is a small window on only a - portion of the entire exponent. Consider the following modification to the basic left to right exponentiation algorithm. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4626,7 +4626,7 @@ - approach is to brute force search amongst the values $k = 2, 3, \ldots, 8$ for the lowest result. Table~\ref{fig:OPTK} lists optimal values of $k$ - for various exponent sizes and compares the number of multiplication and squarings required against algorithm~\ref{fig:LTOR}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|c|c|c|c|c|c|} -@@ -4655,7 +4655,7 @@ - - Table~\ref{fig:OPTK2} lists optimal values of $k$ for various exponent sizes and compares the work required against algorithm {\ref{fig:KARY}}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|c|c|c|c|c|c|} -@@ -4677,7 +4677,7 @@ - \label{fig:OPTK2} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4728,7 +4728,7 @@ - value of $(1/a) \mbox{ mod }c$ is computed using the modular inverse (\textit{see \ref{sec;modinv}}). If no inverse exists the algorithm - terminates with an error. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4780,7 +4780,7 @@ - - \subsection{Barrett Modular Exponentiation} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4830,7 +4830,7 @@ - \caption{Algorithm s\_mp\_exptmod} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4944,7 +4944,7 @@ - Calculating $b = 2^a$ can be performed much quicker than with any of the previous algorithms. Recall that a logical shift left $m << k$ is - equivalent to $m \cdot 2^k$. By this logic when $m = 1$ a quick power of two can be achieved. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4986,7 +4986,7 @@ - will be used. Let $x$ represent the divisor and $y$ represent the dividend. Let $q$ represent the integer quotient $\lfloor y / x \rfloor$ and - let $r$ represent the remainder $r = y - x \lfloor y / x \rfloor$. The following simple algorithm will be used to start the discussion. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5082,7 +5082,7 @@ - At most the quotient approaches $2\beta$, however, in practice this will not occur since that would imply the previous quotient digit was too small. - - \subsection{Radix-$\beta$ Division with Remainder} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5125,7 +5125,7 @@ - \caption{Algorithm mp\_div} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5248,7 +5248,7 @@ - Both addition and subtraction are performed by ``cheating'' and using mp\_set followed by the higher level addition or subtraction - algorithms. As a result these algorithms are subtantially simpler with a slight cost in performance. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5281,7 +5281,7 @@ - multiplication algorithm. Essentially this algorithm is a modified version of algorithm s\_mp\_mul\_digs where one of the multiplicands - only has one digit. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5323,7 +5323,7 @@ - Like the single digit multiplication algorithm, single digit division is also a fairly common algorithm used in radix conversion. Since the - divisor is only a single digit a specialized variant of the division algorithm can be used to compute the quotient. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5388,7 +5388,7 @@ - such as the real numbers. As a result the root found can be above the true root by few and must be manually adjusted. Ideally at the end of the - algorithm the $n$'th root $b$ of an integer $a$ is desired such that $b^n \le a$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5441,7 +5441,7 @@ - factoring for example, can make use of random values as starting points to find factors of a composite integer. In this case the algorithm presented - is solely for simulations and not intended for cryptographic use. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5483,7 +5483,7 @@ - such that they are printable. While outputting as base64 may not be too helpful for human operators it does allow communication via non binary - mediums. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{cc|cc|cc|cc} - \hline \textbf{Value} & \textbf{Char} & \textbf{Value} & \textbf{Char} & \textbf{Value} & \textbf{Char} & \textbf{Value} & \textbf{Char} \\ -@@ -5511,7 +5511,7 @@ - \label{fig:ASC} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5551,7 +5551,7 @@ - \subsection{Generating Radix-$n$ Output} - Generating radix-$n$ output is fairly trivial with a division and remainder algorithm. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5620,7 +5620,7 @@ - The most common approach (cite) is to reduce one input modulo another. That is if $a$ and $b$ are divisible by some integer $k$ and if $qa + r = b$ then - $r$ is also divisible by $k$. The reduction pattern follows $\left < a , b \right > \rightarrow \left < b, a \mbox{ mod } b \right >$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5646,7 +5646,7 @@ - greatest common divisors. The faster approach is based on the observation that if $k$ divides both $a$ and $b$ it will also divide $a - b$. - In particular, we would like $a - b$ to decrease in magnitude which implies that $b \ge a$. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5680,7 +5680,7 @@ - However, instead of factoring $b - a$ to find a suitable value of $p$ the powers of $p$ can be removed from $a$ and $b$ that are in common first. - Then inside the loop whenever $b - a$ is divisible by some power of $p$ it can be safely removed. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5725,7 +5725,7 @@ - The algorithms presented so far cannot handle inputs which are zero or negative. The following algorithm can handle all input cases properly - and will produce the greatest common divisor. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5812,7 +5812,7 @@ - Linear Feedback Shift Registers (LFSR) tend to use registers with periods which are co-prime (\textit{e.g. the greatest common divisor is one.}). - Similarly in number theory if a composite $n$ has two prime factors $p$ and $q$ then maximal order of any unit of $\Z/n\Z$ will be $[ p - 1, q - 1] $. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5935,7 +5935,7 @@ - factors of $p$ do not have to be known. Furthermore, if $(a, p) = 1$ then the algorithm will terminate when the recursion requests the - Jacobi symbol computation of $\left ( {1 \over a'} \right )$ which is simply $1$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6040,7 +6040,7 @@ - equation. - - \subsection{General Case} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6150,7 +6150,7 @@ - approximately $80\%$ of all candidate integers. The constant \textbf{PRIME\_SIZE} is equal to the number of primes in the test base. The - array \_\_prime\_tab is an array of the first \textbf{PRIME\_SIZE} prime numbers. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6196,7 +6196,7 @@ - integers known as Carmichael numbers will be a pseudo-prime to all valid bases. Fortunately such numbers are extremely rare as $n$ grows - in size. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6228,7 +6228,7 @@ - value must be equal to $-1$. The squarings are stopped as soon as $-1$ is observed. If the value of $1$ is observed first it means that - some value not congruent to $\pm 1$ when squared equals one which cannot occur if $n$ is prime. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -diff -Naur libtommath-1.0.old/tommath.tex libtommath-1.0/tommath.tex ---- libtommath-1.0.old/tommath.tex 2016-02-23 11:24:13.648735326 +0100 -+++ libtommath-1.0/tommath.tex 2016-02-23 11:43:13.591827034 +0100 -@@ -175,7 +175,7 @@ - typical RSA modulus would be at least greater than $10^{309}$. However, modern programming languages such as ISO C \cite{ISOC} and - Java \cite{JAVA} only provide instrinsic support for integers which are relatively small and single precision. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{center} - \begin{tabular}{|r|c|} - \hline \textbf{Data Type} & \textbf{Range} \\ -@@ -366,7 +366,7 @@ - exercises ranges from one (the easiest) to five (the hardest). The following table sumarizes the - scoring system used. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|c|l|} -@@ -534,7 +534,7 @@ - for new algorithms. This methodology allows new algorithms to be tested in a complete framework with relative ease. - - \begin{center} --\begin{figure}[here] -+\begin{figure}[h] - \includegraphics{pics/design_process.ps} - \caption{Design Flow of the First Few Original LibTomMath Functions.} - \label{pic:design_process} -@@ -579,7 +579,7 @@ - used within LibTomMath. - - \index{mp\_int} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - %\begin{verbatim} -@@ -676,7 +676,7 @@ - \textbf{int} data type with one of the following values (fig \ref{fig:errcodes}). - - \index{MP\_OKAY} \index{MP\_VAL} \index{MP\_MEM} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|l|l|} - \hline \textbf{Value} & \textbf{Meaning} \\ -@@ -725,7 +725,7 @@ - structure are set to valid values. The mp\_init algorithm will perform such an action. - - \index{mp\_init} --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_init}. \\ -@@ -831,7 +831,7 @@ - When an mp\_int is no longer required by the application, the memory that has been allocated for its digits must be - returned to the application's memory pool with the mp\_clear algorithm. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_clear}. \\ -@@ -925,7 +925,7 @@ - is large enough to simply increase the \textbf{used} digit count. However, when the size of the array is too small it - must be re-sized appropriately to accomodate the result. The mp\_grow algorithm will provide this functionality. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_grow}. \\ -@@ -1022,7 +1022,7 @@ - of input mp\_ints to a given algorithm. The purpose of algorithm mp\_init\_size is similar to mp\_init except that it - will allocate \textit{at least} a specified number of digits. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -1108,7 +1108,7 @@ - The purpose of algorithm mp\_init\_multi is to initialize a variable length array of mp\_int structures in a single - statement. It is essentially a shortcut to multiple initializations. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_init\_multi}. \\ -@@ -1212,7 +1212,7 @@ - positive number which means that if the \textbf{used} count is decremented to zero, the sign must be set to - \textbf{MP\_ZPOS}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_clamp}. \\ -@@ -1310,7 +1310,7 @@ - a copy for the purposes of this text. The copy of the mp\_int will be a separate entity that represents the same - value as the mp\_int it was copied from. The mp\_copy algorithm provides this functionality. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_copy}. \\ -@@ -1479,7 +1479,7 @@ - useful within functions that need to modify an argument but do not wish to actually modify the original copy. The - mp\_init\_copy algorithm has been designed to help perform this task. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_init\_copy}. \\ -@@ -1527,7 +1527,7 @@ - Reseting an mp\_int to the default state is a common step in many algorithms. The mp\_zero algorithm will be the algorithm used to - perform this task. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_zero}. \\ -@@ -1579,7 +1579,7 @@ - With the mp\_int representation of an integer, calculating the absolute value is trivial. The mp\_abs algorithm will compute - the absolute value of an mp\_int. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_abs}. \\ -@@ -1640,7 +1640,7 @@ - With the mp\_int representation of an integer, calculating the negation is also trivial. The mp\_neg algorithm will compute - the negative of an mp\_int input. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_neg}. \\ -@@ -1703,7 +1703,7 @@ - \subsection{Setting Small Constants} - Often a mp\_int must be set to a relatively small value such as $1$ or $2$. For these cases the mp\_set algorithm is useful. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_set}. \\ -@@ -1759,7 +1759,7 @@ - To overcome the limitations of the mp\_set algorithm the mp\_set\_int algorithm is ideal. It accepts a ``long'' - data type as input and will always treat it as a 32-bit integer. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_set\_int}. \\ -@@ -1843,7 +1843,7 @@ - - To facilitate working with the results of the comparison functions three constants are required. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|r|l|} - \hline \textbf{Constant} & \textbf{Meaning} \\ -@@ -1856,7 +1856,7 @@ - \caption{Comparison Return Codes} - \end{figure} - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_cmp\_mag}. \\ -@@ -1938,7 +1938,7 @@ - Comparing with sign considerations is also fairly critical in several routines (\textit{division for example}). Based on an unsigned magnitude - comparison a trivial signed comparison algorithm can be written. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_cmp}. \\ -@@ -2047,7 +2047,7 @@ - Historically that convention stems from the MPI library where ``s\_'' stood for static functions that were hidden from the developer entirely. - - \newpage --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{center} - \begin{small} - \begin{tabular}{l} -@@ -2244,7 +2244,7 @@ - For example, the default for LibTomMath is to use a ``unsigned long'' for the mp\_digit ``type'' while $\beta = 2^{28}$. In ISO C an ``unsigned long'' - data type must be able to represent $0 \le x < 2^{32}$ meaning that in this case $\gamma \ge 32$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{center} - \begin{small} - \begin{tabular}{l} -@@ -2411,7 +2411,7 @@ - Recall from section 5.2 that an mp\_int represents an integer with an unsigned mantissa (\textit{the array of digits}) and a \textbf{sign} - flag. A high level addition is actually performed as a series of eight separate cases which can be optimized down to three unique cases. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_add}. \\ -@@ -2440,7 +2440,7 @@ - either \cite{TAOCPV2} or \cite{HAC} since they both only provide unsigned operations. The algorithm is fairly - straightforward but restricted since subtraction can only produce positive results. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|c|c|c|c|} -@@ -2529,7 +2529,7 @@ - \subsection{High Level Subtraction} - The high level signed subtraction algorithm is essentially the same as the high level signed addition algorithm. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{center} - \begin{tabular}{l} - \hline Algorithm \textbf{mp\_sub}. \\ -@@ -2561,7 +2561,7 @@ - \cite{HAC}. Also this algorithm is restricted by algorithm s\_mp\_sub. Chart \ref{fig:SubChart} lists the eight possible inputs and - the operations required. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{|c|c|c|c|c|} -@@ -2651,7 +2651,7 @@ - In a binary system where the radix is a power of two multiplication by two not only arises often in other algorithms it is a fairly efficient - operation to perform. A single precision logical shift left is sufficient to multiply a single digit by two. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2775,7 +2775,7 @@ - \subsection{Division by Two} - A division by two can just as easily be accomplished with a logical shift right as multiplication by two can be with a logical shift left. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2886,7 +2886,7 @@ - degree. In this case $f(x) \cdot x = a_n x^{n+1} + a_{n-1} x^n + ... + a_0 x$. From a scalar basis point of view multiplying by $x$ is equivalent to - multiplying by the integer $\beta$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -2929,7 +2929,7 @@ - - \newpage - \begin{center} --\begin{figure}[here] -+\begin{figure}[h] - \includegraphics{pics/sliding_window.ps} - \caption{Sliding Window Movement} - \label{pic:sliding_window} -@@ -3001,7 +3001,7 @@ - - Division by powers of $x$ is easily achieved by shifting the digits right and removing any that will end up to the right of the zero'th digit. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3115,7 +3115,7 @@ - - \subsection{Multiplication by Power of Two} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3249,7 +3249,7 @@ - - \subsection{Division by Power of Two} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3387,7 +3387,7 @@ - The last algorithm in the series of polynomial basis power of two algorithms is calculating the remainder of division by $2^b$. This - algorithm benefits from the fact that in twos complement arithmetic $a \mbox{ (mod }2^b\mbox{)}$ is the same as $a$ AND $2^b - 1$. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3541,7 +3541,7 @@ - include $\alpha$ which shall represent the number of bits in the type \textbf{mp\_word}. This implies that $2^{\alpha} > 2 \cdot \beta^2$. The - constant $\delta = 2^{\alpha - 2lg(\beta)}$ will represent the maximal weight of any column in a product (\textit{see sub-section 5.2.2 for more information}). - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3600,7 +3600,7 @@ - For example, consider multiplying $576$ by $241$. That is equivalent to computing $10^0(1)(576) + 10^1(4)(576) + 10^2(2)(576)$ which is best - visualized in the following table. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{|c|c|c|c|c|c|l|} - \hline && & 5 & 7 & 6 & \\ -@@ -3753,7 +3753,7 @@ - Where $\vec x_n$ is the $n'th$ column of the output vector. Consider the following example which computes the vector $\vec x$ for the multiplication - of $576$ and $241$. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|c|c|c|c|c|} -@@ -3774,7 +3774,7 @@ - Now the columns must be fixed by propagating the carry upwards. The resultant vector will have one extra dimension over the input vector which is - congruent to adding a leading zero digit. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -3830,7 +3830,7 @@ - the smaller input may not have more than $256$ digits if the Comba method is to be used. This is quite satisfactory for most applications since - $256$ digits would allow for numbers in the range of $0 \le x < 2^{7168}$ which, is much larger than most public key cryptographic algorithms require. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4129,7 +4129,7 @@ - of this system of equations has made Karatsuba fairly popular. In fact the cutoff point is often fairly low\footnote{With LibTomMath 0.18 it is 70 and 109 digits for the Intel P4 and AMD Athlon respectively.} - making it an ideal algorithm to speed up certain public key cryptosystems such as RSA and Diffie-Hellman. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4387,7 +4387,7 @@ - the algorithm can be faster than a baseline multiplication. However, the greater complexity of this algorithm places the cutoff point - (\textbf{TOOM\_MUL\_CUTOFF}) where Toom-Cook becomes more efficient much higher than the Karatsuba cutoff point. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4424,7 +4424,7 @@ - \caption{Algorithm mp\_toom\_mul} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4768,7 +4768,7 @@ - Now that algorithms to handle multiplications of every useful dimensions have been developed, a rather simple finishing touch is required. So far all - of the multiplication algorithms have been unsigned multiplications which leaves only a signed multiplication algorithm to be established. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -4875,7 +4875,7 @@ - For any $n$-digit input, there are ${{\left (n^2 + n \right)}\over 2}$ possible unique single precision multiplications required compared to the $n^2$ - required for multiplication. The following diagram gives an example of the operations required. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{tabular}{ccccc|c} - &&1&2&3&\\ -@@ -4903,7 +4903,7 @@ - The baseline squaring algorithm is meant to be a catch-all squaring algorithm. It will handle any of the input sizes that the faster routines - will not handle. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5052,7 +5052,7 @@ - mp\_word arrays. One array will hold the squares and the other array will hold the double products. With both arrays the doubling and - carry propagation can be moved to a $O(n)$ work level outside the $O(n^2)$ level. In this case, we have an even simpler solution in mind. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5242,7 +5242,7 @@ - The 100 digit halves will not be squared using Karatsuba, but instead using the faster Comba based squaring algorithm. If Karatsuba multiplication - were used instead, the 100 digit numbers would be squared with a slower Comba based multiplication. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5442,7 +5442,7 @@ - derive their own Toom-Cook squaring algorithm. - - \subsection{High Level Squaring} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5705,7 +5705,7 @@ - is considerably faster than the straightforward $3m^2$ method. - - \subsection{The Barrett Algorithm} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5865,7 +5865,7 @@ - In order to use algorithm mp\_reduce the value of $\mu$ must be calculated in advance. Ideally this value should be computed once and stored for - future use so that the Barrett algorithm can be used without delay. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5931,7 +5931,7 @@ - - From these two simple facts the following simple algorithm can be derived. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -5957,7 +5957,7 @@ - final result of the Montgomery algorithm. If $k > lg(n)$ and $0 \le x < n^2$ then the final result is limited to - $0 \le r < \lfloor x/2^k \rfloor + n$. As a result at most a single subtraction is required to get the residue desired. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|l|} -@@ -5987,7 +5987,7 @@ - and $k^2$ single precision additions. At this rate the algorithm is most certainly slower than Barrett reduction and not terribly useful. - Fortunately there exists an alternative representation of the algorithm. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6009,7 +6009,7 @@ - This algorithm is equivalent since $2^tn$ is a multiple of $n$ and the lower $k$ bits of $x$ are zero by step 2. The number of single - precision shifts has now been reduced from $2k^2$ to $k^2 + k$ which is only a small improvement. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{small} - \begin{center} - \begin{tabular}{|c|l|r|} -@@ -6042,7 +6042,7 @@ - Instead of computing the reduction on a bit-by-bit basis it is actually much faster to compute it on digit-by-digit basis. Consider the - previous algorithm re-written to compute the Montgomery reduction in this new fashion. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6100,7 +6100,7 @@ - The baseline Montgomery reduction algorithm will produce the residue for any size input. It is designed to be a catch-all algororithm for - Montgomery reductions. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6287,7 +6287,7 @@ - With this change in place the Montgomery reduction algorithm can be performed with a Comba style multiplication loop which substantially increases - the speed of the algorithm. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6526,7 +6526,7 @@ - \subsection{Montgomery Setup} - To calculate the variable $\rho$ a relatively simple algorithm will be required. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6627,7 +6627,7 @@ - The variable $n$ reduces modulo $n - k$ to $k$. By putting $q = \lfloor x/n \rfloor$ and $r = x \mbox{ mod } n$ - into the equation the original congruence is reproduced, thus concluding the proof. The following algorithm is based on this observation. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6740,7 +6740,7 @@ - of $x$ and $q$. The resulting algorithm is very efficient and can lead to substantial improvements over Barrett and Montgomery reduction when modular - exponentiations are performed. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6888,7 +6888,7 @@ - To setup the restricted Diminished Radix algorithm the value $k = \beta - n_0$ is required. This algorithm is not really complicated but provided for - completeness. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6928,7 +6928,7 @@ - Another algorithm which will be useful is the ability to detect a restricted Diminished Radix modulus. An integer is said to be - of restricted Diminished Radix form if all of the digits are equal to $\beta - 1$ except the trailing digit which may be any value. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -6990,7 +6990,7 @@ - In general the restricted Diminished Radix reduction algorithm is much faster since it has considerably lower overhead. However, this new - algorithm is much faster than either Montgomery or Barrett reduction when the moduli are of the appropriate form. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7081,7 +7081,7 @@ - \subsubsection{Unrestricted Setup} - To setup this reduction algorithm the value of $k = 2^p - n$ is required. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7153,7 +7153,7 @@ - that there will be value of $k$ that when added to the modulus causes a carry in the first digit which propagates all the way to the most - significant bit. The resulting sum will be a power of two. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7293,7 +7293,7 @@ - While this current method is a considerable speed up there are further improvements to be made. For example, the $a^{2^i}$ term does not need to - be computed in an auxilary variable. Consider the following equivalent algorithm. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7345,7 +7345,7 @@ - to be used when a small power of an input is required (\textit{e.g. $a^5$}). It is faster than simply multiplying $b - 1$ times for all values of - $b$ that are greater than three. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7465,7 +7465,7 @@ - computes the same exponentiation. A group of $k$ bits from the exponent is called a \textit{window}. That is it is a small window on only a - portion of the entire exponent. Consider the following modification to the basic left to right exponentiation algorithm. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7501,7 +7501,7 @@ - approach is to brute force search amongst the values $k = 2, 3, \ldots, 8$ for the lowest result. Table~\ref{fig:OPTK} lists optimal values of $k$ - for various exponent sizes and compares the number of multiplication and squarings required against algorithm~\ref{fig:LTOR}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|c|c|c|c|c|c|} -@@ -7530,7 +7530,7 @@ - - Table~\ref{fig:OPTK2} lists optimal values of $k$ for various exponent sizes and compares the work required against algorithm {\ref{fig:KARY}}. - --\begin{figure}[here] -+\begin{figure}[h] - \begin{center} - \begin{small} - \begin{tabular}{|c|c|c|c|c|c|} -@@ -7552,7 +7552,7 @@ - \label{fig:OPTK2} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7603,7 +7603,7 @@ - value of $(1/a) \mbox{ mod }c$ is computed using the modular inverse (\textit{see \ref{sec;modinv}}). If no inverse exists the algorithm - terminates with an error. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7754,7 +7754,7 @@ - - \subsection{Barrett Modular Exponentiation} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7804,7 +7804,7 @@ - \caption{Algorithm s\_mp\_exptmod} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -7896,7 +7896,7 @@ - the two cases of $mode = 1$ and $mode = 2$ respectively. - - \begin{center} --\begin{figure}[here] -+\begin{figure}[h] - \includegraphics{pics/expt_state.ps} - \caption{Sliding Window State Diagram} - \label{pic:expt_state} -@@ -8163,7 +8163,7 @@ - Calculating $b = 2^a$ can be performed much quicker than with any of the previous algorithms. Recall that a logical shift left $m << k$ is - equivalent to $m \cdot 2^k$. By this logic when $m = 1$ a quick power of two can be achieved. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -8239,7 +8239,7 @@ - will be used. Let $x$ represent the divisor and $y$ represent the dividend. Let $q$ represent the integer quotient $\lfloor y / x \rfloor$ and - let $r$ represent the remainder $r = y - x \lfloor y / x \rfloor$. The following simple algorithm will be used to start the discussion. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -8335,7 +8335,7 @@ - At most the quotient approaches $2\beta$, however, in practice this will not occur since that would imply the previous quotient digit was too small. - - \subsection{Radix-$\beta$ Division with Remainder} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -8378,7 +8378,7 @@ - \caption{Algorithm mp\_div} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -8782,7 +8782,7 @@ - Both addition and subtraction are performed by ``cheating'' and using mp\_set followed by the higher level addition or subtraction - algorithms. As a result these algorithms are subtantially simpler with a slight cost in performance. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -8913,7 +8913,7 @@ - multiplication algorithm. Essentially this algorithm is a modified version of algorithm s\_mp\_mul\_digs where one of the multiplicands - only has one digit. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9020,7 +9020,7 @@ - Like the single digit multiplication algorithm, single digit division is also a fairly common algorithm used in radix conversion. Since the - divisor is only a single digit a specialized variant of the division algorithm can be used to compute the quotient. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9186,7 +9186,7 @@ - such as the real numbers. As a result the root found can be above the true root by few and must be manually adjusted. Ideally at the end of the - algorithm the $n$'th root $b$ of an integer $a$ is desired such that $b^n \le a$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9255,7 +9255,7 @@ - factoring for example, can make use of random values as starting points to find factors of a composite integer. In this case the algorithm presented - is solely for simulations and not intended for cryptographic use. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9339,7 +9339,7 @@ - such that they are printable. While outputting as base64 may not be too helpful for human operators it does allow communication via non binary - mediums. - --\newpage\begin{figure}[here] -+\newpage\begin{figure}[h] - \begin{center} - \begin{tabular}{cc|cc|cc|cc} - \hline \textbf{Value} & \textbf{Char} & \textbf{Value} & \textbf{Char} & \textbf{Value} & \textbf{Char} & \textbf{Value} & \textbf{Char} \\ -@@ -9367,7 +9367,7 @@ - \label{fig:ASC} - \end{figure} - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9478,7 +9478,7 @@ - \subsection{Generating Radix-$n$ Output} - Generating radix-$n$ output is fairly trivial with a division and remainder algorithm. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9608,7 +9608,7 @@ - The most common approach (cite) is to reduce one input modulo another. That is if $a$ and $b$ are divisible by some integer $k$ and if $qa + r = b$ then - $r$ is also divisible by $k$. The reduction pattern follows $\left < a , b \right > \rightarrow \left < b, a \mbox{ mod } b \right >$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9634,7 +9634,7 @@ - greatest common divisors. The faster approach is based on the observation that if $k$ divides both $a$ and $b$ it will also divide $a - b$. - In particular, we would like $a - b$ to decrease in magnitude which implies that $b \ge a$. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9668,7 +9668,7 @@ - However, instead of factoring $b - a$ to find a suitable value of $p$ the powers of $p$ can be removed from $a$ and $b$ that are in common first. - Then inside the loop whenever $b - a$ is divisible by some power of $p$ it can be safely removed. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9713,7 +9713,7 @@ - The algorithms presented so far cannot handle inputs which are zero or negative. The following algorithm can handle all input cases properly - and will produce the greatest common divisor. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -9891,7 +9891,7 @@ - Linear Feedback Shift Registers (LFSR) tend to use registers with periods which are co-prime (\textit{e.g. the greatest common divisor is one.}). - Similarly in number theory if a composite $n$ has two prime factors $p$ and $q$ then maximal order of any unit of $\Z/n\Z$ will be $[ p - 1, q - 1] $. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -10060,7 +10060,7 @@ - factors of $p$ do not have to be known. Furthermore, if $(a, p) = 1$ then the algorithm will terminate when the recursion requests the - Jacobi symbol computation of $\left ( {1 \over a'} \right )$ which is simply $1$. - --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -10268,7 +10268,7 @@ - equation. - - \subsection{General Case} --\newpage\begin{figure}[!here] -+\newpage\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -10407,7 +10407,7 @@ - approximately $80\%$ of all candidate integers. The constant \textbf{PRIME\_SIZE} is equal to the number of primes in the test base. The - array \_\_prime\_tab is an array of the first \textbf{PRIME\_SIZE} prime numbers. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -10536,7 +10536,7 @@ - integers known as Carmichael numbers will be a pseudo-prime to all valid bases. Fortunately such numbers are extremely rare as $n$ grows - in size. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l} -@@ -10616,7 +10616,7 @@ - value must be equal to $-1$. The squarings are stopped as soon as $-1$ is observed. If the value of $1$ is observed first it means that - some value not congruent to $\pm 1$ when squared equals one which cannot occur if $n$ is prime. - --\begin{figure}[!here] -+\begin{figure}[!h] - \begin{small} - \begin{center} - \begin{tabular}{l}