$Id: rsa.html,v 1.3 2004/09/23 19:52:11 dean Exp $ The The following table shows the improvements for signing 1024-bit RSA keys. The values are cycles per signature (lower is better). INT refers to the existing integer-only implementations in openssl-0.9.7c.
Note that on Opteron it makes much more sense to use a native 64-bit implementation when possible because the integer unit supports 128-bit multiplies and fast carry on 64-bit ints. However for the purposes of evaluating this patch for use in 32-bit libraries compiled for all SSE2 platforms it makes sense to compare the 32-bit number on Opteron. It can be seen that on all SSE2 platforms this new code yields improved results over the 32-bit integer code. Downloads
MiscellaneousNote that in the implementation, MMX registers are used rather than
XMM registers. This is because there is never any need for more than
64-bits per register, and better parallelism may be achieved on Pentium-M,
Opteron, and Efficēon which implement SSE2 using a pair of 64-bit
MMX units (whereas P4 uses a single 128-bit unit). Unfortunately,
There still remains other opportunities for improvement -- this patch
so far only modifies the innermost loop of It's possible a carry-saves representation of intermediates within the
bn library would yield much more improved throughputs. In a carry-saves
representation, carry propagation is postponed until an operation needs it
explicitly. An intermediate form uses twice the storage, the extra space
"saves" the carry that must still be propagated. An implementation of
the inner dean gaudet -- you can probably guess my email address if you need it |