diff --git a/manual/gf-complete.html b/manual/gf-complete.html
index 86de277..ed79e25 100644
--- a/manual/gf-complete.html
+++ b/manual/gf-complete.html
@@ -160,7 +160,7 @@ CONTENT 3
-4.1 Three Simple Command Line Tools: gf mult, gf div and gf add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8
+4.1 Three Simple Command Line Tools: gf_mult, gf_div and gf_add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8
4.2 Quick Starting Example #1: Simple multiplication and division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
4.3 Quick Starting Example #2: Multiplying a region by a constant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
10
4.4 Quick Starting Example #3: Using w = 64 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
11
@@ -231,7 +231,7 @@ CONTENT
3
7.4 Arguments to
"SPLIT" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
28
7.5 Arguments to
"GROUP" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
29
7.6 Considerations with
"COMPOSITE" . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
30
-7.7
"CARRY FREE" and the Primitive Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
31
+7.7
"CARRY_FREE" and the Primitive Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
31
7.8 More on Primitive Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . .
31
@@ -426,7 +426,7 @@ defines some randomnumber generators to help test the programs. The randomnumber
of All" random number generator [Mar94] which we've selected because it has no patent issues. gf_unit and
-gf time use these random number generators.
+gf_time use these random number generators.
- gf_int.h: This is an internal header file that the various source files use. This is not intended for applications to
include.
- config.xx and stamp-h1 are created by autoconf, and should be ignored by applications.
@@ -457,7 +457,7 @@ The following are tools to help you with Galois Field arithmetic, and with the l
detail elsewhere in this manual.
- gf_mult.c, gf_ div.c and gf_ add: Command line tools to do multiplication, division and addition by single numbers
- gf_time.c: A program that times the procedures for given values of w and implementation options
-- time tool.sh: A shell script that helps perform rough timings of the various multiplication, division and region
+- time_tool.sh: A shell script that helps perform rough timings of the various multiplication, division and region
operations in GF-Complete
- gf_methods.c: A program that enumerates most of the implementation methods supported by GF-Complete
- gf_poly.c: A program to identify irreducible polynomials in regular and composite Galois Fields
@@ -652,7 +652,7 @@ gf.multiply_region.w32 (&gf, r1, r2, a, 16, 0);
That last argument specifies whether to simply place the product into r2 or to XOR it with the contents that are already
in r2. Zero means to place the product there. When we run it, it prints the results of the multiply_region.w32 in
-hexadecimal. Again, you can verify it using gf mult:
+hexadecimal. Again, you can verify it using gf_mult:
UNIX> gf_example_2 4
12 * 2 = 11
@@ -917,7 +917,7 @@ memory consumption and their rough performance. The performance tests are on an
3.40 GHz, and are included solely to give a flavor of performance on a standard microprocessor. Some processors
will be faster with some techniques and others will be slower, so we only put numbers in so that you can ballpark it.
For other values of
w between 1 and 31, we use table lookup when w ≤ 8, discrete logarithms when w ≤ 16 and
-"Bytwop" for w ≤ 32.
+"Bytwop" for w ≤ 32.
With SSE
@@ -972,15 +972,15 @@ For other values of w between 1 and 31, we use table lookup when w X
32 | 16 | 2,135 |
-32 | 4K | Bytwop | 19 | Split Table (32,4) |
+32 | 4K | Bytwop | 19 | Split Table (32,4) |
4 | 4 | 1,149 |
-64 | 16K | Bytwop | 9 | Split Table (64,4) |
+64 | 16K | Bytwop | 9 | Split Table (64,4) |
8 | 8 | 987 |
-128 | 64K | Bytwop | 1.4 | Split Table (128,4) |
+128 | 64K | Bytwop | 1.4 | Split Table (128,4) |
16 | 8 | 833 |
@@ -1194,30 +1194,30 @@ larger w than "TABLE." If the polynomial is not primitive (see s
an implementation. In that case, gf_init_hard() or create_gf_from_argv() will fail
-- "LOG ZERO:" Discrete logarithm tables which include extra room for zero entries. This more than doubles
+- "LOG_ZERO:" Discrete logarithm tables which include extra room for zero entries. This more than doubles
the memory consumption to remove an if statement (please see [GMS08] or The Paper for more description). It
doesn’t really make a huge deal of difference in performance
-- "LOG ZERO EXT:" This expends even more memory to remove another if statement. Again, please see The
-Paper for an explanation. As with "LOG ZERO," the performance difference is negligible
+- "LOG_ZERO_EXT:" This expends even more memory to remove another if statement. Again, please see The
+Paper for an explanation. As with "LOG_ZERO," the performance difference is negligible
- "SHIFT:" Implementation straight from the definition of Galois Field multiplication, by shifting and XOR-ing,
then reducing the product using the polynomial. This is slooooooooow, so we don’t recommend you use it
-- "CARRY FREE:" This is identical to "SHIFT," however it leverages the SSE instruction PCLMUL to perform
+- "CARRY_FREE:" This is identical to "SHIFT," however it leverages the SSE instruction PCLMUL to perform
carry-freemultiplications in single instructions. As such, it is the fastest way to perform multiplication for large
values of w when that instruction is available. Its performance depends on the polynomial used. See The Paper
for details, and see section 7.7 below for the speedups available when w = 16 and w = 32 if you use a different
polynomial than the default one
-- "BYTWO p:" This implements multiplication by successively multiplying the product by two and selectively
+- "BYTWO_p:" This implements multiplication by successively multiplying the product by two and selectively
XOR-ing the multiplicand. See The Paper for more detail. It can leverage Anvin’s optimization that multiplies
64 and 128 bits of numbers in GF(2w) by two with just a few instructions. The SSE version requires SSE2
-- "BYTWO b:" This implements multiplication by successively multiplying the multiplicand by two and selectively
+- "BYTWO_b:" This implements multiplication by successively multiplying the multiplicand by two and selectively
XOR-ing it into the product. It can also leverage Anvin's optimization, and it has the feature that when
you're multiplying a region by a very small constant (like 2), it can terminate the multiplication early. As such,
if you are multiplying regions of bytes by two (as in the Linux RAID-6 Reed-Solomon code [Anv09]), this is
@@ -1269,7 +1269,7 @@ In order to specify the base field, put appropriate flags after specifying k
and after that, you may continue making specifications for the composite field. This process can be continued
for multiple layers of "COMPOSITE." As an example, the following multiplies 1000000 and 2000000
in GF((216)2), where the base field uses BYTWO_p for multiplication:
-./gf mult 1000000 2000000 32 -m COMPOSITE 2 -m BYTWO p - -
+./gf_mult 1000000 2000000 32 -m COMPOSITE 2 -m BYTWO_p - -
In the above example, the red text applies to the base field, and the black text applies to the composite field.
Composite fields have two defining polynomials - one for the composite field, and one for the base field. Thus, if
@@ -1278,7 +1278,7 @@ form x2+sx+1, where s is an element of GF(2k). To
example below, we multiply 20000 and 30000 in GF((28)2) , setting s to three, and using x8+x4+x3+x2+1
as the polynomial for the base field:
-./gf mult 20000 30000 16 -m COMPOSITE 2 -p 0x11d - -p 0x3 -
+./gf_mult 20000 30000 16 -m COMPOSITE 2 -p 0x11d - -p 0x3 -
If you use composite fields, you should consider using "ALTMAP" as well. The reason is that the region
operations will go much faster. Please see section 7.6.
@@ -1340,13 +1340,13 @@ multiplication techniques which can leverage SSE instructions and which versions
"SPLIT" | - | Yes | SSSE3 | Only when the second argument equals 4. |
-"SPLIt" | - | Yes | SSE4 | When w = 64 and not using "ALTMAP". |
+"SPLIT" | - | Yes | SSE4 | When w = 64 and not using "ALTMAP". |
-"BYTWO p" | - | Yes | SSE2 | |
+"BYTWO_p" | - | Yes | SSE2 | |
-"BYTWO p" | - | Yes | SSE2 | |
+"BYTWO_b" | - | Yes | SSE2 | |
Table 2: Multiplication techniques which can leverage SSE instructions when they are available.
@@ -1425,12 +1425,12 @@ listed. If multiple region options are required, they should be specified indepe
and independent options for command-line tools and create_gf_from_argv()).
-6.2    Determining Supported Techniques with gf methods
+6.2    Determining Supported Techniques with gf_methods
The program gf_methods prints a list of supported methods on standard output. It is called as follows:
-./gf methods w -BADC -LUMDRB
+./gf_methods w -BADC -LUMDRB
The first argument is w , which may be any legal value of w . The second argument has the following flags:
@@ -1583,7 +1583,7 @@ The performance of "Region-By-Zero" and "Region-By-One" will not change from tes
the same calls for these. "Region-By-Zero" with "XOR: 1" does nothing except set up the tests. Therefore, you may
use it as a control.
-6.3.1       time tool.sh
+6.3.1       time_tool.sh
Finally, the shell script time_tool.sh makes a bunch of calls to gf_time to give a rough estimate of performance. It is
called as follows:
@@ -1637,7 +1637,7 @@ error may be minimized.
6     THE DEFAULTS 23
-6.3.2       An example of gf methods and time tool.sh
+6.3.2       An example of gf_methods and time_tool.sh
Let's give an example of how some of these components fit together. Suppose we want to explore the basic techniques
in GF(232). First, let's take a look at what gf_methods suggests as "basic" methods:
@@ -1656,7 +1656,7 @@ UNIX>
-You'll note, this is on my old Macbook Pro, which doesn't support (PCLMUL), so "CARRY FREE" is not included
+You'll note, this is on my old Macbook Pro, which doesn't support (PCLMUL), so "CARRY_FREE" is not included
as an option. Now, let's run the unit tester on these to make sure they work, and to see their memory consumption:
@@ -1739,7 +1739,7 @@ which is why we don't use "
-m SPLIT 32 4 -r ALTMAP -."
Test question: Given the numbers above, it would appear that "COMPOSITE" yields the fastest performance of
single multiplication, while "SPLIT 32 4" yields the fastest performance of region multiplication. Should I use two
-gf t's in my application – one for single multiplication that uses "COMPOSITE," and one for region multiplication
+gf_t's in my application – one for single multiplication that uses "COMPOSITE," and one for region multiplication
that uses "SPLIT 32 4?"
The answer to this is "no." Why? Because composite fields are different from the "standard" fields, and if you mix
@@ -1780,7 +1780,7 @@ void *scratch_memory);
The arguments mult type, region type and divide type allow for the same specifications as above, except the
-types are integer constants defined in gf complete.h:
+types are integer constants defined in gf_complete.h:
typedef enum {GF_MULT_DEFAULT,
GF_MULT_SHIFT
@@ -2044,26 +2044,26 @@ The performance difference using
"ALTMAP" can be significant:
- gf time 16 G 0 1048576 100 -m SPLIT 16 4 - | Speed = 8,389 MB/s |
+ gf_time 16 G 0 1048576 100 -m SPLIT 16 4 - | Speed = 8,389 MB/s |
-gf time 16 G 0 1048576 100 -m SPLIT 16 4 -r ALTMAP - | Speed = 8,389 MB/s |
+gf_time 16 G 0 1048576 100 -m SPLIT 16 4 -r ALTMAP - | Speed = 8,389 MB/s |
-gf time 32 G 0 1048576 100 -m SPLIT 32 4 - | Speed = 5,304 MB/s |
+gf_time 32 G 0 1048576 100 -m SPLIT 32 4 - | Speed = 5,304 MB/s |
-gf time 32 G 0 1048576 100 -m SPLIT 32 4 -r ALTMAP - | Speed = 7,146 MB/s |
+gf_time 32 G 0 1048576 100 -m SPLIT 32 4 -r ALTMAP - | Speed = 7,146 MB/s |
-gf time 64 G 0 1048576 100 -m SPLIT 64 4 - | Speed = 2,595 MB/s |
+gf_time 64 G 0 1048576 100 -m SPLIT 64 4 - | Speed = 2,595 MB/s |
-gf time 64 G 0 1048576 100 -m SPLIT 64 4 -r ALTMAP - | Speed = 3,436 MB/s |
+gf_time 64 G 0 1048576 100 -m SPLIT 64 4 -r ALTMAP - | Speed = 3,436 MB/s |
@@ -2179,15 +2179,15 @@ region(), rather than simply calling multiply() on every word in the
-gf time 32 G 0 10240 10240 -m COMPOSITE 2 - -
+gf_time 32 G 0 10240 10240 -m COMPOSITE 2 - -
Speed = 322 MB/s |
-gf time 32 G 0 10240 10240 -m COMPOSITE 2 - -r ALTMAP -
+gf_time 32 G 0 10240 10240 -m COMPOSITE 2 - -r ALTMAP -
Speed = 3,368 MB/s |
-gf time 32 G 0 10240 10240 -m COMPOSITE 2 -m SPLIT 16 4 -r ALTMAP - -r ALTMAP -
+gf_time 32 G 0 10240 10240 -m COMPOSITE 2 -m SPLIT 16 4 -r ALTMAP - -r ALTMAP -
Speed = 3,925 MB/s |
@@ -2207,10 +2207,10 @@ as fast. The difference is the inlining of multiplication in the base field when
- gf time 8 M 0 1048576 100 - Speed = 501 Mega-ops/s |
- gf time 8 M 0 1048576 100 -m SPLIT 8 4 - Speed = 439 Mega-ops/s |
- gf time 8 M 0 1048576 100 -m COMPOSITE 2 - - Speed = 207 Mega-ops/s |
- gf time 8 M 0 1048576 100 -m COMPOSITE 2 -m SPLIT 8 4 - - Speed = 77 Mega-ops/s |
+ gf_time 8 M 0 1048576 100 - Speed = 501 Mega-ops/s |
+ gf_time 8 M 0 1048576 100 -m SPLIT 8 4 - Speed = 439 Mega-ops/s |
+ gf_time 8 M 0 1048576 100 -m COMPOSITE 2 - - Speed = 207 Mega-ops/s |
+ gf_time 8 M 0 1048576 100 -m COMPOSITE 2 -m SPLIT 8 4 - - Speed = 77 Mega-ops/s |
@@ -2235,17 +2235,17 @@ region operations (641 MB/s):
-gf time 128 G 0 1048576 100 -m COMPOSITE 2 -m COMPOSITE 2 -m COMPOSITE 2
+gf_time 128 G 0 1048576 100 -m COMPOSITE 2 -m COMPOSITE 2 -m COMPOSITE 2
-m SPLIT 16 4 -r ALTMAP - -r ALTMAP - -r ALTMAP - -r ALTMAP -
Please see section 7.8.1 for a discussion of polynomials in composite fields.
-7.7       "CARRY FREE" and the Primitive Polynomial
+7.7       "CARRY_FREE" and the Primitive Polynomial
-If your machine supports the PCLMUL instruction, then we leverage that in "CARRY FREE." This implementation
+If your machine supports the PCLMUL instruction, then we leverage that in "CARRY_FREE." This implementation
first performs a carry free multiplication of two w-bit numbers, which yields a 2w-bit number. It does this with
one PCLMUL instruction. To reduce the 2w-bit number back to a w-bit number requires some manipulation of the
polynomial. As it turns out, if the polynomial has a lot of contiguous zeroes following its leftmost one, the number of
@@ -2260,9 +2260,9 @@ You can see the difference in performance:
-gf time 32 M 0 1048576 100 -m CARRY FREE - | Speed = 48 Mega-ops/s |
+gf_time 32 M 0 1048576 100 -m CARRY_FREE - | Speed = 48 Mega-ops/s |
-gf time 32 M 0 1048576 100 -m CARRY FREE -p 0xc5 - | Speed = 81 Mega-ops/s |
+gf_time 32 M 0 1048576 100 -m CARRY_FREE -p 0xc5 - | Speed = 81 Mega-ops/s |
@@ -2270,8 +2270,8 @@ You can see the difference in performance:
This is relevant for w = 16 and w = 32, where the "standard" polynomials are sub-optimal with respect to
-"CARRY FREE." For w = 16, the polynomial 0x1002d has the desired property. It’s less important, of course,
-with w = 16, because "LOG" is so much faster than CARRY FREE.
+"CARRY_FREE." For w = 16, the polynomial 0x1002d has the desired property. It’s less important, of course,
+with w = 16, because "LOG" is so much faster than CARRY_FREE.
7.8   More on Primitive Polynomials
@@ -2383,7 +2383,7 @@ GF-Complete will successfully select a default polynomial in the following compo
6     FURTHER INFORMATION ON OPTIONS AND ALGORITHMS 33
-7.8.3 The Program gf poly for Verifying Irreducibility of Polynomials
+7.8.3 The Program gf_poly for Verifying Irreducibility of Polynomials
The program gf_poly uses the Ben-Or algorithm[GP97] to determine whether a polynomial with coefficients in GF(2w )
is reducible. Its syntax is:
@@ -2640,8 +2640,8 @@ stored in 16 16-byte regions.
7.9.2   Alternate mappings with "COMPOSITE"
With "COMPOSITE," the alternate mapping divides the middle region in half. The lower half of each word is stored
-in the first half of the middle region, and the higher half is stored in the second half. To illustrate, gf example 6
-performs the same example as gf example 5, except it is using "COMPOSITE" in GF((216)2), and it is multiplying
+in the first half of the middle region, and the higher half is stored in the second half. To illustrate, gf_example_6
+performs the same example as gf_example_5, except it is using "COMPOSITE" in GF((216)2), and it is multiplying
a region of 120 bytes rather than 60. As before, the pointers are not aligned on 16-bit quantities, so the region is broken
into three regions of 4 bytes, 96 bytes, and 20 bytes. In the first and third region, each consecutive four byte word is a
word in GF(232). For example, word 0 is 0x562c640b, and word 25 is 0x46bc47e0. In the middle region, the low two
@@ -2847,14 +2847,14 @@ section 7.1.
- MOA_Random_W() in gf_rand.h: Creates a random w-bit number, where w ≤ 32.
- MOA_Seed() in gf_rand.h: Sets the seed for the random number generator.
- gf_errno in gf_complete.h: This is to help figure out why an initialization call failed. See section 6.1.
-- gf_create_gf_from_argv() in gf method.h: Creates a gf t using C style argc/argv. See section 6.1.1.
+- gf_create_gf_from_argv() in gf_method.h: Creates a gf_t using C style argc/argv. See section 6.1.1.
- gf_division_type_t in gf_complete.h: the different ways to specify division when using gf_init_hard(). See
section 6.4.
- gf_error() in gf_complete.h: This prints out why an initialization call failed. See section 6.1.
-- gf_extract in gf_complete.h: This is the data type of extract_word() in a gf t. See section 7.9 for an example
+- gf_extract in gf_complete.h: This is the data type of extract_word() in a gf_t. See section 7.9 for an example
of how to use extract word().
-
+
@@ -3028,7 +3028,7 @@ composite field too. See 7.8.2 for the fields where GF-Complete will support def
explanation
-- "ALTMAP" is confusing. We agree. Please see section 7.9 for more explanation.
+- "ALTMAP" is confusing. We agree. Please see section 7.9 for more explanation.
- I used "ALTMAP" and it doesn't appear to be functioning correctly. With 7.9, the size of the region and
its alignment both matter in terms of how "ALTMAP" performs multiply_region(). Please see section 7.9 for
@@ -3065,7 +3065,7 @@ per second.
As would be anticipated, the inlined operations (see section 7.1) outperform the others. Additionally, in all
cases with the exception of w = 32, the defaults are the fastest performing implementations. With w = 32,
-"CARRY FREE" is the fastest with an alternate polynomial (see section 7.7). Because we require the defaults to
+"CARRY_FREE" is the fastest with an alternate polynomial (see section 7.7). Because we require the defaults to
use a "standard" polynomial, we cannot use this implementation as the default.
11.2   Divide()
@@ -3126,9 +3126,9 @@ For these tables, we performed 1GB worth of multiply_region() calls for a
-m TABLE (Default) - | 11879.909 |
-m TABLE -r CAUCHY - | 9079.712 |
--m BYTWO b - | 5242.400 |
--m BYTWO p - | 4078.431 |
--m BYTWO b -r NOSSE - | 3799.699 |
+-m BYTWO_b - | 5242.400 |
+-m BYTWO_p - | 4078.431 |
+-m BYTWO_b -r NOSSE - | 3799.699 |
-m TABLE -r QUAD - | 3014.315 |
-m TABLE -r DOUBLE - | 2253.627 |
@@ -3138,7 +3138,7 @@ For these tables, we performed 1GB worth of multiply_region() calls for a
m SHIFT - | 157.749 |
--m CARRY FREE - | 86.202 |
+-m CARRY_FREE - | 86.202 |
@@ -3188,27 +3188,27 @@ of Computational Mathematics, pages 346–361. Springer Verlag, 1997.
-m SPLIT 8 4 (Default) | 13279.146 |
-m COMPOSITE 2 - -r ALTMAP - | 5516.588 |
-m TABLE -r CAUCHY - | 4968.721 |
--m BYTWO b - | 2656.463 |
+-m BYTWO_b - | 2656.463 |
-m TABLE -r DOUBLE - | 2561.225 |
-m TABLE - | 1408.577 |
--m BYTWO b -r NOSSE - | 1382.409 |
--m BYTWO p - | 1376.661 |
--m LOG ZERO EXT - | 1175.739 |
--m LOG ZERO - | 1174.694 |
+-m BYTWO_b -r NOSSE - | 1382.409 |
+-m BYTWO_p - | 1376.661 |
+-m LOG_ZERO_EXT - | 1175.739 |
+-m LOG_ZERO - | 1174.694 |
-m LOG - | 997.838 |
-m SPLIT 8 4 -r NOSSE - | 885.897 |
--m BYTWO p -r NOSSE - | 589.520 |
+-m BYTWO_p -r NOSSE - | 589.520 |
-m COMPOSITE 2 - - | 327.039 |
-m SHIFT - | 106.115 |
--m CARRY FREE - | 104.299 |
+-m CARRY_FREE - | 104.299 |
@@ -3272,14 +3272,14 @@ Practice & Experience, 27(9):995-1012, September 1997.
-m SPLIT 8 8 - | 2163.993 |
-m SPLIT 16 4 -r NOSSE - | 1148.810 |
-m LOG - | 1019.896 |
--m LOG ZERO - | 1016.814 |
--m BYTWO b - | 738.879 |
+-m LOG_ZERO - | 1016.814 |
+-m BYTWO_b - | 738.879 |
-m COMPOSITE 2 - - | 596.819 |
--m BYTWO p - | 560.972 |
+-m BYTWO_p - | 560.972 |
-m GROUP 4 4 - | 450.815 |
--m BYTWO b -r NOSSE - | 332.967 |
--m BYTWO p -r NOSSE - | 249.849 |
--m CARRY FREE - | 111.582 |
+-m BYTWO_b -r NOSSE - | 332.967 |
+-m BYTWO_p -r NOSSE - | 249.849 |
+-m CARRY_FREE - | 111.582 |
-m SHIFT - | 95.813 |
@@ -3321,21 +3321,21 @@ of the Association for Computing Machinery, 36(2):335-348, April 1989.
-m SPLIT 32 4 (Default)
-m COMPOSITE 2 -m SPLIT 16 4 -r ALTMAP - -r ALTMAP -
-m COMPOSITE 2 - -r ALTMAP -
--m SPLIT 8 8
--m SPLIT 32 8
--m SPLIT 32 16
+-m SPLIT 8 8 -
+-m SPLIT 32 8 -
+-m SPLIT 32 16 -
-m SPLIT 8 8 -r CAUCHY
-m SPLIT 32 4 -r NOSSE
--m CARRY FREE -p 0xc5
+-m CARRY_FREE -p 0xc5
-m COMPOSITE 2 -
--m BYTWO b
--m BYTWO p
--m GROUP 4 8
--m GROUP 4 4
--m CARRY FREE
--m BYTWO b -r NOSSE
--m BYTWO p -r NOSSE
--m SHIFT
+-m BYTWO_b -
+-m BYTWO_p -
+-m GROUP 4 8 -
+-m GROUP 4 4 -
+-m CARRY_FREE -
+-m BYTWO_b -r NOSSE -
+-m BYTWO_p -r NOSSE -
+-m SHIFT -
@@ -3382,16 +3382,16 @@ of the Association for Computing Machinery, 36(2):335-348, April 1989.
-m COMPOSITE 2 - -r ALTMAP -
-m SPLIT 64 16 -
-m SPLIT 64 8 -
--m CARRY FREE -
+-m CARRY_FREE -
-m SPLIT 64 4 -r NOSSE -
-m GROUP 4 4 -
-m GROUP 4 8 -
--m BYTWO b -
--m BYTWO p -
+-m BYTWO_b -
+-m BYTWO_p -
-m SPLIT 8 8 -
--m BYTWO p -r NOSSE -
+-m BYTWO_p -r NOSSE -
-m COMPOSITE 2 - -
--m BYTWO b -r NOSSE -
+-m BYTWO_b -r NOSSE -
-m SHIFT -
@@ -3446,17 +3446,17 @@ of the Association for Computing Machinery, 36(2):335-348, April 1989.
--m SPLIT 128 4 -r ALTMAP-
--m COMPOSITE 2 -m SPLIT 64 4 -r ALTMAP - -r ALTMAP-
--m COMPOSITE 2 - -r ALTMAP-
--m SPLIT 128 8 (Default)-
--m CARRY FREE -
+-m SPLIT 128 4 -r ALTMAP -
+-m COMPOSITE 2 -m SPLIT 64 4 -r ALTMAP - -r ALTMAP -
+-m COMPOSITE 2 - -r ALTMAP -
+-m SPLIT 128 8 (Default) -
+-m CARRY_FREE -
-m SPLIT 128 4 -
-m COMPOSITE 2 -
-m GROUP 4 8 -
-m GROUP 4 4 -
--m BYTWO p -
--m BYTWO b -
+-m BYTWO_p -
+-m BYTWO_b -
-m SHIFT -