restrict, yes it does make a difference Friday 22nd June 2007

restrict is a new keyword in C99 (ISO/IEC 9899:1999) designed to enable the compiler to make more optimizations in response to programmers’ guarantees. You can read the full meaning of the keyword in the standard but an oversimplification would be that if a pointer is marked as restrict then it is a guarantee that the objects referred to through that pointer are not accessible in any other way in the block containing the pointer’s scope.

Here’s an example. Suppose we have a function that takes a pointer to two ints which are consecutive members of the Fibonacci sequence and the function is required to change the values of the two ints to the next distinct pair of numbers in the sequence. In other words, if the initial terms are A and B, with A being the later member of the sequence, then the next two terms are A + B and (A + B) + A = 2A + B. You might do this as follows:

void fibincr2(int *a, int *b)
{
    *a += *b; // *a contains A + B
    *b -= *a; // *b contains -A
    *a -= *b; // *a contains 2A + B
    *b += *a; // *b contains A + B
}

Here is the assembly generated by gcc -O3 -S -std=c99 -pedantic -Wall -fomit-frame-pointer with my comments added. I’ve taken A and B to be the initial values of *a and *b and put the value of the result of each instruction in the right hand column in terms of A and B.

Note where we reload eax from *a despite the fact that eax was stored to *a earlier and eax was not change in the intervening code.

fibincr2:
    pushl   %ebx            # Save ebx register
    movl    8(%esp), %ebx   # ebx is a (type int*)
    movl    12(%esp), %ecx  # ecx is b
    movl    (%ebx), %eax    # eax = *a      A
    addl    (%ecx), %eax    # eax += *b     A + B
    movl    %eax, (%ebx)    # *a = eax      A + B
    movl    (%ecx), %edx    # edx = *b      B
    subl    %eax, %edx      # edx -= eax    -A
    movl    %edx, (%ecx)    # *b = edx      -A
    movl    (%ebx), %eax    # eax = *a      A + B (so long as a and b are distinct)
    subl    %edx, %eax      # eax -= edx    2A + B
    movl    %eax, (%ebx)    # *a = eax      2A + B
    addl    %eax, (%ecx)    # *b += eax     A + B
    popl    %ebx            # restore ebx register
    ret

Of course, this all breaks down when a and b are pointing at the same int, but then the function is useless in this case so we may as well state this.

void fibincr2(int * restrict a, int * restrict b)
{
    *a += *b; // *a contains A + B
    *b -= *a; // *b contains -A
    *a -= *b; // *a contains 2A + B
    *b += *a; // *b contains A + B
}

And now the new assembly:

fibincr2:
    pushl   %ebx            # Save ebx register
    movl    8(%esp), %ebx   # ebx is a (type int*)
    movl    12(%esp), %ecx  # ecx is b
    movl    (%ebx), %eax    # eax = *a      A
    movl    %eax, %edx      # edx = eax     A
    negl    %eax            # eax = -eax    -A
    addl    (%ecx), %edx    # edx += *b     A + B
    movl    %eax, (%ecx)    # *b = eax      -A
    subl    %eax, %edx      # edx -= eax    2A + B
    addl    %edx, (%ecx)    # *b += edx     A + B
    movl    %edx, (%ebx)    # *a = edx      2A + B
    popl    %ebx            # restore ebx register
    ret

Note that now the pointers are marked as restrict the compiler no longer has to generate a read for the *a after write to *b if it already has the result of a previous read (or write to) *a in a register, as the compiler now knows that the write to *b cannot have affected *a.

So previously, ignoring the initial load of parameters, we had six “mov” (one register only), two “add” (both involving memory) and two “sub” (both register only).

After adding restrict we now have four “mov” (one register only), two “add” (both involving memory), one “sub” and one “neg” (both register only). In other words we’ve save two memory access instructions in a ten instruction algorithm. Go us.

Of course, in this case, the use of a temporary would produce even more efficient code.

void fibincr2(int * restrict a, int * restrict b)
{
    int c = *a;   //  c contains A
    *a += c + *b; // *a contains 2A + B
    *b += c;      // *b contains A + B
}
fibincr2:
    pushl   %ebx               # save ebx register
    movl    8(%esp), %ebx      # ebx is a (type int*)
    movl    12(%esp), %edx     # edx is b
    movl    (%ebx), %ecx       # ecx = *a           A
    leal    (%ecx,%ecx), %eax  # eax = ecx + ecx    2A
    addl    (%edx), %eax       # eax += *b          2A + B
    addl    %ecx, (%edx)       # *b += ecx          A + B
    movl    %eax, (%ebx)       # *a = eax           2A + B
    popl    %ebx               # restore ebx register
    ret

Two “mov”, two “add” and a “lea”. Go us.

Addendum
The example function and original algorithm was chosen for this entry because it exhibits behaviour that is affected by using the restrict keyword and because the function performs a readily comprehensible action that isn’t stretching the bounds of “plausibly useful” too much. The final algorithm shows that this particular problem can be solved in a more efficient way, but I was struggling to find an example that was succinct enough without being overly artificial.

Comments are closed.