The stack reuse trick
Not much has been written about code golf in general—let alone about golfing in C, of all languages. My contribution here is to share with the world an old but spicy technique that is, in my opinion, quite unique to C.
Disclaimer: Since this is code golf, no guarantee on portability or safety should be expected. The snippets here should work when compiled without optimisations for a 32-bit linux system on x86. You can compile 32-bit binaries on a 64-bit machine with GCC or Clang using the -m32
option.
Consider this simple program that echos a line from standard inputHow dangerous is writing to &s
? Since s
is a parameter of main
, it is stored on the stack before anything else. Since the stack grows towards lower addresses, the gets
call will not touch the part of the stack we care about. It may overwrite data stored on the stack before main
is called such as the frame pointer to the program entry point but this doesn’t matter since the program exits right after main
returns. Empirically, we are good if the input isn’t too long—a reasonable assumption because having to allocate memory makes a boring game of golf..
I claim that it can be shortened even further.
You should verify that this compiles and behaves as expected. As a fun puzzle, pause for a moment and think about why this works before reading on. Hint: the title is a spoiler.
Implicit function declarations
First of all, why does this even compile?
There exists an old language rule from the days of K&R C that deals with implicit declarations. That is, how a compiler should handle calls to an arbitrary function f
whose declaration cannot be found. Most people expect this to be a compile error, but the standard states that an implicit declaration of int f()
should be generated by defaultThis rule has been banished from the standard since C99. Nonetheless, GCC and clang continue to compile code that rely on this behavior (even on -std=c99
), which explains the warnings if you tried to compile the example..
Question: What does int f();
mean exactly?
Surprisingly, it is not a declaration for a function that takes no arguments and returns an int
. That would be int f(void);
. According to the standard, this instead declares a function with an unspecified but fixed list of parameters. Since parameter information is not provided by the programmer, it is not possible for the compiler to type-check the arguments in a call to such a functionA feature that worsens type safety for no gain has got to be legacy cruft. The existence of such a backwards rule can be traced all the way back to K&R C. More details can be found in this stack overflow answer..
The above rules explain some things about our example:
- Since functions have implicit declarations, we can use
puts
andgets
without the appropriate#include
Omitting header includes is one of the first tricks a new golfer learns. Now, we know why it is allowed.. - Without the prototypes in the header files, we are allowed to call
puts
with no arguments and our code will still compile.
This explains why the program compiles. But why does it behave correctly even with a missing argument to puts()
? As the semantics of such a function call is undefined, the most reliable way to figure out what happens during runtime is to investigate the generated binary.
i386 calling convention
Our answers lie in the calling convention followed by the compiler which can be found in section 2.2 of the i386 System V ABI document.
We will assume our arguments have 32-bit size types like int
or pointers. This is the default argument width for implicit declarationsThis means we can’t use functions that take char
arguments. In practice, it doesn’t matter since standard library functions that should take char
arguments are defined to take int
s instead.Example: int putchar(int c);
. The following steps are involved in a function call:
The caller shifts the stack pointer upwards.
The caller pushes the arguments onto the stack from left to right.
The function is
call
ed, which also pushes the return address onto the stack. The offset in step 1 is chosen such that the stack pointer is now properly alignedAn invariant maintained by the calling convention is that stack frames are aligned to 16-byte addresses..The callee pushes the old base pointer onto the stack and sets its base pointer
Execution continues within this new frame. The arguments are accessed through fixed offsets from the base pointer. For reference, this is the stack layout:
frame address content ... (may contain padding) caller ebp+12 arg 2 ebp+8 arg 1 -------------------------------------- (16-byte aligned) callee ebp+4 return address ebp base pointer of caller ...
During cleanup, the callee and caller take turns to shift the stack pointer such that it is restored to the address before any of this happened.
We can see this in the disassembly:
puts:
push ebp
mov ebp, esp
sub esp, 8
... ; expects argument at [ebp+8]
leave
ret
main:
; main's prologue
...
; gets(&s)
sub esp, 12
push eax ; push &s
call gets
add esp, 16 ; restore stack pointer
; puts()
call puts
; main's epilogue
...
Putting it together
We now have enough information to explain how it all works.
Notice that even though we did not call puts with any argument, both stack frames start at the same address (and so do their arguments at ebp+8
). This is a side effect of the alignment requirement. Since both gets
and puts
are called with fewer than four arguments, the addresses of their stack frames are rounded up to the same 16-byte boundary.
Between the two calls, no other stack frame is created at the same address. This means that argument given to gets
still exists on the stack by the time puts
is called. puts
ends up retrieving the value of the gets
parameter outside its lifetime since they overlap in memory, and this value is also the expected value &s
. Observationally, the stack is set up correctly for puts
even though the caller messed up!
Since stack allocations are predictable, we can reliably line up function calls to save bytes by reusing transient values that still exist on the stack. The gets
-puts
pairing is a common instance of this technique applied. Some others that I remember seeing in the wild include:
read
andwrite
gets
andatoi
scanf
andvprintf