Recently I was talking to a friend of mine who is doing his thesis on parallelizing software on the CELL processor (the one used in the PS3). When benchmarking, they found a program that runs nearly four times faster when compiled with the -O3 option. I’ve always specified optimization flags in my projects, and I was aware that they speed up the code a lot, but I was surprised by that figure: four times faster, and without changing the source code!
So, I decided to write a blog post about it. First, let’s start with GCC basics. To compile a C++ file you usually type:
g++ -c file.cpp
g++ -o file file.o
The first line compiles the source code into an object file (file.o), and the second performs the linking step.
To specify an optimization flag, you add -OX to the compile command, where X can be 0, 1, 2, 3 or s, like this:
g++ -O3 -c file.cpp
g++ -o file file.o
OK, but which optimization flag should you choose?
When debugging code, it is best to use -O0, which is equivalent to passing no optimization flag at all. This is because optimizations reorder and remove code, which makes it hard for a debugger to single-step through your source. (Of course, if you want to debug your code with gdb, you’ll also want to enable debugging information by passing the -g option to the compiler.)
When you are releasing your application, it is time to enable optimizations. My advice is to choose the optimization flags in this way:
- By default, use -O2. It makes your code run fast enough for most applications.
- If your application needs to run very fast, use -O3. This can increase the size of the executable file (for example, because it might perform loop unrolling), but it usually makes it run faster. Measure to make sure it pays off for your program.
- If you need to minimize the executable size, use -Os, plus the -s option at link time, which strips the symbol table and relocation information from the executable.