6.1. The /etc/profile file

The /etc/profile file contains system-wide environment settings and startup programs. Any customization you put in this file applies to the environment of every user who logs in to the system, so it is a convenient place to set compiler optimization flags. To squeeze the most performance from your x86 programs, you can compile with full optimization using the -O9 flag. Many programs specify -O2 in their Makefile. GCC treats any level above -O3 as -O3, so -O9 simply requests the highest level of optimization available. The resulting binaries are usually larger, but they often run faster.

Please note that the -O9 flag does not always give the best performance for your processor. On an i686 or later processor it usually does; on anything older than an i686, not necessarily.

When compiling, you can use the -fomit-frame-pointer switch on any kind of processor. It frees the register normally reserved for the frame pointer, so local variables are addressed through the stack pointer instead. Unfortunately, debugging is almost impossible with this option. You can also use the -mcpu=cpu_type and -march=cpu_type switches to optimize the program for the listed CPU to the best of GCC's ability. Note that -mcpu only tunes instruction scheduling, whereas code built with -march will run only on the indicated CPU or a later one.

These optimization options apply only when we compile and install a new program on our server. They have no effect on the programs already installed with our Linux base system; they only tell the compiler how to optimize the new programs we build, provided the build scripts read the CFLAGS variable we have set in the /etc/profile file.
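The mechanism behind this is the ordinary environment inheritance of the shell: a build tool is a child process, and it sees CFLAGS only if the variable has been exported. The following is a minimal sketch of that behavior, not part of the original procedure; the flag values and the names not_exported and exported are chosen purely for illustration, and sh -c stands in for a build tool.

```shell
# Start from a clean state so an inherited CFLAGS does not skew the demo.
unset CFLAGS

# Set but NOT exported: visible in this shell only.
CFLAGS="-O2 -fomit-frame-pointer"
not_exported=$(sh -c 'printf "%s" "$CFLAGS"')

# Exported: child processes (standing in for make or configure) now see it.
export CFLAGS
exported=$(sh -c 'printf "%s" "$CFLAGS"')

echo "child before export: [$not_exported]"
echo "child after export:  [$exported]"
```

Whether a given package actually honors CFLAGS depends on its build scripts, so it is worth checking the compiler command lines printed during a build.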

Below are the optimization flags that we recommend you put in your /etc/profile file depending on your CPU architecture.

Recommended optimization flags

  1. For CPU i686 (PentiumPro, Pentium II, Pentium III): in the /etc/profile file, put this line for the PentiumPro, Pentium II, and Pentium III processor family:
    
               CFLAGS="-O9 -funroll-loops -ffast-math -malign-double -mcpu=pentiumpro -march=pentiumpro -fomit-frame-pointer -fno-exceptions"
                   

    For CPU i586 (Pentium): in the /etc/profile file, put this line for the Pentium processor family:
    
               CFLAGS="-O3 -march=pentium -mcpu=pentium -ffast-math -funroll-loops -fomit-frame-pointer -fforce-mem -fforce-addr -malign-double -fno-exceptions"
                   

    For CPU i486: in the /etc/profile file, put this line for the i486 processor family:
    
               CFLAGS="-O3 -funroll-all-loops -malign-double -mcpu=i486 -march=i486 -fomit-frame-pointer -fno-exceptions"
                   

  2. After selecting the settings for your CPU (i686, i586, or i486), go a bit further down in the /etc/profile file and add CFLAGS, LANG, and LESSCHARSET to the export line:
    
               export PATH PS1 HOSTNAME HISTSIZE HISTFILESIZE USER LOGNAME MAIL INPUTRC CFLAGS LANG LESSCHARSET
                   

  3. Log out and log back in. The new CFLAGS environment variable is then set, and configure scripts and other build tools will recognize it. Pentium Pro/II/III optimizations work only with the egcs or pgcc compilers. The egcs compiler is already installed on your server by default, so you don't need to worry about it.
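If the same /etc/profile is shared across machines with different processors, the choice among the three flag sets could be made automatically. The following is only a sketch under stated assumptions: select_cflags is a hypothetical helper name, the machine name is assumed to be what uname -m reports (i686, i586, or i486 on 32-bit x86 Linux), and the -O2 fallback for any other machine is our own conservative choice, not a recommendation from this section.

```shell
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the flag set recommended above for that CPU family.
select_cflags() {
    case "$1" in
        i686)
            printf '%s\n' "-O9 -funroll-loops -ffast-math -malign-double -mcpu=pentiumpro -march=pentiumpro -fomit-frame-pointer -fno-exceptions"
            ;;
        i586)
            printf '%s\n' "-O3 -march=pentium -mcpu=pentium -ffast-math -funroll-loops -fomit-frame-pointer -fforce-mem -fforce-addr -malign-double -fno-exceptions"
            ;;
        i486)
            printf '%s\n' "-O3 -funroll-all-loops -malign-double -mcpu=i486 -march=i486 -fomit-frame-pointer -fno-exceptions"
            ;;
        *)
            # Assumption: unknown CPU, fall back to a plain -O2.
            printf '%s\n' "-O2"
            ;;
    esac
}

# Pick the flags for the machine we are running on and export them.
CFLAGS=$(select_cflags "$(uname -m)")
export CFLAGS
```

After logging back in, running echo "$CFLAGS" is a quick way to confirm which set was selected.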

Below is an explanation of the different optimization options we use:

-funroll-loops

The -funroll-loops optimization option will perform loop unrolling, but only for loops whose number of iterations can be determined at compile time or at run time.

-funroll-all-loops

The -funroll-all-loops optimization option will also perform loop unrolling, but for all loops, even when the number of iterations is not known on entry to the loop. This usually makes programs run more slowly.

-ffast-math

The -ffast-math optimization option will allow the GCC compiler, in the interest of optimizing code for speed, to violate some ANSI or IEEE rules and specifications for floating-point math.

-malign-double

The -malign-double optimization option will control whether the GCC compiler aligns double, long double, and long long variables on a two-word boundary or a one-word boundary. Aligning them on a two-word boundary produces code that runs somewhat faster on a Pentium at the expense of more memory.

-mcpu=cpu_type

The -mcpu=cpu_type optimization option will set the CPU type whose defaults GCC assumes when scheduling instructions.

-fforce-mem

The -fforce-mem optimization option will produce better code by forcing memory operands to be copied into registers before doing arithmetic on them and by making all memory references potential common subexpressions.

-fforce-addr

The -fforce-addr optimization option will produce better code by forcing memory address constants to be copied into registers before doing arithmetic on them.

-fomit-frame-pointer

The -fomit-frame-pointer optimization option, one of the most interesting, will allow the program to not keep the frame pointer in a register for functions that don't need one. This avoids the instructions to save, set up, and restore frame pointers. It also makes an extra register available in many functions, but it makes debugging impossible on most machines.

Important: All future optimizations that we describe in this book assume a Pentium II/III CPU family by default. If your CPU differs, you must adjust the compilation flags for your specific processor type in the /etc/profile file and also when you compile each program.
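One way to adjust the flags for a single compilation without editing /etc/profile is to prefix the build command with a temporary CFLAGS assignment. The sketch below illustrates the shell behavior only; the shortened flag values and the names per_build and afterwards are hypothetical, and sh -c stands in for the real build command.

```shell
# System-wide default, as it would come from /etc/profile (shortened here).
export CFLAGS="-O9 -mcpu=pentiumpro -march=pentiumpro"

# A prefix assignment changes CFLAGS for that one command only.
per_build=$(CFLAGS="-O3 -mcpu=i486 -march=i486" sh -c 'printf "%s" "$CFLAGS"')

# Later commands still see the system-wide value.
afterwards=$(sh -c 'printf "%s" "$CFLAGS"')

echo "this build: $per_build"
echo "afterwards: $afterwards"
```

In practice the prefixed command would be something like ./configure or make, assuming the package's build scripts honor CFLAGS.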