PGI Compiler Suite reference card
PGI Compiler Suite
|pgcc||C compiler driver.|
|pgCC||C++ compiler driver.|
|Fortran compiler driver.|
|pghpf||High Performance Fortran compiler driver.|
|pgcpuid||Display the CPU type the compiler sees and display the default -tp switch it will use.|
|pgaccelinfo||Display the accelerator GPU the compiler sees.|
|.c||C source files.|
|.f/for/f90/f95||Fortran source files.|
|.F/FOR/F90/F95||Fortran source files (containing macros) to be processed by the Fortran preprocessor.|
|.hpf||High Performance Fortran source files.|
|.cuf||Fortran source files with CUDA extensions.|
|.CUF||Fortran source files with CUDA extensions to be processed by the Fortran preprocessor.|
|.h||C/C++ header files.|
|.i||Preprocessed C source files.|
|.C/cc||C++ source files.|
|.d||Dependency files. They contain rules suitable for a Makefile, describing the dependencies of the source file.
Created by the -MD option.
Now the compiler...
Beginning with version 7.0, default compiler options
can be placed in the ~/.mypgirc file (for every PGI compiler),
the ~/.mypgccrc file (for the C compiler),
the ~/.mypgcpprc file (for the C++ compiler),
the ~/.mypgfortranrc file (for the Fortran compiler), etc. The file should contain a line such as
append PREOPTIONS=-fast; append POSTOPTIONS=-Mipa;
(Notice the semicolons.)
That is, you can set at most two default compiler options, one
of which precedes everything on the command line, while the other
follows everything on the command line. If you have more than one
append PREOPTIONS=.. or append POSTOPTIONS=.., only
the FIRST occurrence will be used.
Also note that you cannot use
spaces in the options. For example, instead of -tp barcelona-64, you
must use -tp=barcelona-64. Moreover, not all command-line
options can be used. For example, -### is not allowed.
|-c||Compile *.c and assemble *.s. NO linking.|
|-Idir||Also search dir for header files.
This can also be controlled by environment variables
|-S||Compile *.c into assembly code *.s. NO linking.|
|-Manno||Make the generated assembly code more readable.|
|-E||Run preprocessor only. The output is sent to stdout.|
|-C||When running preprocessor, don't discard comments in the program.|
|-dM||Display definitions of all built-in macros.|
|-o file||Place output in file|
|-v||When compiling, also display the programs invoked by the compiler.|
|Display the programs invoked by the driver and exit.|
|-drystdin||Display standard header directories and exit.|
|-show||Display detailed information of current driver.|
|-V||Display the version number.|
|-#||When compiling, also display the programs invoked by the compiler.|
|-help=hidden||Display all available compiler switches, including
the hidden & undocumented ones (Yes, PGI has many of them!)
|-A||Follow strict ANSI C++ standard.|
|-a||Follow proposed ANSI C++ standard.|
|-B||Accept C++ style comments in C code.|
|--gnu_extensions||Accept GNU extensions.|
mode can be align, allcores, bind, nonuma, numa (use thread-CPU affinity).
|Predefine the macro name, with value 1, or with the specified value|
|-Uname||Un-define the (built-in or -D defined) macro name|
|Output a rule (to stdout) suitable for a Makefile, describing the dependencies of the source file.
-MM only outputs header files not in the system header directories.
This option implies -E option.
|-MD||The same as -M, but *.d files will be generated.|
|-MMD||The same as -MM, but *.d files will be generated.|
|-Minform=warn||Show warning messages.|
|-w||Suppress all warnings.|
|-Ldir||Also search dir for library files.
This can also be controlled by environment variable
|-llibrary||Link to liblibrary
The linker searches libraries and object files in the order they are specified, so
foo.o -lz bar.o
will search library z after file foo.o but before bar.o, so if bar.o refers to functions
|-s||Remove all symbol information from the executable|
|-Bstatic||Produce statically linked executable|
|Produce shared libraries. For details, see here.|
|-Mnostartup||Don't link to the standard startup files (so the entry point of a program is not main, but _start).
To compile crt1.o, one has to use this option.
Also see here for examples.
|-Mnostdlib||Don't link to the standard system libraries (e.g. libgcc.a) or startup files.|
|Whether PGI-provided libraries should be statically or dynamically linked.|
|Link to C++, PGF77, or PGF90 runtime libraries.|
|-Mmpi=mpilib||Link to MPI library.
mpilib can be mpich1, mpich2, hpmpi, mvapich1.
|-Mscalapack||Link to ScaLAPACK library.|
|-Rdir||Tell linker to add dir to the runtime shared/dynamic libraries search path.|
|-Wl,opt||Pass opt to the linker.|
|-rpath=dir||Tell linker to add dir to the runtime shared/dynamic libraries search path.|
|-m||Enable linker to output trace/link map information.|
|All the options between this pair are passed to the linker.|
|-g||Produce debugging information.|
|-gopt||Produce debugging information in the presence of optimization.|
|-Mkeepasm||Save all temporary/intermediate assembly files produced during compiling.|
|-traceback||Add debug information for runtime traceback. Should be used together
|-pg||Produce profiling information for pgprof.|
|-Mprof=option||Produce profiling information for pgprof.
option can be func, hwcts (PAPI must be installed), lines, mpich1
|-O2||Optimize even more.
This is default.
|Optimize yet more.|
This implies -O2 and other optimizations such as
loop unrolling, SSE instructions, loop redundancy elimination (LRE),
partial redundancy elimination (PRE),
Flush To Zero (FTZ) & Denormals Are Zero (DAZ) modes, etc.
|-Msmart||Invoke a post-pass assembly instruction scheduling optimization.|
|-Mdaz||Treat denormal values used as input to floating-point instruction as 0.|
|-Mflushz||Set denormal results from floating-point calculations to 0.|
Generate fast but less accurate code for math functions
(division, reciprocal, square root, reciprocal square root, etc)
Generate fast but low-precision code for math functions
(division, reciprocal, reciprocal square root)
Perform floating-point operations in strict conformance with the IEEE 754
standard. Some optimizations are disabled.
|-Minline||Enable function inlining.|
|-Mipa=fast,inline||Link time/Inter-procedural optimization.|
|Display compile-time optimization information
lvl can be all, ccff, ftn, ipa, loop, lre, mp, opt, par, pfo, unroll, vect..
Note: CCFF means "Common Compiler Feedback Format"
|-Mneginfo||Display messages explaining why certain optimizations are disabled
Generate extra code after every function call to ensure that
the FPU register stack is in the expected state.
|-Msmartalloc=huge||Link to the huge page runtime library.|
|Profile guided optimization (PGO).|
|-Mconcur||Automatically parallelize loops.|
|-Mvect||Automatically vectorize loops.|
|-tp cpu||Generate code for specific cpu, e.g.
athlon, barcelona, barcelona-64, core2-64, istanbul-64, nehalem-64, p7-64, penryn-64, shanghai-64 ...
|-help=target||List all cpu values that can be used
with the "-tp cpu" switch.
|-ta=nvidia,sub_options||Generate code for NVIDIA
accelerator with specific sub_options, e.g.
cc20, cuda2.3, cuda3.0, fastmath...
|-pc=n||Round the significand to n bits, n can be 32, 64, 80.|
|-W0,-beta -#||(Undocumented) Enable beta release optimizations.|
|-Mchkstk||Generate code to check for sufficient stack space upon subprogram entry.|
|-Mbounds||Generate code to check array bounds|
|-Mbyteswapio||(Fortran) Swap byte-order (big-endian to little-endian or vice
versa) during I/O of Fortran unformatted data.
|-Mchkptr||(Fortran) Check for NULL pointers.|
|-Mcray||(Fortran) Enable Cray compatibility mode.|
|(Fortran) Enable CUDA Fortran.
Enable emulation mode.
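The -D and -U switches listed in the options table above are easiest to see with a small C file. The following is only a sketch: BUFFER_SIZE and the two function names are hypothetical and chosen for illustration, not taken from any PGI documentation. Compiled with no switch, the fallback value applies; compiled with -DBUFFER_SIZE=4096, the command-line value would be used instead.

```c
#include <stdio.h>

/* BUFFER_SIZE is a hypothetical macro name used only for this sketch.
   Building with -DBUFFER_SIZE=4096 overrides the fallback below;
   building with a bare -DBUFFER_SIZE gives the macro the value 1. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 1024
#endif

/* Return the buffer size that was selected at compile time. */
int configured_buffer_size(void)
{
    return BUFFER_SIZE;
}

/* Print the configured value so the effect of -D is visible at run time. */
void report_buffer_size(void)
{
    printf("BUFFER_SIZE = %d\n", configured_buffer_size());
}
```

Calling report_buffer_size() from main shows which value the build picked up. Running the preprocessor alone (-E) on the same file is another way to check how the macro expanded.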
Run-time environment variables
In addition to the standard OpenMP run-time environment variables,
the following variables also affect run-time behavior of
|NCPUS||(OpenMP) Specify the number of processes or threads used in parallel regions.|
|NCPUS_MAX||(OpenMP) Specify the maximum number of processes or threads used in parallel regions.|
|MP_SPIN||(OpenMP) Specify the number of times to check a semaphore before calling
sched_yield (on Linux or Mac OS X) or _sleep (on Windows).
|MP_BIND||(OpenMP) Set to y to use thread-CPU affinity (binding processes or threads to a physical core/processor).|
|MP_BLIST||(OpenMP) If MP_BIND is set to y,
this variable specifically defines the thread-CPU relationship,
overriding the default values.
|MPSTKZ||(OpenMP) Specify the number of bytes (e.g. 2m, 4m)
allocated for each thread to use as the private
stack for the thread.
|PGI_HUGE_PAGES||Specify the number of huge pages (2 MB).
The purpose of huge pages is to
|ACML_FAST_MALLOC||Set to 1 to use optimized memory management for the BLAS function dgemm
This is a new feature introduced in ACML version 4.4.0.
|These two parameters further fine tune the behavior of ACML_FAST_MALLOC.
By default the limit is set to 64 chunks of size 10,000,000 bytes.
|ACML_FAST_MALLOC_DEBUG||Set to any value to display the debugging information of ACML_FAST_MALLOC.|
|NO_STOP_MESSAGE||(Fortran) Set to any value to disable the Fortran STOP message when STOP is called.|
|FORTRANOPT||(Fortran) This controls Fortran I/O behavior. Its value is a comma-separated list of options, which
|PGI_TERM||This controls the stack trace-back and just-in-time debugging. Its value
is a comma-separated list of options, which can be:
Each option can be disabled (which is the default) by prefixing it with no, e.g. noabort.
|PGI_TERM_DEBUG||This controls how the debugger is invoked. For example, it can be set to
gdb --quiet --pid %d
to use GDB instead.
|Set to any value to display the stack usage when the program ends.|
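When a run behaves unexpectedly, it can help to confirm which of these variables the process actually sees. The sketch below is an assumption-laden illustration, not part of the PGI runtime: env_or and print_pgi_runtime_env are hypothetical helper names, and the code only reads the variables with the standard getenv call. It does not reproduce how the PGI runtime interprets them.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of an environment variable, or the given fallback
   string when the variable is not set. (Hypothetical helper.) */
const char *env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL) ? value : fallback;
}

/* Print a handful of the variables named in the table above.
   This only inspects them; the PGI runtime reads them on its own. */
void print_pgi_runtime_env(void)
{
    static const char *const names[] = {
        "NCPUS", "NCPUS_MAX", "MP_BIND", "MP_BLIST", "MPSTKZ", "PGI_TERM"
    };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; ++i)
        printf("%-10s = %s\n", names[i], env_or(names[i], "(unset)"));
}
```

Calling print_pgi_runtime_env() at the start of main gives a quick record of the settings in effect for that run.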
PGI C Compiler built-in macros
|__cplusplus||Defined if the C++ compiler is in use.|
|Name of the current input file (as a C string constant)
This is an ANSI C standard macro.
|__LINE__||Current input line number (as an integer constant)
This is an ANSI C standard macro
|Date & time on which the preprocessor is run. (as C string constants)
These are ANSI C standard macros.
|__TIMESTAMP__||Last modification time of the input file (as a C string constant)|
|Evaluate to 1 to mean the compiler is ISO standard conformant.
__STDC_VERSION__ evaluates to a long integer constant (e.g. 199901L) identifying the C standard version
__STDC__ is an ANSI C standard macro.
|Evaluate to integer constants representing the PGI
compiler version numbers (major/minor/patch level).
|__PGI||Defined for the PGI compiler.|
|Defined for x86_64.|
|Defined for processors that support MMX/SSE/SSE2... instructions.|
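A short C sketch of how the macros in this table might be used in practice. The function names (compiler_label, source_language, print_build_info) are hypothetical. Only macros named in the table are tested, and the code makes no claim about which other macros a given PGI release defines.

```c
#include <stdio.h>

/* Return a short label for the compiler, based on the __PGI macro. */
const char *compiler_label(void)
{
#ifdef __PGI
    return "PGI";
#else
    return "non-PGI";
#endif
}

/* Return "C++" when the file is compiled as C++, "C" otherwise. */
const char *source_language(void)
{
#ifdef __cplusplus
    return "C++";
#else
    return "C";
#endif
}

/* Print the standard location and build-time macros. */
void print_build_info(void)
{
    printf("compiler : %s\n", compiler_label());
    printf("language : %s\n", source_language());
    printf("location : %s:%d\n", __FILE__, __LINE__);
    printf("built on : %s %s\n", __DATE__, __TIME__);
#ifdef __STDC_VERSION__
    /* Only defined by compilers that implement C95 or later. */
    printf("C std    : %ld\n", (long)__STDC_VERSION__);
#endif
}
```

To see the full set of macros the compiler predefines, the -dM switch listed earlier on this card is the more direct route.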
www.acsu.buffalo.edu/~charngda/pgi.html. Slightly horrified to discover the page was gone, so have saved it as it has saved me more than a few times.