Introduction
Let's start with a short piece of code:
#include <stdio.h>

int main()
{
    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += i;
    }
    printf("sum = %d\n", sum);
    return 0;
}
After compiling the code with the -O2 option, look at the assembly instructions in the resulting binary:
gcc test.c -O2
objdump -d a.out
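The second line of the disassembly of main moves the immediate value 0x1356 (4950 in decimal) into the esi register. In other words, the program no longer runs the loop as written: the sum was worked out at compile time and the finished result is used directly. Compared with the assembly generated without -O2, many operations have been trimmed away. If you overlook subtle differences like this, the generated code can go against the developer's intent and even change the result the program is expected to produce.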
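To make the effect concrete, here is a minimal sketch of the same loop in two forms. The function names `sum_plain` and `sum_volatile` are hypothetical, and the sketch assumes that a `volatile` accumulator is enough to keep the loop in the generated code, since every access to a volatile object has to be performed. The exact instructions still depend on the GCC version and the target.

```c
/* Plain loop: at -O2 the compiler is free to evaluate it at compile time
 * and replace the whole body with the constant 4950 (0x1356). */
int sum_plain(void)
{
    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += i;
    }
    return sum;
}

/* volatile accumulator: each read and write of sum must really happen,
 * so the loop cannot be folded into a single constant. */
int sum_volatile(void)
{
    volatile int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += i;
    }
    return sum;
}
```

Both functions return the same value; the difference should only show up when comparing their disassembly in the `objdump -d` output.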
So which compiler options does -O2 actually include?
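# Explanation of the gcc options
# -Q: makes the compiler print each function name as it is compiled, and print some statistics at the end of each pass.
# When it appears before --help, the output of --help changes:
# instead of a general description of each option, it shows whether the option is enabled in the current compile command.
# For options that take a value, the value currently set is shown.
# --help=optimizers: show all optimization options
# Note: the output below comes from a Chinese-locale system, where "启用" means "enabled"; under an English locale, grep for "enabled" instead.
[root@localhost ~]# gcc -Q -O2 --help=optimizers | grep "启用" | wc -l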
129
[root@localhost ~]# gcc -Q -O2 --help=optimizers | grep "启用"
  -faggressive-loop-optimizations [启用]
  -falign-labels [启用]
  -fasynchronous-unwind-tables [启用]
  -fauto-inc-dec [启用]
  -fbranch-count-reg [启用]
  -fcaller-saves [启用]
  -fcode-hoisting [启用]
  -fcombine-stack-adjustments [启用]
  -fcompare-elim [启用]
  -fcprop-registers [启用]
  -fcrossjumping [启用]
  -fcse-follow-jumps [启用]
  -fdce [启用]
  -fdefer-pop [启用]
  -fdelete-null-pointer-checks [启用]
  -fdevirtualize [启用]
  -fdevirtualize-speculatively [启用]
  -fdse [启用]
  -fearly-inlining [启用]
  -fexpensive-optimizations [启用]
  -fforward-propagate [启用]
  -ffp-int-builtin-inexact [启用]
  -ffunction-cse [启用]
  -fgcse [启用]
  -fgcse-lm [启用]
  -fguess-branch-probability [启用]
  -fhoist-adjacent-loads [启用]
  -fif-conversion [启用]
  -fif-conversion2 [启用]
  -findirect-inlining [启用]
  -finline [启用]
  -finline-atomics [启用]
  -finline-functions-called-once [启用]
  -finline-small-functions [启用]
  -fipa-bit-cp [启用]
  -fipa-cp [启用]
  -fipa-icf [启用]
  -fipa-icf-functions [启用]
  -fipa-icf-variables [启用]
  -fipa-profile [启用]
  -fipa-pure-const [启用]
  -fipa-ra [启用]
  -fipa-reference [启用]
  -fipa-sra [启用]
  -fipa-vrp [启用]
  -fira-hoist-pressure [启用]
  -fira-share-save-slots [启用]
  -fira-share-spill-slots [启用]
  -fisolate-erroneous-paths-dereference [启用]
  -fivopts [启用]
  -fjump-tables [启用]
  -flifetime-dse [启用]
  -flra-remat [启用]
  -fmath-errno [启用]
  -fmove-loop-invariants [启用]
  -fomit-frame-pointer [启用]
  -foptimize-sibling-calls [启用]
  -foptimize-strlen [启用]
  -fpartial-inlining [启用]
  -fpeephole [启用]
  -fpeephole2 [启用]
  -fplt [启用]
  -fprefetch-loop-arrays [启用]
  -fprintf-return-value [启用]
  -freg-struct-return [启用]
  -frename-registers [启用]
  -freorder-blocks [启用]
  -freorder-blocks-and-partition [启用]
  -freorder-functions [启用]
  -frerun-cse-after-loop [启用]
  -frtti [启用]
  -fsched-critical-path-heuristic [启用]
  -fsched-dep-count-heuristic [启用]
  -fsched-group-heuristic [启用]
  -fsched-interblock [启用]
  -fsched-last-insn-heuristic [启用]
  -fsched-rank-heuristic [启用]
  -fsched-spec [启用]
  -fsched-spec-insn-heuristic [启用]
  -fsched-stalled-insns-dep [启用]
  -fschedule-fusion [启用]
  -fschedule-insns2 [启用]
  -fshort-enums [启用]
  -fshrink-wrap [启用]
  -fshrink-wrap-separate [启用]
  -fsigned-zeros [启用]
  -fsplit-ivs-in-unroller [启用]
  -fsplit-wide-types [启用]
  -fssa-backprop [启用]
  -fssa-phiopt [启用]
  -fstdarg-opt [启用]
  -fstore-merging [启用]
  -fstrict-aliasing [启用]
  -fstrict-volatile-bitfields [启用]
  -fthread-jumps [启用]
  -fno-threadsafe-statics [启用]
  -ftrapping-math [启用]
  -ftree-bit-ccp [启用]
  -ftree-builtin-call-dce [启用]
  -ftree-ccp [启用]
  -ftree-ch [启用]
  -ftree-coalesce-vars [启用]
  -ftree-copy-prop [启用]
  -ftree-cselim [启用]
  -ftree-dce [启用]
  -ftree-dominator-opts [启用]
  -ftree-dse [启用]
  -ftree-forwprop [启用]
  -ftree-fre [启用]
  -ftree-loop-if-convert [启用]
  -ftree-loop-im [启用]
  -ftree-loop-ivcanon [启用]
  -ftree-loop-optimize [启用]
  -ftree-phiprop [启用]
  -ftree-pre [启用]
  -ftree-pta [启用]
  -ftree-reassoc [启用]
  -ftree-scev-cprop [启用]
  -ftree-sink [启用]
  -ftree-slsr [启用]
  -ftree-sra [启用]
  -ftree-switch-conversion [启用]
  -ftree-tail-merge [启用]
  -ftree-ter [启用]
  -ftree-vrp [启用]
  -funwind-tables [启用]
  -fvar-tracking [启用]
  -fvar-tracking-assignments [启用]
  -fweb [启用]
The same command also shows which options -O3 and -O1 enable (a plain -O is equivalent to -O1). For example:
gcc -Q -O1 --help=optimizers | grep "启用"
gcc -Q -O3 --help=optimizers | grep "启用"
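Explaining each individual option would require too much background to cover here. You can look them up in the man page, or in the GNU online documentation: Option Summary (Using the GNU Compiler Collection (GCC)).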
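Besides asking the compiler on the command line, code can also check at compile time whether it is being built with optimization. This is a minimal sketch that relies on the predefined macros `__OPTIMIZE__` (defined by GCC in optimizing compilations) and `__OPTIMIZE_SIZE__` (defined when optimizing for size). The function name `build_mode` is hypothetical. These macros only tell you that optimization is on, not which individual -f options are enabled.

```c
/* Returns a short description of how this translation unit was compiled. */
const char *build_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "optimized for size";
#elif defined(__OPTIMIZE__)
    return "optimized";
#else
    return "not optimized";
#endif
}
```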
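How do you disable optimization for a single function?
One approach is to put an empty assembly statement in the function body (not recommended, since it only inserts a statement the compiler must keep and does not turn off optimization for the rest of the function):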
void func()
{
    // ...
    asm volatile("");
    // ...
}
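For context, an empty asm statement is more commonly used as a barrier than as an optimization switch. The sketch below assumes GCC-style inline assembly; the names `keep_call` and `busy_wait` are hypothetical. It uses the `__asm__` spelling, which also compiles under strict ISO modes such as -std=c11. The rest of each function is still optimized as usual.

```c
/* noinline plus an empty asm: a pattern the GCC manual describes for keeping
 * calls to a side-effect-free function from being optimized away. */
__attribute__((noinline)) int keep_call(int a, int b)
{
    __asm__ volatile("");
    return a + b;
}

/* Busy-wait loop: a volatile asm statement may not be deleted, and the
 * "memory" clobber makes it act as a compiler barrier, so the otherwise
 * empty loop is kept. Returns the number of iterations performed. */
unsigned busy_wait(unsigned n)
{
    unsigned i;
    for (i = 0; i < n; i++) {
        __asm__ volatile("" ::: "memory");
    }
    return i;
}
```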
Another approach is to add a preprocessor directive:
#pragma GCC optimize ("O0")
void func()
{
    // ...
}
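One caveat with the pragma is that it applies to every function defined after it in the file, not just the next one. The sketch below limits its scope with `#pragma GCC push_options` and `#pragma GCC pop_options`. These pragmas are GCC-specific and other compilers may ignore them; the function names `unoptimized_sum` and `optimized_sum` are hypothetical.

```c
/* Save the current optimization settings, then switch to -O0. */
#pragma GCC push_options
#pragma GCC optimize ("O0")

/* Compiled as if -O0 had been given, so the loop is kept as written. */
int unoptimized_sum(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}

/* Restore the settings that were saved by push_options. */
#pragma GCC pop_options

/* Defined after pop_options, so it uses the -O level from the command line. */
int optimized_sum(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
```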
Still, I prefer to mark the function as unoptimized right at its declaration:
void func() __attribute__((optimize("O0")));

void func()
{
    // ...
}
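If the same source also has to build with Clang, which does not accept the `optimize` attribute but offers `optnone`, the attribute can be hidden behind a macro. This is only a sketch: `NO_OPT` and `checksum` are hypothetical names, not part of any library. Here the attribute is written in front of the definition, a position that works for declarations as well. Note that the GCC manual describes the `optimize` attribute as a debugging aid rather than something for production code.

```c
/* NO_OPT: hypothetical helper macro that picks the per-function
 * "do not optimize" attribute for the compiler in use. */
#if defined(__clang__)
#  define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#  define NO_OPT __attribute__((optimize("O0")))
#else
#  define NO_OPT /* unknown compiler: expands to nothing */
#endif

/* Sums the bytes of buf; the attribute asks the compiler to leave
 * this one function unoptimized regardless of the -O level. */
NO_OPT int checksum(const unsigned char *buf, int len)
{
    int sum = 0;
    for (int i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
```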