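CSAPP Bomb Lab Walkthrough
Bomb Lab is the second lab that accompanies CSAPP. It provides a bomb binary and a bomb.c source file. The goal is to run bomb and enter the correct string at each prompt until the whole defusing sequence is complete. Because the source file contains only part of the code, we have to disassemble the binary with a tool such as objdump, analyze the assembly, and work out every string that defuses the bomb.
Before debugging, set a breakpoint at the address of explode_bomb: break *0x804986d.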
Phase_1
Disassembly:
Dump of assembler code for function phase_1:
0x08049045 <+0>: sub $0x14,%esp
0x08049048 <+3>: push $0x8049dac
0x0804904d <+8>: pushl 0x1c(%esp)
0x08049051 <+12>: call 0x8049474 <strings_not_equal>
0x08049056 <+17>: add $0x10,%esp
0x08049059 <+20>: test %eax,%eax
0x0804905b <+22>: je 0x804906a <phase_1+37>
0x0804905d <+24>: sub $0xc,%esp
0x08049060 <+27>: push $0x1
0x08049062 <+29>: call 0x804986d <explode_bomb>
0x08049067 <+34>: add $0x10,%esp
0x0804906a <+37>: add $0xc,%esp
0x0804906d <+40>: ret
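Analysis
The name strings_not_equal suggests a function that compares two strings, returning 0 if they are equal and a nonzero value if they differ. The return value is then tested: if it is zero (the strings are equal), execution jumps over explode_bomb; otherwise (the strings differ), explode_bomb is called and the bomb goes off.
The code shows that the arguments to strings_not_equal are the value at %esp+0x1c and the address 0x8049dac. The value at %esp+0x1c is the pointer to our input, so as long as the input equals the string stored at 0x8049dac, the bomb does not explode.
Look up the contents of 0x8049dac:
(gdb) x/sb 0x8049dac
0x8049dac: "Public speaking is very easy."
This gives the string needed to defuse phase 1. Entering it produces: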
The bomb phase 1 is defused. Congratulations!
Phase_2
Disassembly:
Dump of assembler code for function phase_2:
0x0804906e <+0>: push %esi
0x0804906f <+1>: push %ebx
0x08049070 <+2>: sub $0x2c,%esp
0x08049073 <+5>: lea 0x10(%esp),%ebx
0x08049077 <+9>: push %ebx
0x08049078 <+10>: pushl 0x3c(%esp)
0x0804907c <+14>: call 0x8049892 <read_six_numbers>
0x08049081 <+19>: lea 0x2c(%esp),%esi
0x08049085 <+23>: add $0x10,%esp
0x08049088 <+26>: mov (%ebx),%eax
0x0804908a <+28>: add $0x5,%eax
0x0804908d <+31>: cmp %eax,0x4(%ebx)
0x08049090 <+34>: je 0x804909f <phase_2+49>
0x08049092 <+36>: sub $0xc,%esp
0x08049095 <+39>: push $0x2
0x08049097 <+41>: call 0x804986d <explode_bomb>
0x0804909c <+46>: add $0x10,%esp
0x0804909f <+49>: add $0x4,%ebx
0x080490a2 <+52>: cmp %esi,%ebx
0x080490a4 <+54>: jne 0x8049088 <phase_2+26>
0x080490a6 <+56>: add $0x24,%esp
0x080490a9 <+59>: pop %ebx
0x080490aa <+60>: pop %esi
0x080490ab <+61>: ret
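Analysis
The call to read_six_numbers suggests that the string for phase 2 consists of six numbers. The stack layout at the call is shown in the figure below.
After the call returns, esi is set to point at esp-8 in the figure (-0x34+0x2c=-0x8). Inspecting read_six_numbers shows that it does not change ebx, so ebx still points at esp-0x1c in the figure. That gives ebx+0x14=esi, and 0x14=20=4×5. Next comes operation A (the code starting at 0x08049088, in the case where the bomb does not explode), after which ebx is increased by 4 and compared with esi: if they differ, operation A runs again; if they are equal, the function ends. This is a loop that runs operation A five times.
Operation A compares [ebx]+5 with [ebx+4]. If they are equal, ebx is increased by 4 and the loop continues; if not, explode_bomb is called. Since this runs five times, six adjacent elements take part, and these are the six input numbers. Each number must be exactly 5 greater than the one before it. (Whether this reads as "plus 5" or "minus 5" in input order depends on the order in which read_six_numbers stores the values; here it is plus 5. Trying one sequence settles it: if increasing by 5 fails, the order is decreasing by 5.)
Input: 0 5 10 15 20 25 or 1 6 11 16 21 26. Both pass: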
The bomb phase 1 is defused. Congratulations!
The bomb phase 2 is defused. Congratulations!
Phase_3
Disassembly:
Dump of assembler code for function phase_3:
0x080490ac <+0>: sub $0x1c,%esp
0x080490af <+3>: lea 0x8(%esp),%eax
0x080490b3 <+7>: push %eax
0x080490b4 <+8>: lea 0x10(%esp),%eax
0x080490b8 <+12>: push %eax
0x080490b9 <+13>: push $0x804a0cf
0x080490be <+18>: pushl 0x2c(%esp)
0x080490c2 <+22>: call 0x8048950 <__isoc99_sscanf@plt>
0x080490c7 <+27>: add $0x10,%esp
0x080490ca <+30>: cmp $0x1,%eax
0x080490cd <+33>: jg 0x80490dc <phase_3+48>
0x080490cf <+35>: sub $0xc,%esp
0x080490d2 <+38>: push $0x3
0x080490d4 <+40>: call 0x804986d <explode_bomb>
0x080490d9 <+45>: add $0x10,%esp
0x080490dc <+48>: cmpl $0x7,0xc(%esp)
0x080490e1 <+53>: ja 0x8049147 <phase_3+155>
0x080490e3 <+55>: mov 0xc(%esp),%eax
0x080490e7 <+59>: jmp *0x8049de0(,%eax,4)
0x080490ee <+66>: mov $0x2f2,%eax
0x080490f3 <+71>: jmp 0x80490fa <phase_3+78>
0x080490f5 <+73>: mov $0x0,%eax
0x080490fa <+78>: sub $0x1ce,%eax
0x080490ff <+83>: jmp 0x8049106 <phase_3+90>
0x08049101 <+85>: mov $0x0,%eax
0x08049106 <+90>: add $0x156,%eax
0x0804910b <+95>: jmp 0x8049112 <phase_3+102>
0x0804910d <+97>: mov $0x0,%eax
0x08049112 <+102>: sub $0x32c,%eax
0x08049117 <+107>: jmp 0x804911e <phase_3+114>
0x08049119 <+109>: mov $0x0,%eax
0x0804911e <+114>: add $0x238,%eax
0x08049123 <+119>: jmp 0x804912a <phase_3+126>
0x08049125 <+121>: mov $0x0,%eax
0x0804912a <+126>: sub $0x39b,%eax
0x0804912f <+131>: jmp 0x8049136 <phase_3+138>
0x08049131 <+133>: mov $0x0,%eax
0x08049136 <+138>: add $0x39b,%eax
0x0804913b <+143>: jmp 0x8049142 <phase_3+150>
0x0804913d <+145>: mov $0x0,%eax
0x08049142 <+150>: sub $0x73,%eax
0x08049145 <+153>: jmp 0x8049159 <phase_3+173>
0x08049147 <+155>: sub $0xc,%esp
0x0804914a <+158>: push $0x3
0x0804914c <+160>: call 0x804986d <explode_bomb>
0x08049151 <+165>: add $0x10,%esp
0x08049154 <+168>: mov $0x0,%eax
0x08049159 <+173>: cmpl $0x5,0xc(%esp)
0x0804915e <+178>: jg 0x8049166 <phase_3+186>
0x08049160 <+180>: cmp 0x8(%esp),%eax
0x08049164 <+184>: je 0x8049173 <phase_3+199>
0x08049166 <+186>: sub $0xc,%esp
0x08049169 <+189>: push $0x3
0x0804916b <+191>: call 0x804986d <explode_bomb>
0x08049170 <+196>: add $0x10,%esp
0x08049173 <+199>: add $0x1c,%esp
0x08049176 <+202>: ret
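Analysis
The code calls sscanf, which, as the name suggests, parses values out of the input string, but at this point we do not yet know how many values it expects. The next instruction, cmp $0x1,%eax, compares eax (the return value of sscanf, that is, the number of values successfully parsed) with 1. Execution continues if it is greater than 1 and the bomb explodes otherwise, so at least two values must be entered.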
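Enter any two test values, set a breakpoint with break *0x804986d, and run. When the program stops, inspect the second argument of sscanf (the format string):
(gdb) x/sb 0x804a0cf
0x804a0cf: "%d %d"
So the string consists of two numbers, and the call can be reconstructed as sscanf(input, "%d %d", &A, &B); the reconstructed stack layout is shown below.
In the figure, A and B hold the two numbers that were read. Continuing through the code: if A is greater than 7 the bomb explodes (ja is an unsigned comparison, so negative values fail as well), and later, if A is greater than 5, it also explodes. So A must lie between 0 and 5. Depending on A, the code then jumps to the address stored in the table entry at 0x8049de0+4×A.
The contents at 0x8049de0 are: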
(gdb) x/8xw 0x8049de0
0x8049de0: 0x080490ee 0x080490f5 0x08049101 0x0804910d
0x8049df0: 0x08049119 0x08049125 0x08049131 0x0804913d
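These are all addresses of instructions inside phase_3. The code is a C switch statement: different inputs run different instructions. Each case enters a chain of arithmetic on eax at a different point, and the result is then compared with B. When A=0, eax=0x2f2-0x1ce+0x156-0x32c+0x238-0x39b+0x39b-0x73=0x113 (275). The answer to this phase is therefore not unique; one valid string is 0 275. Entering it produces: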
The bomb phase 1 is defused. Congratulations!
The bomb phase 2 is defused. Congratulations!
The bomb phase 3 is defused. Congratulations!
For A=1 to 5 the matching value of B can be computed in the same way; the details are omitted here.
Phase_4
Disassembly:
Dump of assembler code for function phase_4:
0x080491a0 <+0>: sub $0x20,%esp
0x080491a3 <+3>: lea 0x10(%esp),%eax
0x080491a7 <+7>: push %eax
0x080491a8 <+8>: push $0x804a0d2
0x080491ad <+13>: pushl 0x2c(%esp)
0x080491b1 <+17>: call 0x8048950 <__isoc99_sscanf@plt>
0x080491b6 <+22>: add $0x10,%esp
0x080491b9 <+25>: cmp $0x1,%eax
0x080491bc <+28>: jne 0x80491c5 <phase_4+37>
0x080491be <+30>: cmpl $0x0,0xc(%esp)
0x080491c3 <+35>: jg 0x80491d2 <phase_4+50>
0x080491c5 <+37>: sub $0xc,%esp
0x080491c8 <+40>: push $0x4
0x080491ca <+42>: call 0x804986d <explode_bomb>
0x080491cf <+47>: add $0x10,%esp
0x080491d2 <+50>: sub $0xc,%esp
0x080491d5 <+53>: pushl 0x18(%esp)
0x080491d9 <+57>: call 0x8049177 <func4>
0x080491de <+62>: add $0x10,%esp
0x080491e1 <+65>: cmp $0x58980,%eax
0x080491e6 <+70>: je 0x80491f5 <phase_4+85>
0x080491e8 <+72>: sub $0xc,%esp
0x080491eb <+75>: push $0x4
0x080491ed <+77>: call 0x804986d <explode_bomb>
0x080491f2 <+82>: add $0x10,%esp
0x080491f5 <+85>: add $0x1c,%esp
0x080491f8 <+88>: ret
Dump of assembler code for function func4:
0x08049177 <+0>: push %ebx
0x08049178 <+1>: sub $0x8,%esp
0x0804917b <+4>: mov 0x10(%esp),%ebx
0x0804917f <+8>: mov $0x1,%eax
0x08049184 <+13>: cmp $0x1,%ebx
0x08049187 <+16>: jle 0x804919b <func4+36>
0x08049189 <+18>: sub $0xc,%esp
0x0804918c <+21>: lea -0x1(%ebx),%eax
0x0804918f <+24>: push %eax
0x08049190 <+25>: call 0x8049177 <func4>
0x08049195 <+30>: add $0x10,%esp
0x08049198 <+33>: imul %ebx,%eax
0x0804919b <+36>: add $0x8,%esp
0x0804919e <+39>: pop %ebx
0x0804919f <+40>: ret
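Analysis
Following the same approach as before, this string is a single positive integer: sscanf must return exactly 1 and the value must be greater than 0. The program passes the integer to func4, and the phase is defused if the return value equals 0x58980 (362880); otherwise the bomb explodes. The code of func4 can be expressed in C as:
int func4(int n)
{
    int eax = 1;
    if (n <= 1)
        return eax;
    eax = func4(n - 1);
    eax = eax * n;
    return eax;
}
On closer inspection this is a recursive function that computes the factorial of n. 362880 is exactly 9!, so the integer must be 9. Entering it produces the following, and the phase is defused.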
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
The bomb phase 1 is defused. Congratulations!
The bomb phase 2 is defused. Congratulations!
The bomb phase 3 is defused. Congratulations!
The bomb phase 4 is defused. Congratulations!
Phase_5
Disassembly:
Dump of assembler code for function phase_5:
0x080491f9 <+0>: sub $0x1c,%esp
0x080491fc <+3>: lea 0x8(%esp),%eax
0x08049200 <+7>: push %eax
0x08049201 <+8>: lea 0x10(%esp),%eax
0x08049205 <+12>: push %eax
0x08049206 <+13>: push $0x804a0cf
0x0804920b <+18>: pushl 0x2c(%esp)
0x0804920f <+22>: call 0x8048950 <__isoc99_sscanf@plt>
0x08049214 <+27>: add $0x10,%esp
0x08049217 <+30>: cmp $0x1,%eax
0x0804921a <+33>: jg 0x8049229 <phase_5+48>
0x0804921c <+35>: sub $0xc,%esp
0x0804921f <+38>: push $0x5
0x08049221 <+40>: call 0x804986d <explode_bomb>
0x08049226 <+45>: add $0x10,%esp
0x08049229 <+48>: mov 0xc(%esp),%eax
0x0804922d <+52>: and $0xf,%eax
0x08049230 <+55>: mov %eax,0xc(%esp)
0x08049234 <+59>: cmp $0xf,%eax
0x08049237 <+62>: je 0x8049267 <phase_5+110>
0x08049239 <+64>: mov $0x0,%ecx
0x0804923e <+69>: mov $0x0,%edx
0x08049243 <+74>: add $0x1,%edx
0x08049246 <+77>: mov 0x8049e00(,%eax,4),%eax
0x0804924d <+84>: add %eax,%ecx
0x0804924f <+86>: cmp $0xf,%eax
0x08049252 <+89>: jne 0x8049243 <phase_5+74>
0x08049254 <+91>: movl $0xf,0xc(%esp)
0x0804925c <+99>: cmp $0x8,%edx
0x0804925f <+102>: jne 0x8049267 <phase_5+110>
0x08049261 <+104>: cmp 0x8(%esp),%ecx
0x08049265 <+108>: je 0x8049274 <phase_5+123>
0x08049267 <+110>: sub $0xc,%esp
0x0804926a <+113>: push $0x5
0x0804926c <+115>: call 0x804986d <explode_bomb>
0x08049271 <+120>: add $0x10,%esp
0x08049274 <+123>: add $0x1c,%esp
0x08049277 <+126>: ret
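Analysis
As in the earlier analysis, the fifth string is also two numbers, and the call can be reconstructed as
sscanf(input, "%d %d", &A, &B);
The flow is as follows: A=A&0xf, and after this A must not be 0xf. A loop then loads the value at address 0x8049e00+eax×4 into eax and accumulates it with ecx=ecx+eax. The loop must run exactly 8 times, after which eax must be 0xf and ecx must equal B.
The rest is straightforward. First look at the memory at 0x8049e00: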
(gdb) x/16xw 0x8049e00
0x8049e00 <array.2934>: 0x0000000a 0x00000002 0x0000000e 0x00000007
0x8049e10 <array.2934+16>: 0x00000008 0x0000000c 0x0000000f 0x0000000b
0x8049e20 <array.2934+32>: 0x00000000 0x00000004 0x00000001 0x0000000d
0x8049e30 <array.2934+48>: 0x00000003 0x00000009 0x00000006 0x00000005
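The last value of eax is 0xf, so working backwards 8 steps gives the initial eax, which is the lowest hexadecimal digit of the first input number A. The second input number B is the sum of the 2nd through 9th eax values, that is, every value after the initial one.
edx =  8  7  6  5  4  3  2  1  0
eax = 15  6 14  2  1 10  0  8  4
So the first number is A=4+k×16 (k a natural number) and the second is B=15+6+14+2+1+10+0+8=56.
Taking k=1 gives A=20. Entering 20 56 defuses the phase.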
Phase_6
Disassembly:
Dump of assembler code for function phase_6:
0x080492cf <+0>: sub $0x10,%esp
0x080492d2 <+3>: push $0xa
0x080492d4 <+5>: push $0x0
0x080492d6 <+7>: pushl 0x1c(%esp)
0x080492da <+11>: call 0x80489b0 <strtol@plt>
0x080492df <+16>: mov %eax,0x804c174
0x080492e4 <+21>: movl $0x804c174,(%esp)
0x080492eb <+28>: call 0x8049278 <fun6>
0x080492f0 <+33>: add $0x10,%esp
0x080492f3 <+36>: mov $0x8,%edx
0x080492f8 <+41>: mov 0x8(%eax),%eax
0x080492fb <+44>: sub $0x1,%edx
0x080492fe <+47>: jne 0x80492f8 <phase_6+41>
0x08049300 <+49>: mov 0x804c174,%ecx
0x08049306 <+55>: cmp %ecx,(%eax)
0x08049308 <+57>: je 0x8049317 <phase_6+72>
0x0804930a <+59>: sub $0xc,%esp
0x0804930d <+62>: push $0x6
0x0804930f <+64>: call 0x804986d <explode_bomb>
0x08049314 <+69>: add $0x10,%esp
0x08049317 <+72>: add $0xc,%esp
0x0804931a <+75>: ret
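Analysis
Phase 6 converts the input string to a base-10 number with strtol. Following the stack through the call, the value returned by strtol is stored at address 0x804c174, and then the constant 0x804c174 (134529396) itself is written to the top of the stack as the argument to fun6. fun6 therefore appears to operate on a quantity that does not depend on the input, namely the integer 134529396, so its return value eax should be a fixed value. Setting a breakpoint on phase_6 with break *0x080492cf and single-stepping shows that the returned eax is still 0x804c174.
Next comes a loop that loads the value at address eax+8 into eax, repeated 8 times. The value that eax then points to is compared with ecx (the base-10 number converted from the input string), and if they are equal the phase is defused. Setting a breakpoint at cmp %ecx,(%eax) and inspecting the memory that eax finally points to shows 0x5e (94). The string therefore only needs to start with the digits 94, followed by any non-digit character, because strtol stops at the first character that cannot be part of the number. For example, entering 94abc!23423423 produces: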
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
The bomb phase 1 is defused. Congratulations!
The bomb phase 2 is defused. Congratulations!
The bomb phase 3 is defused. Congratulations!
The bomb phase 4 is defused. Congratulations!
The bomb phase 5 is defused. Congratulations!
Congratulations! You’ve defused the bomb!
The bomb phase 6 is defused. Congratulations!
The final contents of PhaseID are as follows (some phases have more than one valid answer):
Public speaking is very easy.
0 5 10 15 20 25
0 275
9
4 56
94abc!23423423