ELF Shared Library Loading Techniques

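A library packages related functions into a single unit. Linux supports two kinds of libraries: static libraries, which are bound into the program at build time, and dynamic libraries, which are bound to the program at run time. Dynamic libraries on Linux use the ELF format and carry the .so suffix.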

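1 Loading

A dynamic library is divided into segments, and segments come in different types:

PT_LOAD segment: contains code or data that must be mapped into memory; each segment has its own access permissions (read, write, execute);
PT_DYNAMIC segment: contains dynamic linking information such as the symbol table, the relocation tables, and the other libraries it references.

Other segment types are not covered here.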
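
To make the two segment types concrete, here is a minimal sketch in C that walks an array of 32-bit program headers, counts the PT_LOAD entries, locates PT_DYNAMIC, and renders a segment's permissions. The types and constants come from the Linux <elf.h> header; the function names scan_segments and phdr_perms are made up for this illustration and are not part of any loader.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Phdr, PT_LOAD, PT_DYNAMIC, PF_R, PF_W, PF_X */
#include <stddef.h>

/* Render p_flags as an "rwx"-style string; out must hold 4 bytes. */
void phdr_perms(const Elf32_Phdr *ph, char out[4])
{
    assert(ph != NULL);
    out[0] = (ph->p_flags & PF_R) ? 'r' : '-';
    out[1] = (ph->p_flags & PF_W) ? 'w' : '-';
    out[2] = (ph->p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}

/* Count the PT_LOAD segments and remember the PT_DYNAMIC header.
 * Returns the number of PT_LOAD entries; *dynamic is set to the
 * PT_DYNAMIC header, or NULL if the table has none. */
size_t scan_segments(const Elf32_Phdr *phdrs, size_t n,
                     const Elf32_Phdr **dynamic)
{
    size_t loads = 0;
    assert(dynamic != NULL);
    *dynamic = NULL;
    for (size_t i = 0; i < n; i++) {
        if (phdrs[i].p_type == PT_LOAD)
            loads++;                 /* must be mapped into memory */
        else if (phdrs[i].p_type == PT_DYNAMIC)
            *dynamic = &phdrs[i];    /* dynamic linking information */
    }
    return loads;
}
```

A real loader would first read the program header table from the file using the offsets in the ELF header; that step is left out here.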

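The loader maps everything from the first PT_LOAD segment through the last PT_LOAD segment into one contiguous range of the address space. The benefit is that the relative address between any two pieces of code or data is fixed. The start of that range is called the base address (see figure).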
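
The contiguous range described above can be computed from the program headers alone. The sketch below is one possible way to do it, assuming 4 KiB pages and 32-bit headers from <elf.h>; load_span and load_bias are hypothetical helper names.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u   /* assumption: 4 KiB pages */

/* Compute the page-aligned range [*start, *end) that covers every PT_LOAD
 * segment. Returns 0 on success, -1 if there is no PT_LOAD segment. */
int load_span(const Elf32_Phdr *phdrs, size_t n,
              uint32_t *start, uint32_t *end)
{
    uint32_t lo = UINT32_MAX, hi = 0;
    int found = 0;
    assert(start != NULL && end != NULL);
    for (size_t i = 0; i < n; i++) {
        if (phdrs[i].p_type != PT_LOAD)
            continue;
        found = 1;
        if (phdrs[i].p_vaddr < lo)
            lo = phdrs[i].p_vaddr;
        if (phdrs[i].p_vaddr + phdrs[i].p_memsz > hi)
            hi = phdrs[i].p_vaddr + phdrs[i].p_memsz;
    }
    if (!found)
        return -1;
    *start = lo & ~(PAGE_SIZE_ASSUMED - 1u);                            /* round down */
    *end = (hi + PAGE_SIZE_ASSUMED - 1u) & ~(PAGE_SIZE_ASSUMED - 1u);   /* round up */
    return 0;
}

/* Value to add to any link-time address once the span has been mapped at
 * mapped_start. When the span starts at 0, this equals the base address. */
uint32_t load_bias(uint32_t mapped_start, uint32_t span_start)
{
    return mapped_start - span_start;
}
```

Because the whole span is reserved in one piece, a single bias converts every link-time address in the library into a run-time address, which is what keeps relative addresses fixed.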

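Loading a library only maps the file contents to memory addresses; no file data is actually read at that point. The operating system reads the corresponding file data into memory only when a page fault occurs. Deferring the reads in this way speeds up library loading.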
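
The mapping step can be illustrated with the POSIX mmap call, which sets up the mapping without reading the file; pages are brought in when they are first touched. This is only a sketch of the idea, not how any particular loader is written, and map_whole_file is a name invented for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only and private. Returns the mapping and stores
 * its size in *length, or returns NULL on failure. No file data is read
 * here; each page is read by the kernel on the first access to it. */
const void *map_whole_file(const char *path, size_t *length)
{
    assert(path != NULL && length != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);   /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *length = (size_t)st.st_size;
    return p;
}
```

Reading the first bytes of the returned mapping (for an ELF file, 0x7f 'E' 'L' 'F') is what triggers the first page fault and the first actual read from disk.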

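1.1 Prelinking

In general the base address of the mapping is not fixed. If the library was processed with prelink, however, it is mapped at a predetermined address that is stored in the file. If that address range is already occupied, loading fails (this is how the Android linker behaves; other loaders may differ). The benefit of prelink is that it simplifies relocation and speeds up loading.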

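2 Relocation

2.1 Internal Functions and Variables

Without prelink, the base address of a library is not fixed (it is known only at run time), so the absolute addresses of its global variables and functions are not fixed either. Because the relative addresses of code and data are fixed once the library is loaded (see the previous section), some architectures (such as x86) can access global variables and functions through relative addressing. On ARM, instructions are limited to 32 bits and cannot encode a large offset directly (although the offset can be supplied through a register). In addition, absolute addressing executes more efficiently than relative addressing, so relocation is still needed.

Consider this example: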

__attribute__((visibility("hidden"))) int errBase = 1;
void setErr() { errBase = 0x999; }

Compile it into a .so, then disassemble it (ARM):

$ gcc -shared -nostdlib -o libtest.so test.c
$ objdump -d libtest.so
000002c4 <setErr>:
2c4:  mov    ip, sp
2c8:  push   {fp, ip, lr, pc}
2cc:  sub    fp, ip, #4
2d0:  ldr    r2, [pc, #12]     ; r2 = &errBase (loaded from 0x2e4)
2d4:  mov    r3, #2448         ; r3 = 0x990
2d8:  add    r3, r3, #9        ; r3 += 9, giving 0x999
2dc:  str    r3, [r2]          ; *r2 = r3
2e0:  ldm    sp, {fp, sp, pc}
2e4:  .word  0x0000109c        ; holds the address of errBase

Inspect the relocation table:

$ readelf -r libtest.so
Relocation section '.rel.dyn' at offset 0x2bc contains 1 entries:
Offset    Info       Type            Sym.Value  Sym. Name
000002e4  00000017   R_ARM_RELATIVE

Comparing the disassembly with the relocation table, 0x2e4 is exactly the offset of the word that holds the address of errBase.
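
The link between ldr r2, [pc, #12] at 0x2d0 and the word at 0x2e4 follows from how pc reads in ARM (A32) state: it yields the address of the current instruction plus 8. A small sketch of that arithmetic, with arm_pc_relative as a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Address referenced by a pc-relative operand in ARM (A32) state, where
 * reading pc yields the current instruction's address plus 8. */
uint32_t arm_pc_relative(uint32_t insn_addr, uint32_t imm)
{
    assert((insn_addr & 3u) == 0);   /* A32 instructions are word-aligned */
    return insn_addr + 8u + imm;
}
```

With the listing above, arm_pc_relative(0x2d0, 12) gives 0x2e4, the offset named in the relocation entry.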

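The relocation table contains one entry of type RELATIVE. Such an entry points at a word that holds the relative address of a variable or function; the loader adds the base address to that word to turn it into an absolute address. If prelink was used, this relocation is not needed.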
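
As a rough sketch of what the loader does for such entries, the function below applies every R_ARM_RELATIVE entry to an in-memory copy of the library. It assumes that the image buffer starts at the library's link-time address 0, so r_offset can be used directly as an index, and that base is the run-time base address; apply_relative is a hypothetical name, not a loader API.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Rel, ELF32_R_TYPE, R_ARM_RELATIVE */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* For each R_ARM_RELATIVE entry, add base to the 32-bit word at r_offset.
 * Returns the number of entries applied. */
size_t apply_relative(uint8_t *image, size_t image_size, uint32_t base,
                      const Elf32_Rel *rels, size_t n)
{
    size_t applied = 0;
    for (size_t i = 0; i < n; i++) {
        if (ELF32_R_TYPE(rels[i].r_info) != R_ARM_RELATIVE)
            continue;
        assert(rels[i].r_offset + sizeof(uint32_t) <= image_size);
        uint32_t word;
        memcpy(&word, image + rels[i].r_offset, sizeof word);
        word += base;   /* relative address becomes absolute */
        memcpy(image + rels[i].r_offset, &word, sizeof word);
        applied++;
    }
    return applied;
}
```

For the entry shown above (offset 0x2e4, info 0x17) and a base of, say, 0x40000000, the word 0x0000109c becomes 0x4000109c.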

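2.2 External Functions and Variables

External variables and functions are those that the target library references from the libraries it depends on. The loader looks up each name in the dependency's symbol table to obtain its absolute address, then writes that address into the target library's Global Offset Table (GOT). The target library accesses external variables and functions through the GOT.

Relocating an external variable corresponds to an entry of type GLOB_DAT, and relocating an external function corresponds to an entry of type JMP_SLOT. The value of the entry's slot is the absolute address of the external variable or function, and it is set by the loader.

Consider this example: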

extern int errBase;
void setErr(){ errBase = 0x999; }

Compile it into a .so, then disassemble it (ARM):

$ gcc -shared -nostdlib -o libtest.so test.c
$ objdump -d libtest.so
00000218 <setErr>:
218:  push   {fp}
21c:  add    fp, sp, #0
220:  ldr    r3, [pc, #28]   ; r3 = GOT offset
224:  add    r3, pc, r3      ; r3 = GOT address
228:  ldr    r2, [pc, #24]   ; r2 = offset of the errBase entry within the GOT
22c:  ldr    r3, [r3, r2]    ; r3 = address of errBase
230:  ldr    r2, [pc, #20]   ; r2 = 0x999
234:  str    r2, [r3]        ; *r3 = r2
238:  add    sp, fp, #0
23c:  pop    {fp}
240:  bx     lr
244:  .word  0x00008dc4      ; GOT offset
248:  .word  0x0000000c      ; offset of errBase within the GOT
24c:  .word  0x00000999

Inspect the relocation table:

$ readelf -r libtest.so
Relocation section '.rel.dyn' at offset 0x210 contains 1 entries:
Offset      Info        Type              Sym.Value    Sym.Name
00008ffc    00000415    R_ARM_GLOB_DAT    00000000     errBase

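0x8ffc is exactly the address of the GOT entry for errBase: the add at 0x224 reads pc as 0x22c, so the GOT address is 0x22c + 0x8dc4 = 0x8ff0, and adding the entry offset 0xc gives 0x8ffc.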
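
The lookup-and-write step for GLOB_DAT and JMP_SLOT entries might be sketched as follows. To stay short, the example replaces the real dynamic symbol and string tables with two simplified stand-ins that are assumptions of this sketch: sym_names maps a symbol index to its name, and struct export_sym lists the names and absolute addresses a dependency exports. The image buffer is assumed to start at link-time address 0. None of these names belong to a real loader.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Rel, ELF32_R_TYPE, ELF32_R_SYM, R_ARM_* */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of one exported symbol of a dependency. */
struct export_sym {
    const char *name;
    uint32_t addr;   /* absolute run-time address */
};

/* Linear search with string comparison, as described in the text. */
const struct export_sym *find_export(const struct export_sym *tab, size_t n,
                                     const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Resolve one GLOB_DAT or JUMP_SLOT entry: find the symbol by name and
 * store its absolute address in the GOT slot at r_offset.
 * Returns 0 on success, -1 on any failure. */
int bind_got_slot(uint8_t *image, size_t image_size, const Elf32_Rel *rel,
                  const char *const *sym_names, size_t n_names,
                  const struct export_sym *exports, size_t n_exports)
{
    assert(image != NULL && rel != NULL);
    uint32_t type = ELF32_R_TYPE(rel->r_info);
    uint32_t sym = ELF32_R_SYM(rel->r_info);
    if (type != R_ARM_GLOB_DAT && type != R_ARM_JUMP_SLOT)
        return -1;
    if (sym >= n_names || rel->r_offset + sizeof(uint32_t) > image_size)
        return -1;
    const struct export_sym *hit = find_export(exports, n_exports, sym_names[sym]);
    if (hit == NULL)
        return -1;   /* unresolved symbol */
    memcpy(image + rel->r_offset, &hit->addr, sizeof hit->addr);
    return 0;
}
```

For the entry above, info 0x415 decodes to symbol index 4 and type 0x15 (R_ARM_GLOB_DAT), so the address found for errBase is written to the slot at 0x8ffc.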

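2.3 Lazy Binding

Relocating external functions and variables requires searching the dependencies' symbol tables and comparing strings, which is relatively slow, although a library usually does not use very many external variables and functions. When many external functions are used, the Procedure Linkage Table (PLT) can speed up loading by deferring the resolution of each external function until it is first called (this is called lazy binding). Lazy binding requires extra code to be generated for each such function call, so it is implemented mainly at build time by the compiler toolchain.

Consider this example: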

void printf1(const char*, ...);
void setErr(){ printf1("setErr\n"); }

The corresponding assembly (x86-64):

4c0 <printf1@plt>:
4c0:  jmpq   *0x200b3a(%rip)
4c6:  pushq  $0x0
4cb:  jmpq   4b0 <_init+0x18>
5ac <setErr>:
5ac:  push   %rbp
5ad:  mov    %rsp,%rbp
5b0:  lea    0x5f(%rip),%rdi
5b7:  mov    $0x0,%eax
5bc:  callq  4c0
5c1:  pop    %rbp
5c2:  retq

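A call to printf1 goes to printf1@plt, which jumps to *0x200b3a(%rip), that is, to the address stored at (base + 0x201000). Here %rip holds the address of the next instruction, 0x4c6, and 0x4c6 + 0x200b3a = 0x201000.

On the first call, the word at (base + 0x201000) holds (base + 0x4c6), so execution falls through to the instructions that follow, and they perform the binding. The corresponding relocation entry is: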

$ readelf -r libtest.so
Relocation section '.rela.plt' at offset 0x468 contains 2 entries:
201000 000300000007 R_X86_64_JUMP_SLO  printf1 + 0

After binding, the word at (base + 0x201000) holds the address of printf1, so the next time printf1@plt is entered it jumps straight to printf1.
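
The control flow of lazy binding can be imitated in plain C with a function pointer standing in for the GOT slot. This is only a model of the mechanism under simplifying assumptions: a real PLT passes a relocation index to the dynamic linker's resolver, whereas here the "resolver" already knows its target. All names (got_slot, lazy_stub, call_via_plt, real_target, resolve_count) are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

int resolve_count = 0;   /* how many times the slow resolution path ran */

/* Stands in for the real external function (printf1 in the text). */
int real_target(int x)
{
    return x + 1;
}

int lazy_stub(int x);

/* Stands in for the GOT slot at base + 0x201000. Before binding it points
 * back into the stub path, like the initial value base + 0x4c6. */
func_t got_slot = lazy_stub;

/* Stands in for "pushq $0x0; jmpq <resolver>": look up the target,
 * patch the slot, then continue into the target. */
int lazy_stub(int x)
{
    resolve_count++;
    got_slot = real_target;   /* bind: later calls skip this path */
    return got_slot(x);
}

/* Stands in for printf1@plt: an indirect jump through the slot. */
int call_via_plt(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}
```

The first call_via_plt goes through lazy_stub and patches the slot; every later call reaches real_target directly, so the lookup cost is paid once per function and only for functions that are actually called.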

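2.4 Position-Independent Code

In general, once the code and read-only data of a program or dynamic library are loaded into memory, they can be shared among multiple processes, whereas data that has been written (dirty data) cannot. A RELATIVE relocation modifies a variable address stored in the code segment, which dirties the code segment so that it can no longer be shared among processes. To keep a dynamic library's code segment shareable, the compiler can be asked to generate position-independent code (PIC), which accesses variables and functions through the GOT.

PIC lets the code segment be shared among processes and thus saves memory, but access through the GOT is slightly slower than relative addressing. If sharing is not needed, PIC can be left off.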