Read through the article first (with the help of translation), then summarize each section
Abstract
Use-after-free vulnerabilities remain difficult to detect and mitigate, making them a popular source of exploitation. Existing solutions incur impractical performance/memory overhead, require specialized hardware, and/or guarantee only protection, but not detection.
In this paper, we propose DangZero, a new solution to detect use-after-free vulnerabilities as they occur. DangZero builds on a traditional page protection and aliasing scheme, where objects are made inaccessible after a free, and subsequent accesses are immediately detected. In contrast to prior solutions using alias-based detection, DangZero relies on direct page table access in ring 0 to provide a much more efficient implementation.
The key idea is that, by giving the program's allocator direct access to the page tables, we can efficiently manage and invalidate vulnerable objects. To safely implement this, we build upon a unikernel-like design, where virtualization provides ring-0 (guest-mode) access, isolation, as well as compatibility with existing Linux programs. Moreover, we show direct page table access serves as an efficient building block for garbage collection-style alias reclaiming. Doing so provides the ability to safely reuse freed areas and address the scalability issues plaguing state-of-the-art alias-based solutions.
Our experimental results confirm that DangZero provides accurate detection guarantees with significantly lower overhead than competing state-of-the-art solutions (e.g., 18% saturated throughput degradation on long-running programs such as the Nginx web server).
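Summary
- DangZero's allocator accesses the page tables directly, which lets it invalidate objects once they are freed and detect any later access to them
- Direct page table access also makes garbage collection-style alias reclaiming efficient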
Introduction
Temporal memory errors remain an important concern in the protection of computer systems against bugs and exploits. Use-after-free (UAF) bugs were ranked #7 in the CWE top 25 of the most common and impactful issues in software [40]. Additionally, Microsoft reports that UAF bugs are the second most common root cause of vulnerabilities and continue to be a preferred target for exploitation [39]. Approaches to defend against such threats can be classified as offering immediate detection or (merely) protection against exploitation. Providing detection of bugs is important in both offline (e.g., testing) and online (e.g., sampling [51]) deployment scenarios, as well as for bug triaging. Unfortunately, existing solutions in either category are problematic.
Guaranteeing UAF protection is typically more efficient than immediate detection, and existing protection systems attempt to minimize their performance impact by means of a variety of techniques: type-safe memory reuse [5, 52] (which, however, can only preserve type safety), reference counting [50] (which, however, is not applicable to arbitrary C/C++ programs), one-time allocation [54] (which, however, cannot bound memory usage), and garbage collection-style (GC) solutions [4, 19, 23, 34]. While GC-style solutions have been gaining momentum for their reported efficiency, recent studies evidence nontrivial, fundamental costs with GC-style techniques, often hiding behind concurrency and generous provisioning of memory/computational power [14]. Further drawbacks are that many solutions cannot protect against exploits that do not rely on memory reuse [5], while most of the compiler-based solutions (with exceptions [4, 5, 19, 54]) cannot handle unmodified binaries. Most importantly, none of the solutions in this category can provide strong UAF detection guarantees.
Most UAF detection-focused systems rely on compiler instrumentation to track and invalidate pointers to freed objects [30, 49, 53, 55]. Despite dedicated optimizations [53], such solutions still incur nontrivial performance overhead. Less costly solutions rely on special hardware support [22, 57] (limiting deployability) or on object IDs [9, 13, 15, 22, 25, 41] (or poison values [47]) to detect UAFs (only) until a predetermined number of memory reuse events occurs (limiting security guarantees). Here also, most compiler-based solutions cannot handle unmodified binaries.
Nonetheless, binary-compatible UAF detection systems are described in literature [17, 18]. Such solutions create a new virtual page (alias) for each memory allocation and map it to the same physical page as the original object. As a result, every object receives a unique (unused) pointer, and the object (and its pointers) can easily be invalidated upon free by revoking the page mapping. Unfortunately, such alias-based solutions rely on the kernel for page protection and aliasing, and incur high overhead due to the extra syscalls and kernel administration costs. Moreover, state-of-the-art solutions [17] still suffer from impractical scalability issues due to virtual memory address space exhaustion: as we will show, this occurs in a matter of days on a heavily loaded web server.
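The alias scheme described above can be illustrated with a toy model. This is only a sketch for these notes, not code from DangZero or from the systems in [17, 18]: the class `ToyAliasAllocator`, the exception `UseAfterFree`, and the dictionary standing in for a page table are all hypothetical names. Objects are packed into shared physical pages, each allocation gets its own fresh virtual page, and `free` revokes that mapping so later accesses are detected.

```python
PAGE_SIZE = 4096


class UseAfterFree(Exception):
    """Raised when an access goes through a revoked (unmapped) alias."""


class ToyAliasAllocator:
    """Toy model: every allocation gets its own virtual page (alias)."""

    def __init__(self):
        self.page_table = {}                     # alias page number -> physical page index
        self.physical = [bytearray(PAGE_SIZE)]   # simulated physical pages
        self.bump = 0                            # next free offset in the last physical page
        self.next_vpn = 1                        # aliases are never reused in this model

    def malloc(self, size):
        """Pack the object into a physical page and return a unique alias address."""
        assert 0 < size <= PAGE_SIZE
        if self.bump + size > PAGE_SIZE:
            self.physical.append(bytearray(PAGE_SIZE))
            self.bump = 0
        offset = self.bump
        self.bump += size
        vpn = self.next_vpn
        self.next_vpn += 1
        self.page_table[vpn] = len(self.physical) - 1
        return vpn * PAGE_SIZE + offset

    def free(self, addr):
        """Revoke the alias mapping; the shared physical page stays in use."""
        vpn = addr // PAGE_SIZE
        if vpn not in self.page_table:
            raise UseAfterFree(f"double free of alias {addr:#x}")
        del self.page_table[vpn]

    def _translate(self, addr):
        vpn, offset = divmod(addr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise UseAfterFree(f"access through revoked alias {addr:#x}")
        return self.physical[self.page_table[vpn]], offset

    def store(self, addr, value):
        page, offset = self._translate(addr)
        page[offset] = value

    def load(self, addr):
        page, offset = self._translate(addr)
        return page[offset]
```

In this model two small objects share one physical page but live on different alias pages, so freeing one leaves the other accessible. Because `next_vpn` only grows, the model also shows why alias pages run out without a reclaiming mechanism.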
In this paper, we introduce DangZero, an efficient, scalable, and binary-compatible UAF detection system. The key idea is to rely on direct page table access in ring 0 (i.e., the highest privilege level normally only running OS kernels) to implement a traditional alias-based scheme in a much more efficient way. Drawing inspiration from modern unikernel-like designs [29], DangZero relies on virtualization extensions and a privilege backend such as Kernel Mode Linux (KML) [36] to provide direct access to the page tables. This strategy allows us to transparently run (and isolate) arbitrary user-space programs in ring 0 guest mode, while safely providing them with direct access to their own (guest) page tables.
We show that directly accessing page tables can crucially make alias-based UAF detection systems practical in two ways. First, by granting the program's memory allocator page table access, it can efficiently manage aliases by directly updating page table mappings. Doing so eliminates the need for operating system involvement and the corresponding (syscall and kernel administration) overheads.
Second, page tables already track important metadata about the virtual memory address space of the program and can also accommodate extra application-specific metadata. We use this observation to design an efficient alias reclaiming system and address the virtual memory address space exhaustion issues of prior alias-based solutions [17]. The goal is to allow safe reuse of virtual addresses, once we confirm that dangling pointers to the object (alias) no longer exist. Our design is similar, in spirit, to that of prior GC-style solutions [4, 19], but with two crucial differences. First, DangZero's metadata management is uniquely efficient, since it can piggyback and expand on the metadata already present in the page tables (e.g., the present bit pinpointing the resident pages to scan for dangling pointers). Moreover, since DangZero reclaims virtual aliases rather than objects in physical memory, our reclaiming strategy is not prone to the typical performance/memory tradeoff of GC-style techniques [14]. Indeed, as we shall see, our alias reclaiming strategy is very efficient, allowing DangZero (a detection system) to outperform even state-of-the-art GC-style protection systems [4, 19] on long-running benchmarks (which commonly feature frequent, short-lived allocations), without having to resort to memory overprovisioning or concurrent reclaiming on spare CPU cores.
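The reclaiming goal stated above (reuse an alias only once no dangling pointer to it remains) can be sketched as a conservative scan. This is an assumption-laden toy, not DangZero's actual algorithm: the function name `reclaimable_aliases`, the representation of resident pages as lists of integer words, and the choice to treat any word that falls inside a freed alias page as a pointer are all made up for illustration.

```python
PAGE_SIZE = 4096


def reclaimable_aliases(resident_pages, freed_vpns):
    """Return the freed alias pages that no scanned word still points into.

    resident_pages: dict mapping a page number to the list of words stored there
                    (stand-in for pages whose present bit is set).
    freed_vpns:     set of alias page numbers that were freed but not yet reused.
    """
    still_referenced = set()
    for words in resident_pages.values():
        for word in words:
            vpn = word // PAGE_SIZE
            if vpn in freed_vpns:
                # Conservative: anything that looks like a pointer keeps the alias pinned.
                still_referenced.add(vpn)
    return set(freed_vpns) - still_referenced


# Alias pages 10 and 11 were freed; one resident word still points into page 11.
memory = {1: [0, 11 * PAGE_SIZE + 8], 2: [12345]}
safe_to_reuse = reclaimable_aliases(memory, {10, 11})
```

Here only alias page 10 is safe to hand out again; page 11 stays invalid, so an access through the remaining dangling pointer would still be detected.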
We have evaluated DangZero on standard benchmarks (SPEC CPU 2006 and 2017) and long-running application benchmarks (the Nginx web server in particular). On SPEC CPU 2006, DangZero reported a geomean performance overhead of only 16% (and 22% on SPEC CPU 2017) compared to 40% for the state-of-the-art alias-based UAF detection system [17]. On Nginx, DangZero reported saturated overheads as low as 11-18%, significantly lower than state-of-the-art UAF protection/detection systems, with consistently modest (and bounded) memory overhead.
To summarize, we make the following contributions:
• A new approach to detect use-after-free bugs based on alias allocation with virtualization-based direct page table access.
• A novel solution for alias reclaiming.
• A prototype of DangZero using KML as a privilege backend.
• An evaluation to show that DangZero significantly outperforms prior detection systems and even state-of-the-art GC-style protection systems on long-running benchmarks.
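Summary
- Defenses are classified as either immediate detection or (merely) UAF protection against exploitation
- A page protection and aliasing mechanism already exists (Oscar), but its system call overhead is high, it exhausts the virtual address space easily, and it lacks a good reclaiming mechanism; DangZero improves on it
- User-space programs (running in ring 0 guest mode) can directly access their own (guest) page tables
- The alias reclaiming system allows a virtual address to be reused safely once no dangling pointers to the object (alias) remain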
Background
Use-after-free
Use-after-free (UAF) bugs are temporal memory errors present in unsafe languages such as C and C++, which arise due to heap allocated objects being dereferenced after already being freed. These bugs are possible since (so-called dangling) pointers to freed objects remain intact even if the pointed memory location is no longer valid. Attackers typically exploit UAF bugs and the corresponding dangling pointers by forcing memory reuse after the free, but before the use. However, depending on the allocator design, exploitation without memory reuse (with allocator metadata playing the role of the target object) is possible [5]. Listing 1 shows a trivial example of a UAF bug. The temporal nature of these bugs makes them hard to detect, both visually in the code as well as through program analysis, and many mitigation designs aimed to neutralize UAF bugs suffer from significant (runtime/memory) overhead. In this paper, we show such cost is not fundamental and direct page table access can unlock an efficient and scalable alias-based solution.
Page tables
Page tables are a software-maintained data structure that is used by the memory management unit (MMU) of the CPU to describe how to map virtual to physical memory. On most common architectures, page tables are stored as a hierarchical tree, where certain bits of the virtual address are used to select the entry in the respective level of the page table. A page table entry (PTE) stores the address to the next level of the tree, or (for the last level) the result of the address translation. Additionally, PTEs store a limited number of metadata bits, such as permissions of that mapping and whether the entry is valid ("present"). Finally, each PTE contains a number of bits that are ignored by hardware, and thus can be used by the operating system for additional information. Most 64-bit architectures use 4-level page tables, each table consisting of 512 entries, yielding a 48-bit (256 TB) virtual address space. Some modern CPUs also feature 5-level page tables, but for the remainder of this paper we assume a 4-level page table structure for simplicity. Many different names exist for referring to the different levels of these structures; for this paper we simply refer to them as L4 through L1 (with L4 the root/first table, and L1 the leaves/last level).
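The 48-bit figure follows from four 9-bit table indices (512 entries per table) plus a 12-bit offset within a page. The sketch below assumes 4 KB pages and the x86-64 style bit layout (an assumption, since the paragraph does not name a page size or architecture); the function name `split_virtual_address` is hypothetical.

```python
PAGE_OFFSET_BITS = 12   # assumed 4 KB pages
INDEX_BITS = 9          # 512 entries per table


def split_virtual_address(vaddr):
    """Split a 48-bit virtual address into (L4, L3, L2, L1, offset)."""
    assert 0 <= vaddr < 1 << 48
    mask = (1 << INDEX_BITS) - 1
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    l1 = (vaddr >> 12) & mask   # leaf table index
    l2 = (vaddr >> 21) & mask
    l3 = (vaddr >> 30) & mask
    l4 = (vaddr >> 39) & mask   # root table index
    return l4, l3, l2, l1, offset


# 4 levels * 9 bits + 12 offset bits = 48 bits, i.e. 256 TB of virtual addresses.
ADDRESS_SPACE_BYTES = (512 ** 4) * (1 << PAGE_OFFSET_BITS)
```

For example, an address built as `(1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5` splits into indices 1, 2, 3, 4 and offset 5.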
Typically, each process has its own set of page tables, describing the address space of that process. Linux splits the available address space in half, giving the bottom half to user space and keeping the top half for its own data. This means each user process has 128 TB of virtual addresses available. To request new mappings, or change existing mappings, the process (and its memory allocator) issues system calls such as brk, mmap, and mremap. On top of the page tables, Linux also maintains its own data structures, containing information for each consecutive virtual memory area (VMA).
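The 128 TB figure also bounds how many page-sized aliases a process can ever hand out if aliases are never reused, which is the exhaustion concern raised in the introduction. The back-of-the-envelope sketch below assumes 4 KB pages and one alias page per allocation; the allocation rate of 100,000 per second is a purely hypothetical input, not a measurement from the paper.

```python
USER_SPACE_BYTES = 128 * 2 ** 40   # 128 TB of user-space virtual addresses
PAGE_SIZE = 4096                   # assumed 4 KB pages


def seconds_until_exhaustion(allocs_per_second, pages_per_alloc=1):
    """Time until every user-space page has been used as an alias once."""
    total_pages = USER_SPACE_BYTES // PAGE_SIZE   # 2**35 pages
    return total_pages / (allocs_per_second * pages_per_alloc)


# Hypothetical rate: 100,000 allocations per second, never reusing an alias.
days = seconds_until_exhaustion(100_000) / 86_400
print(f"{days:.1f} days")   # prints: 4.0 days
```

Under these assumed numbers the address space lasts about four days, which matches the order of magnitude ("a matter of days") that the paper attributes to prior alias-based systems on a heavily loaded server.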
When running a virtual machine (VM) using hardware virtualization extensions, there are two levels of page tables: the guest page tables, and the extended page tables (EPT) on the host. The former behave exactly as described above, and give the guest the illusion of running directly on the hardware. The EPT is managed by the hypervisor and is similar to normal page tables, except it translates every guest-physical address to a host-physical address.
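The two translation levels compose: a guest-virtual address is first mapped by the guest page tables to a guest-physical address, which the EPT then maps to a host-physical address. The toy below flattens each level into a single dictionary keyed by page number, which is a simplification made for these notes (real hardware walks multi-level tables, and the guest's own table walk is itself translated through the EPT). All names are hypothetical.

```python
PAGE_SIZE = 4096


def translate(addr, table):
    """Map an address through a flat {page number: page number} table."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise KeyError(f"page fault at {addr:#x}")
    return table[page] * PAGE_SIZE + offset


def guest_virtual_to_host_physical(gva, guest_page_table, ept):
    """Compose guest page tables (managed by the guest) with the EPT (managed by the hypervisor)."""
    gpa = translate(gva, guest_page_table)   # guest-virtual -> guest-physical
    return translate(gpa, ept)               # guest-physical -> host-physical


guest_page_table = {0x10: 0x2}   # guest-virtual page 0x10 -> guest-physical page 0x2
ept = {0x2: 0x7F}                # guest-physical page 0x2 -> host-physical page 0x7F
hpa = guest_virtual_to_host_physical(0x10 * PAGE_SIZE + 0x34, guest_page_table, ept)
```

The point relevant to DangZero is that only the first dictionary is under the guest's control: editing guest page tables changes what the program sees, while the EPT still confines it to the memory the hypervisor granted.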
Access to privileged CPU features
To achieve direct page table access, DangZero requires access to privileged features normally reserved for ring 0. The Dune [6] project presented a practical implementation through the use of a lightweight virtual environment. In particular, the application runs in ring 0 (guest mode) of a specialized "virtual process" environment. This provides the application access to all privileged features (e.g., guest page tables), while still being isolated from the rest of the (host) system by the hypervisor.
Dune used a small library operating system (libOS) running in the guest alongside the application, to manage basic kernel tasks so that unmodified Linux binaries could run. Additionally, a specialized (KVM-based) hypervisor mapped system calls issued by the guest via VM exits to Linux syscalls on the host.
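Note: in the implementation, the libOS presumably corresponds to the trusted directory; other unmodified binaries from the host can be placed in this directory and run from there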
Of similar spirit is the Kernel Mode Linux (KML) [36] project, which allows programs to run in ring 0 alongside the Linux kernel. KML has the advantage of not requiring expensive VM exits for every system call à la Dune. Similar to Dune, KML still requires a virtual environment for isolation, that is, to protect the rest of the system. The resulting design effectively transforms Linux into a libOS and the process into a unikernel, and recent application optimization work has shown KML can be efficiently used as such [29].
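Summary
- Page table entries carry flag bits that describe the entry, such as whether it is present
- With KML, Linux effectively acts as the libOS and the process as a unikernel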
Threat model assumptions
We assume a standard threat model, with an attacker seeking to exploit arbitrary use-after-free vulnerabilities in a victim binary program (written in an unsafe language), for the purpose of information disclosure, privilege escalation, etc. We consider arbitrary use-after-free exploits regardless of whether memory reuse and other exploitation techniques (e.g., memory massaging) are involved. We assume the program is free from other vulnerabilities (e.g., buffer overflows) or otherwise hardened against them with orthogonal mitigations.
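Study notes
"Binary-compatible UAF (use-after-free) detection": the ability to detect UAF bugs in unmodified binary executables. The detection tool or method needs neither the program's source code nor a specially compiled build; it can be applied directly to already compiled binaries. This is useful for analyzing and protecting existing applications, because it does not depend on the source code being available. In the alias-based systems discussed here, this compatibility comes from working at the level of the memory allocator and page mappings rather than from compiler instrumentation; it does not require disassembly or static analysis of the binary.
What is a unikernel?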