Friday, February 20, 2009

Forensic analysis of the Windows registry in memory

1. Introduction

The Windows registry is a hierarchical database used in the Windows family of operating systems to store information that is necessary to configure the system (Microsoft Corporation, 2008). It is used by Windows applications and the OS itself to store all sorts of information, from simple configuration data to sensitive data such as account passwords and encryption keys. As described in Section 2, researchers have found that the registry can also be an important source of forensic evidence when examining Windows systems. Another important yet non-traditional source of forensic data is the contents of volatile memory. By examining the contents of RAM, an investigator can determine a great deal about the state of the machine when the image was collected. Although techniques for analyzing and extracting meaningful information from the raw data found in memory are still relatively new, guidance on the collection of physical memory is now a common part of many forensic best practice documents, such as the NIST Special Publication “Guide to Integrating Forensic Techniques into Incident Response” (Kent et al., 2006).
Our work seeks to bring these two areas of research together by allowing investigators to apply registry analysis techniques to physical memory dumps. We will begin by explaining the structure of the Windows registry as it is represented in memory, and describe techniques for accessing the registry data stored in memory. A prototype implementation of an in-memory registry parser will then be presented, along with some experimental results from several memory images. We will also discuss particular considerations investigators should be aware of when looking at the registry in memory. Finally, we will show that although under normal conditions the stable keys (see Section 3.3 for details on the distinction between stable and volatile keys) recovered from the in-memory copy of the registry are essentially a subset of those found in the on-disk copy, an attacker with access to kernel memory can alter the cached keys and leave those on disk unchanged. The operating system will then make use of the cached data from the registry, and a forensic examination of the disk will not detect the changes. We will show how analyzing the registry in memory can detect this attack.

2. Related work

Over the past several years, it has become increasingly clear that the registry can contain a great deal of information that is of use to forensic examiners. Research has shown that it contains such data as lists of recently run programs (Stevens, 2006), logs of devices recently attached to the system (such as USB keys) (Carvey, 2007), and wireless networks the user connected to (Carvey, 2005a). Registry data, Carvey notes, “provides a wealth of information that the investigator can use to make his case” (Carvey, 2005a).
Meanwhile, starting with the DFRWS 2005 Memory Analysis Challenge (DFRWS, 2005), a great deal of progress has been made towards cataloging the contents of physical memory on Windows systems and documenting how to make use of the information it contains. Using freely available tools, investigators can find disk encryption keys (Walters and Petroni, 2007), list processes and threads (Schuster, 2006), and detect the presence of some techniques used by malicious software such as DLL injection and hiding (Dolan-Gavitt, 2007; Walters, 2006).
Attacks that alter cached data in memory to deceive the operating system while remaining hidden from tools that analyze non-volatile storage are not entirely unheard of. In their 2006 USENIX paper, Petroni et al. (2006) described an attack which modifies SELinux’s access vector cache to grant additional privileges while remaining undetected by traditional kernel integrity protections. More broadly, attacks on cached data fall into the category of in-memory operating system alterations; other examples of this type of attack include Metasploit’s DLL injection payload (Metasploit, 2008), which loads executable code into the address space of an exploited process, and the Slammer worm, which was never written to disk as it propagated from system to system (F-Secure, 2003).
Documentation about the inner workings of the Configuration Manager, the Windows subsystem that manages the registry, is sparse. The most complete reference is Windows Internals, by Russinovich and Solomon (2004), which describes some of the internal mechanisms, but does not provide enough detail, on its own, to allow extraction of the registry from memory. A blog post by Anand (2008) provides some more details, and gives an example of manually translating a cell index into a virtual address using WinDbg. Finally, a series of posts by Dolan-Gavitt (2008a–c) provides lower-level technical details on the internal mechanisms of the Configuration Manager.

3. The registry in memory

3.1. Overview
Although the Windows registry appears as a single hierarchy in tools such as regedit, it is actually made up of a number of different binary files called hives on disk. These files and their relationship to the hierarchy normally seen are described in KB256986 (Microsoft Corporation, 2008). The hive files themselves are broken into fixed-size bins of 0x1000 bytes, and each bin contains variable-length cells, which hold the actual registry data. References in hive files are made by cell index, which is essentially a value that can be used to derive the location of the cell containing the referenced data. As for the structure of the registry data itself, it is generally composed of two distinct data types: key nodes and value data. The structure can be thought of as analogous to a filesystem, where the key nodes play the role of directories and the values act as files.1 One key difference, however, is that data in the registry always has an explicit associated type, whereas data on a filesystem is generally only weakly typed (for example, through a convention such as file extension).
To work with registry data in memory, it is necessary to find out where in memory the hives have been loaded and know how to translate cell indexes to memory addresses. It will also be helpful to understand how the Windows Configuration Manager works with the registry internally, and how we can make use of its data structures to tell us what the operating system itself maintains about the state of the registry.
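
As a rough illustration of the filesystem analogy above, the following Python sketch models key nodes as directories and values as explicitly typed files. The class names (KeyNode, Value), the lookup helper, and the sample key are hypothetical and chosen only for illustration; they say nothing about how a hive is actually laid out on disk or in memory. REG_SZ and REG_DWORD are two of the standard registry value types.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Value:
    """Plays the role of a file, but always carries an explicit type."""
    name: str
    type: str                      # e.g. "REG_SZ" or "REG_DWORD"
    data: Union[str, int, bytes]


@dataclass
class KeyNode:
    """Plays the role of a directory: holds subkeys and values."""
    name: str
    subkeys: Dict[str, "KeyNode"] = field(default_factory=dict)
    values: Dict[str, Value] = field(default_factory=dict)


def lookup(root: KeyNode, path: str) -> KeyNode:
    """Walk a backslash-separated key path, much like resolving a directory path."""
    node = root
    for part in path.split("\\"):
        node = node.subkeys[part]
    return node


# Illustrative tree only; the key and value names are made up for this sketch.
root = KeyNode("ROOT")
services = KeyNode("Services")
example = KeyNode("ExampleDriver")
example.values["Start"] = Value("Start", "REG_DWORD", 3)
example.values["DisplayName"] = Value("DisplayName", "REG_SZ", "Example")
services.subkeys["ExampleDriver"] = example
root.subkeys["Services"] = services
```

Unlike a file, each Value here cannot exist without a type tag, which is the distinction the analogy draws.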

3.2. Locating hives

The Configuration Manager in Windows XP references each hive loaded in memory using the _CMHIVE data structure.2 The _CMHIVE contains several pieces of metadata about the hive, such as its full path, the number of handles to it that are open, and pointers to the other loaded hives on the system (using the standard _LIST_ENTRY data structure used in many Windows kernel structures to form linked lists). It also has another important structure embedded within it, the _HHIVE, which contains the mapping table used to translate cell indexes (more details on this are given in Section 3.3).
Our approach to finding hives in memory has two stages. First, we scan physical memory to find a single hive; this is easily accomplished, as each _HHIVE begins with a constant signature 0xbee0bee0 (a little-endian integer). Furthermore, the structure is allocated from the kernel’s paged pool, and has the pool tag CM10; these two indicators are sufficient to find valid _HHIVEs in all Windows XP images we have examined. Once a single instance has been found, the HiveList member is used to locate the others in memory.3 The pointers to the previous and next hives in the list are virtual addresses in kernel memory space, and must be translated to physical addresses using the page directory of some process.4
In typical Windows XP SP2 memory images, we found 13 hives: the NTUSER and UsrClass hives for the currently logged-on user, the LocalService user, and the NetworkService user (a total of six hives); the template user hive (“default”); the Security Accounts Manager hive (“SAM”); the system hive; the SECURITY hive; the software hive; and, finally, two volatile hives that have no on-disk representation. The two volatile hives deserve some special mention: one, the HARDWARE hive, is generated at boot and provides information on the hardware detected in the system. The other, the REGISTRY hive, contains only two keys, MACHINE and USER, which are used to provide a unified namespace in which to attach all other hives.
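
The two-stage search described in this section can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the prototype's actual code. The function names, the 4-byte alignment filter, and the assumption that the pool tag sits 4 bytes before the start of the structure (the usual position in the 8-byte pool header on 32-bit Windows XP) are ours. The offset of HiveList inside _CMHIVE is left as a parameter to be read from the debug symbols, and read_u32 stands for a hypothetical caller-supplied function that translates a kernel virtual address and returns the 32-bit little-endian value stored there.

```python
import struct

HHIVE_SIGNATURE = 0xBEE0BEE0
SIGNATURE_BYTES = struct.pack("<I", HHIVE_SIGNATURE)  # stored little-endian
POOL_TAG = b"CM10"


def find_hive_candidates(image, require_tag=True, tag_distance=4, alignment=4):
    """Stage 1: return physical offsets of likely _HHIVE structures.

    tag_distance and alignment are assumptions of this sketch, not values
    taken from the paper.
    """
    hits = []
    pos = image.find(SIGNATURE_BYTES)
    while pos != -1:
        aligned = pos % alignment == 0
        tag_start = pos - tag_distance
        tagged = tag_start >= 0 and image[tag_start:tag_start + 4] == POOL_TAG
        if aligned and (tagged or not require_tag):
            hits.append(pos)
        pos = image.find(SIGNATURE_BYTES, pos + 1)
    return hits


def walk_hive_list(read_u32, first_hive_va, hivelist_offset):
    """Stage 2: follow the HiveList forward links starting from one known hive.

    read_u32(va) is a hypothetical helper that handles virtual-to-physical
    translation. hivelist_offset is the offset of HiveList within _CMHIVE
    and must be supplied by the caller.
    """
    hives, seen = [], set()
    entry = first_hive_va + hivelist_offset   # address of this hive's _LIST_ENTRY
    while entry not in seen:
        seen.add(entry)
        candidate = entry - hivelist_offset   # start of the enclosing structure
        # Assumes the _HHIVE (and so its signature) sits at the start of
        # _CMHIVE; entries that fail the check, such as a bare list head,
        # are skipped rather than reported as hives.
        if read_u32(candidate) == HHIVE_SIGNATURE:
            hives.append(candidate)
        entry = read_u32(entry)               # Flink is the first field
    return hives
```

In practice the image would be memory-mapped rather than read into a bytes object, and read_u32 would be built on the page directory of any process, since the kernel portion of the address space is shared.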
1 This analogy is borrowed from Farmer (2007), though it probably does not originate with him.
2 Data structures referenced in this paper can be found in the public debug symbols for Windows XP Service Pack 2 unless otherwise noted, and are viewable with the dt command in WinDbg.
3 This is substantially the same process described by Dolan-Gavitt (2008b).
4 Since the kernel-space portion of virtual address space is the same for all processes, any process will do.
