Deep dive into the Object creation flow in Windows

Handle Table internals.

Caution:

Before I start, I should say that this work is based solely on static analysis using IDA Freeware 9.2 and my own knowledge. So, if you find any errors or gaps, please leave me a comment.

I also need to clarify that pre-operation and post-operation callbacks are not covered in this article because this concept is well described in the official MSDN documentation. All I will say is that pre-operation callbacks are called before the system decides which handle entry to use for the new object handle, and post-operation callbacks are called after the handle entry setup is finished.

Summary:

This is the last part of my reverse engineering research focused on the internals of two Windows kernel executive managers: the Object Manager, which is responsible for managing and tracking different aspects of objects, and the Security Reference Monitor, which is mainly responsible for the access check phase.

In this part I will explain in detail the structure of handle tables, how the system links handles to objects, and how handle table entries are stored. As in the previous parts, all involved routines and data structures are mentioned.

Before you read this part, I encourage you to read part 1, part 2 and part 3.

Introduction:

Let's start by defining some concepts I will be using in this article.

HANDLE: To prevent user-mode callers from accessing kernel memory directly, the Windows kernel implements an indirect method to access objects from user mode. Each process in the system has its own separate set of identifiers, each of which points indirectly to a specific object that resides in the kernel. Those identifiers are called Handles. Handles are declared as the nt!HANDLE datatype, which is just an nt!LPVOID typedef.

Handle Table: To ensure isolation between processes, the system doesn't store all handle entries in one global table. Instead, each process has its own table that stores entries that are valid only in that process's context. These tables are called Handle Tables. They are non-contiguous, multi-level data structures that can store up to 16,777,216 entries. In the next sections you will see exactly why the number is limited to this value. Handle tables are declared as nt!_HANDLE_TABLE, which is considered the header of the table.

High level Overview:

Before we go deeper, I prefer to present a high-level description of the object creation steps.

First, it's necessary to clarify that the system doesn't provide a single generalized routine to create an object. The creation routine depends on the object type, as each type has its own. For example, to create an event object almost all developers use kernelbase!CreateEventW( ) (or its ANSI counterpart kernelbase!CreateEventA( )), which is just a wrapper that calls ntdll!NtCreateEvent( ), which in turn transitions execution to the kernel through a system call. To create an object of a different type you can't use the same routine; you must use that type's creation routine as defined by the system.

Although the creation flow doesn't look generalizable from a user-mode perspective, the operating system kernel defines a common flow used to create any type of object, which I will divide into the following 6 major steps:

1- Object allocation: Objects are simply memory blocks allocated from either the kernel paged pool or the non-paged pool, as indicated by the type they belong to. The main Object Manager routine involved in this step is nt!ObpAllocateObject( ), which verifies some known conditions to determine the presence of each optional header, then calculates the required buffer size to hold the following 3 parts, in this order: the present optional headers, the main object header and finally the object body.

2- Object initialization: After allocating a buffer large enough to hold the object, the system moves on to the initialization step, which is divided into 2 major phases. The first one is the type-specific initialization, which consists of initializing the object body using the corresponding Object Type's initialization routine. For example, to initialize an event object the system uses nt!KeInitializeEvent( ). The next initialization phase is done by the Object Manager, and it consists of constructing a security descriptor and assigning it to the new object.

3- Access check: After the object and its security descriptor are initialized, another kernel executive manager gets involved: the Security Reference Monitor. In this step the system checks whether the caller has the right to access the object in the way it requests. The Security Reference Monitor compares the caller's access token with the security descriptor associated with the object.

4- Optional headers setup: In this step the system fills in the different optional headers associated with the object, each one serving a different purpose. It charges the required pool quotas and the security descriptor's quota to the calling process, initializes the object for exclusive access if requested, inserts a handle count entry in the handle count database associated with the object to track the calling process's handles, and finally, if this is a new object, links it to the list of objects belonging to the same type.

5- Object name lookup: Objects can have a unique path name associated with them. This path must follow a strict convention defined by the operating system; otherwise it's treated as invalid. The name lookup involves parsing each part of the path and checking whether it's a valid child path of its predecessor. This parsing is done either manually or with the help of a special ParseProcedure( ) associated with the Object Type to which the object belongs. This complex part will be described in detail in the next parts.

6- Object handle: After initializing all parts of the object and doing all the necessary checks, the system creates a handle entry and links it to that object. It also ensures that this entry is only valid in the calling process's context (this is achieved through separate handle tables) and that it grants only the requested access rights.

NOTE: In this part, only the last step is covered in detail, the rest will be covered in the next parts.

User Vs Kernel Handles:

Before creating a handle entry and inserting it into a handle table, the system must determine which table to use. To do so, it checks whether the provided handle attributes bitmask includes nt!OBJ_KERNEL_HANDLE. The presence of this attribute directs the system to use the kernel handle table (sometimes called the system handle table), which is declared as nt!ObpKernelHandleTable. Otherwise, the system uses the table associated with the current process, which it finds through the current thread's nt!_ETHREAD::Tcb::ApcState::Process. To get the table's memory address, the system just reads the corresponding nt!_EPROCESS::ObjectTable member. It's worth noting that if the current process is the system process nt!PsInitialSystemProcess, the kernel handle table is used too.

NOTE: The current process is the process to which the current thread is attached. It can be different than the one that actually owns the thread.
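
The selection logic above can be modeled in a few lines. This is only a sketch in Python, not kernel code: the function name and the dictionary used to stand in for nt!_EPROCESS are my own illustration, while the OBJ_KERNEL_HANDLE value (0x200) comes from the public WDK headers.

```python
OBJ_KERNEL_HANDLE = 0x200  # value from the public WDK headers


def select_handle_table(handle_attributes, current_process, system_process, kernel_table):
    """Model of the table choice: returns the table that receives the new entry.

    current_process / system_process are hypothetical stand-ins for _EPROCESS
    (plain dicts with an "ObjectTable" key); kernel_table stands in for
    nt!ObpKernelHandleTable.
    """
    # An explicit kernel handle request always goes to the kernel table.
    if handle_attributes & OBJ_KERNEL_HANDLE:
        return kernel_table
    # The system process uses the kernel table as well.
    if current_process is system_process:
        return kernel_table
    # Everything else lands in the attached process's own table.
    return current_process["ObjectTable"]
```

Note that the model takes the attached process as input, which matches the NOTE above: the process that owns the thread plays no role in the decision.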

Handle Table levels:

After determining the right handle table to use, the system's next step is reserving an entry for the new object handle. But before doing this, the table needs to be in a state where it is safe to access and manipulate. So, to prevent concurrent access, the system first acquires the table's push lock for exclusive use. Then it checks whether there are free entries already, or whether it must insert new entries for the new handle. The nt!_HANDLE_TABLE::FirstFreeHandle member holds the value of the first handle covered by the free entries in the table. If this value equals 0, the system knows that the table is full and that it must enlarge it by inserting new entries if possible. Otherwise, the value of this member is used as is, and its corresponding entry is updated to point to the object.

To insert new entries in the handle table, the system calls nt!ExpAllocateHandleTableEntrySlow( ), a kernel routine defined solely for this purpose. Before I continue, I need to explain handle table levels and how each one of them is structured. A handle table can be at one of three levels defined by the system. To determine the current level, the lowest 2 bits of the nt!_HANDLE_TABLE::TableCode member are checked.

The first case is when these two bits are both unset ((nt!_HANDLE_TABLE::TableCode & 0x3) == 0x0), which indicates that the table is currently a low-level table. Low-level tables are created by a kernel routine named nt!ExpAllocateLowLevelTable( ), which starts by allocating 4096 bytes of paged pool memory. Of course, the allocation can fail if the current process has reached its paged pool quota limit. The allocated pool region is treated as an array of handle table entries, each one declared as nt!_HANDLE_TABLE_ENTRY. The handle table's nt!_HANDLE_TABLE::NextHandleNeedingPool member represents the value of the first handle that doesn't currently have a corresponding allocated entry. This value is used as the starting point to initialize all entries. The only member that is initialized is nt!_HANDLE_TABLE_ENTRY::NextFreeHandleEntry, which is set to the value of the handle that corresponds to the following entry (the last entry's NextFreeHandleEntry member is set to 0x0 to direct the system to create a new table). As I said at the beginning, handle values are incremented by 4, not by 1, so during the initialization loop the system increments the NextFreeHandleEntry value by 4 each time to initialize the entries correctly. Since the system allocates 4096 bytes for a low-level table and each entry occupies 16 bytes, each table can store up to 256 entries (4096 / 16 = 256), and since handle values are incremented by 4, a single low-level table covers a total of 1024 handle values (256 * 4 = 1024).
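
To make the initialization loop concrete, here is a minimal Python model of it. It is a sketch under stated assumptions, not a decompilation: the function name and the dictionary representation of an entry are hypothetical, and the 16-byte entry size is the one implied by the 256-entries-per-4096-bytes figure.

```python
LOW_LEVEL_TABLE_BYTES = 4096
ENTRY_BYTES = 16                      # implied by 4096 bytes / 256 entries
ENTRIES_PER_LOW_LEVEL_TABLE = LOW_LEVEL_TABLE_BYTES // ENTRY_BYTES
HANDLE_VALUE_STEP = 4                 # handle values grow by 4, not by 1


def init_low_level_table(next_handle_needing_pool):
    """Model of the free-list chaining done when a low-level table is created.

    Returns a list of 256 entries; each entry records the handle VALUE of the
    next free entry, and the last one holds 0 to signal "no more free entries".
    """
    entries = []
    for index in range(ENTRIES_PER_LOW_LEVEL_TABLE):
        handle_value = next_handle_needing_pool + index * HANDLE_VALUE_STEP
        entries.append({"NextFreeHandleEntry": handle_value + HANDLE_VALUE_STEP})
    # The final entry terminates the chain.
    entries[-1]["NextFreeHandleEntry"] = 0
    return entries
```

For a table created when NextHandleNeedingPool is 1024, the first entry (handle value 1024) points to 1028, the next one to 1032, and so on until the last entry, which holds 0.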

Handle tables can also be mid-level, which is indicated by the lowest 2 bits of the table code being equal to 1 ((nt!_HANDLE_TABLE::TableCode & 0x3) == 0x1). Mid-level tables are contiguous memory regions that store 8-byte entries, each one being a pointer to a low-level table. To create a mid-level table, the system calls nt!ExpAllocateMidLevelTable( ), which also starts by allocating 4096 bytes of paged pool memory. As with low-level tables, the allocation can fail if the current process has reached its paged pool quota limit. The system doesn't leave the newly allocated region empty; according to my reversing of nt!ExpAllocateMidLevelTable( ), it allocates one low-level table and stores its address in the first entry of the mid-level one. Since the system allocates 4096 bytes for a mid-level table, each one can store up to 512 entries (4096 / 8 = 512), and since each low-level table covers up to 1024 handle values, each mid-level table covers up to 524,288 values in total (512 * 1024 = 524,288).

The last type is the top-level table, which is identified by the lowest 2 bits of the table code being equal to 2 ((nt!_HANDLE_TABLE::TableCode & 0x3) == 0x2). To create such a table, the system calls nt!ExpAllocateTablePagedPool( ), which also allocates paged pool memory. For each table this routine allocates a total of 1024 bytes, and as with a mid-level table, this memory region is divided into 8-byte entries, each one being a pointer to a mid-level table (not a low-level one). Of course, the system doesn't leave the created table empty; according to my reversing, a single mid-level table is created and its address is stored in the first entry. Since the system allocates 1024 bytes for a top-level table, each one stores up to 128 entries (1024 / 8 = 128), and since each mid-level table covers up to 524,288 handle values, each top-level table covers up to 67,108,864 values (524,288 * 128 = 67,108,864).
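
The capacity figures for the three levels, and the 16,777,216-entry limit quoted in the introduction, all follow from the same arithmetic. The short Python check below only restates the numbers given in this article; the constant names are mine.

```python
ENTRY_BYTES = 16          # size of one handle table entry (4096 / 256)
POINTER_BYTES = 8         # size of one mid-level or top-level slot
HANDLE_VALUE_STEP = 4     # handle values grow by 4

# Low-level table: 4096 bytes of handle entries.
ENTRIES_PER_LOW = 4096 // ENTRY_BYTES                       # 256 entries
VALUES_PER_LOW = ENTRIES_PER_LOW * HANDLE_VALUE_STEP        # 1024 handle values

# Mid-level table: 4096 bytes of pointers to low-level tables.
LOWS_PER_MID = 4096 // POINTER_BYTES                        # 512 pointers
VALUES_PER_MID = LOWS_PER_MID * VALUES_PER_LOW              # 524,288 handle values

# Top-level table: 1024 bytes of pointers to mid-level tables.
MIDS_PER_TOP = 1024 // POINTER_BYTES                        # 128 pointers
VALUES_PER_TOP = MIDS_PER_TOP * VALUES_PER_MID              # 67,108,864 handle values

# Each entry consumes 4 handle values, hence the per-table entry limit.
MAX_ENTRIES = VALUES_PER_TOP // HANDLE_VALUE_STEP           # 16,777,216 entries
```

So the limit on entries is simply the 67,108,864 handle values covered by a full top-level table divided by the step of 4 between consecutive handle values.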

You might be wondering what the other bits of the table code are used for. The answer is that the system uses them to store the start address of the memory region allocated for the table, as described in the previous paragraphs (nt!_HANDLE_TABLE::TableCode & 0xFFFFFFFFFFFFFFFC). To clarify, the address stored in these bits depends on the table's current level. For example, if the table is currently a top-level one, the stored address is that of the region allocated for the top-level table itself, not the address of one of the mid-level tables it references.
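
Splitting a table code into its two parts is a pair of mask operations. The sketch below models it in Python; the function name is hypothetical and the sample address in the usage note is made up purely for illustration.

```python
LEVEL_MASK = 0x3
ADDRESS_MASK = 0xFFFFFFFFFFFFFFFC


def decode_table_code(table_code):
    """Split a TableCode value into (level, base_address).

    level 0 = low-level, 1 = mid-level, 2 = top-level; base_address is the
    start of the region allocated for the table at that level.
    """
    level = table_code & LEVEL_MASK
    base_address = table_code & ADDRESS_MASK
    return level, base_address
```

For instance, a made-up table code of 0xFFFF9A0012345001 decodes to level 1 (mid-level) with the mid-level region starting at 0xFFFF9A0012345000.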

Now that the layout of handle tables is clear, let's dive into how nt!ExpAllocateHandleTableEntrySlow( ) performs its work. It starts by checking the table code to determine the current level.

If the table is currently a low-level one, a new mid-level table is allocated following the steps described previously. The system doesn't discard the old entries; instead, the address of the region storing them is set as the first entry in the new mid-level table, and the address of the additional low-level table allocated by the system is set as the second entry. Before returning, the table code must be updated to match the new level, so the system sets the lowest 2 bits of nt!_HANDLE_TABLE::TableCode to 1 to indicate that the handle table has become a mid-level one.

If the table is currently a mid-level one, nt!ExpAllocateHandleTableEntrySlow( ) first checks whether it is full. The decision depends on nt!_HANDLE_TABLE::NextHandleNeedingPool, which, as I said previously, represents the value (not the index) of the first handle that doesn't have a corresponding allocated handle entry. To determine whether a mid-level table is full, the system calculates how many low-level tables it is storing. Since each low-level table covers up to 1024 handle values, the formula {NextHandleNeedingPool / 1024} gives the number of low-level tables currently stored by the table. If this number is less than 512, which is the capacity of a mid-level table, the system simply allocates another low-level table and uses the first free entry to store its address. Otherwise, the system calls nt!ExpAllocateTablePagedPool( ) to create a top-level table, then nt!ExpAllocateMidLevelTable( ) to create an additional mid-level table. If both allocations succeed, the system preserves the old mid-level table, and all low-level tables referenced by it, by setting its address as the first entry in the newly created top-level table. Finally, it sets the address of the new mid-level table as the second entry and updates the table code, setting its lowest 2 bits to 2 to indicate that the handle table has become a top-level one.

The last case that nt!ExpAllocateHandleTableEntrySlow( ) deals with is when the handle table is currently a top-level table. In this case, the first step is checking whether the table is full of mid-level tables. Again, the decision depends on the NextHandleNeedingPool value. The formula {NextHandleNeedingPool / 524,288} gives the total number of mid-level tables. If this number is less than 128, which is the maximum for a top-level table, the next step depends on whether the last mid-level table in the handle table is full. If it is not full, the system allocates only a low-level table and stores its address in the first free entry of that mid-level table. Otherwise, an additional mid-level table is created, and its address is stored in the first free entry of the top-level table. If the number of mid-level tables has reached 128, the system fails the operation immediately.
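
The three cases can be summarized as a small decision function. This Python sketch only models the decision, not the allocations themselves; the function name and the returned strings are hypothetical. One detail is my own assumption: the article doesn't say how the routine tests whether the last mid-level table is full, so the model treats it as full when NextHandleNeedingPool is an exact multiple of 524,288.

```python
VALUES_PER_LOW = 1024        # handle values covered by one low-level table
VALUES_PER_MID = 524288      # handle values covered by one mid-level table
LOWS_PER_MID = 512           # capacity of a mid-level table
MIDS_PER_TOP = 128           # capacity of a top-level table


def plan_growth(level, next_handle_needing_pool):
    """Model of the growth decision for a full handle table."""
    if level == 0:
        # Low-level: wrap the old table and a new one in a mid-level table.
        return "promote to mid-level"
    if level == 1:
        low_tables = next_handle_needing_pool // VALUES_PER_LOW
        if low_tables < LOWS_PER_MID:
            return "add low-level table"
        return "promote to top-level"
    if level == 2:
        mid_tables = next_handle_needing_pool // VALUES_PER_MID
        if mid_tables >= MIDS_PER_TOP:
            return "fail"
        # Assumption: a non-zero remainder means the last mid-level table
        # still has room for another low-level table.
        if next_handle_needing_pool % VALUES_PER_MID != 0:
            return "add low-level table"
        return "add mid-level table"
    raise ValueError("level must be 0, 1 or 2")
```

For example, a mid-level table whose NextHandleNeedingPool has reached 524,288 already holds 512 low-level tables, so the model reports a promotion to top-level.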

Before returning to the caller, nt!ExpAllocateHandleTableEntrySlow( ) updates some members of the handle table. First, it sets nt!_HANDLE_TABLE::FirstFreeHandle to the value of the handle that corresponds to the first entry in the newly allocated region. As I said before, the system always allocates a new low-level table, whether or not it also allocates a higher-level table; the handle value that corresponds to this low-level table's first entry is used as the new FirstFreeHandle. Next, it updates the nt!_HANDLE_TABLE::NextHandleNeedingPool member, incrementing it by 1024, because each call allocates at most one additional low-level table, which covers up to 1024 handle values. Since these values now have a corresponding entry, the first value after the one that corresponds to the last allocated entry is the first one that will need allocation next time. The last member that is updated is nt!_HANDLE_TABLE::LastFreeHandleEntry, which represents the memory address of the last allocated entry.
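
These bookkeeping updates can be modeled as follows. This is a Python sketch with hypothetical names, and it relies on two inferences of mine: that the first handle value of the new low-level table equals the old NextHandleNeedingPool (the value used to initialize that table), and that the "last allocated entry" is the 256th 16-byte entry of the new low-level table.

```python
from dataclasses import dataclass

ENTRY_BYTES = 16
ENTRIES_PER_LOW = 256
VALUES_PER_LOW = 1024


@dataclass
class HandleTableModel:
    """Hypothetical stand-in for the three _HANDLE_TABLE members discussed."""
    first_free_handle: int
    next_handle_needing_pool: int
    last_free_handle_entry: int


def finish_growth(table, new_low_table_address):
    """Model of the member updates done after a new low-level table is added."""
    # The new table's first entry corresponds to the old NextHandleNeedingPool.
    table.first_free_handle = table.next_handle_needing_pool
    # One more low-level table means 1024 more covered handle values.
    table.next_handle_needing_pool += VALUES_PER_LOW
    # Assumption: address of the last entry inside the new low-level table.
    table.last_free_handle_entry = (
        new_low_table_address + (ENTRIES_PER_LOW - 1) * ENTRY_BYTES
    )
```

Starting from NextHandleNeedingPool equal to 1024, one growth step leaves FirstFreeHandle at 1024 and NextHandleNeedingPool at 2048.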

Handle entry setup:

After reserving a handle entry for the new object handle, the system checks whether this task succeeded. If the handle table's nt!_HANDLE_TABLE::FirstFreeHandle member is still 0, the operation is considered failed; the system performs some cleanup using nt!ObpDecrementHandleCount( ), then returns to the caller with nt!STATUS_INSUFFICIENT_RESOURCES. Otherwise, the system uses the FirstFreeHandle member as the handle value and gets its corresponding entry from the handle table. The handle value is simply used as an index, but not as is: because it's a multiple of 4, the system first divides it by 4, then uses the result as an index into the right table. If the table is not a low-level one, the system cannot use the handle value directly as an index; it must first get the index of the low-level table in which the corresponding entry is stored. To get this index, the handle value is divided by 1024, which is the range covered by a single low-level table. Then the system gets the position of the entry within that table using the formula {FirstFreeHandle % 1024} and divides the result by 4 to convert it into a real index. If the handle table is a top-level one, the system applies the same steps with one more level to get the right entry.
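
The index arithmetic can be sketched as below. This Python model follows the simplified formulas from the paragraph above; the function name and the tuple it returns are my own, and the top-level branch is my reading of "the same steps with one more level", so treat it as an assumption rather than the kernel's exact computation.

```python
VALUES_PER_LOW = 1024      # handle values covered by one low-level table
VALUES_PER_MID = 524288    # handle values covered by one mid-level table
HANDLE_VALUE_STEP = 4


def locate_entry(handle_value, level):
    """Return the index path to the entry for handle_value.

    level 0 -> (entry_index,)
    level 1 -> (low_table_index, entry_index)
    level 2 -> (mid_table_index, low_table_index, entry_index)
    """
    entry_index = (handle_value % VALUES_PER_LOW) // HANDLE_VALUE_STEP
    if level == 0:
        return (entry_index,)
    if level == 1:
        return (handle_value // VALUES_PER_LOW, entry_index)
    if level == 2:
        mid_index = handle_value // VALUES_PER_MID
        low_index = (handle_value % VALUES_PER_MID) // VALUES_PER_LOW
        return (mid_index, low_index, entry_index)
    raise ValueError("level must be 0, 1 or 2")
```

In a mid-level table, handle value 1028 resolves to the second low-level table (index 1) and the second entry inside it (index 1).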

After getting the handle entry that corresponds to our handle value, the system updates the handle table's nt!_HANDLE_TABLE::FirstFreeHandle, overwriting it with the entry's nt!_HANDLE_TABLE_ENTRY::NextFreeHandleEntry described earlier. The nt!_HANDLE_TABLE::LastFreeHandleEntry member is also updated if FirstFreeHandle has become 0, which indicates that the table is now full. Finally, nt!_HANDLE_TABLE::HandleCount is incremented.
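
Taken together, FirstFreeHandle and NextFreeHandleEntry behave like a singly linked free list keyed by handle value. The Python sketch below models popping one entry from it; the dictionaries and the function name are hypothetical, and the LastFreeHandleEntry update is left out because the article doesn't state what value is written.

```python
def take_free_entry(table, entries):
    """Pop the first free handle value from the table's free list.

    table   : dict with "FirstFreeHandle" and "HandleCount" keys
    entries : dict mapping a handle value to its entry, where each entry is a
              dict holding "NextFreeHandleEntry"
    Returns the handle value taken, or None when the table is full.
    """
    handle_value = table["FirstFreeHandle"]
    if handle_value == 0:
        # 0 means there is no free entry; the table has to grow first.
        return None
    entry = entries[handle_value]
    # The next free handle becomes the new head of the free list.
    table["FirstFreeHandle"] = entry["NextFreeHandleEntry"]
    table["HandleCount"] += 1
    return handle_value
```

With a free list of 4 then 8, two calls return 4 and 8, and a third call returns None because FirstFreeHandle has dropped to 0.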

The last step is attaching the handle entry to the object so the system knows that this handle value references that object. The system sets the nt!_HANDLE_TABLE_ENTRY::Object member to a combination of the object's main header memory address and the lower 3 bits of the provided handle attributes bitmask (HandleAttribute & 0x7), which masks off all attributes but the nt!_OBJ_INHERIT. The granted access rights determined in the access check phase (detailed in part 2) are also stored in the entry's nt!_HANDLE_TABLE_ENTRY::GrantedAccess member.

NOTE: The formula used to get the index of the handle table entry corresponding to a handle value is simplified here to make it easier to understand. In reality, the system uses a more complicated form of it, but the result and the logic are the same.
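
As an illustration of this final step, here is a Python sketch of how such an entry could be packed. It assumes the "combination" is a bitwise OR and that the object header address has its lowest 3 bits clear so the attribute bits fit there; both are my assumptions, as are the function name and the sample values in the usage note.

```python
HANDLE_ATTRIBUTE_MASK = 0x7  # lower 3 bits of the handle attributes bitmask


def build_entry(object_header_address, handle_attributes, granted_access):
    """Model of the two entry members filled in for a new handle."""
    # Assumption: the header address is aligned, so its low 3 bits are free.
    if object_header_address & HANDLE_ATTRIBUTE_MASK:
        raise ValueError("object header address must have its low 3 bits clear")
    return {
        # Assumption: the address and the attribute bits are OR-ed together.
        "Object": object_header_address | (handle_attributes & HANDLE_ATTRIBUTE_MASK),
        "GrantedAccess": granted_access,
    }
```

For a made-up header address of 0xFFFF800012345670 and attributes 0x202, only the low bits 0x2 survive the mask, so the modeled Object member becomes 0xFFFF800012345672.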

Conclusion:

That is all for this part. I hope you have learned something from this article and its predecessors. This work took me a month to finish, then another month to publish. Stay tuned, other posts about various topics will be published from time to time.

Please check out my LinkedIn and GitHub accounts.
