Windows APC

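This topic is a deep rabbit hole. For my graduation project I wanted to implement APC injection, but the user-mode approach seemed to have some shortcomings. I later heard that a kernel-level form of injection could work around them, but there was very little material online. Since starting work I have gradually become used to reading source code to solve problems, and by chance I learned that part of the Windows kernel source is available. So I decided to use it to study this properly.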


Windows APC

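The Windows APC mechanism touches the Windows kernel directly, which makes it a good entry point for learning how the kernel works.

APC (Asynchronous Procedure Call)

Documentation on APCs is fairly sparse. Microsoft's official explanation is roughly as follows:

An APC is a function that executes asynchronously in the context of a particular thread. When an APC is queued to a thread, the system issues a software interrupt, and the next time the thread is scheduled it runs the APC function. An APC generated by the system is called a kernel-mode APC; an APC generated by an application is called a user-mode APC.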

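APC and threads

Each thread has its own APC queue. An application queues an APC to a thread by calling the QueueUserAPC API, passing the address of the function the thread should call. A thread can only run an APC function once the APC has been placed in its queue.

Relationship to the thread code

At first I assumed APCs were an add-on feature of the operating system, but after reading the kernel source I found that they are tied much more closely to the core of Windows than I expected.
The thread object itself contains several APC-related fields: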

typedef struct _KTHREAD {

union {
KAPC_STATE ApcState;
struct {
UCHAR ApcStateFill[KAPC_STATE_ACTUAL_LENGTH];
BOOLEAN ApcQueueable;
volatile UCHAR NextProcessor;
volatile UCHAR DeferredProcessor;
UCHAR AdjustReason;
SCHAR AdjustIncrement;
};
};

KSPIN_LOCK ApcQueueLock;

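A thread uses its ApcState field to record the process that it is currently attached to. A Windows thread normally runs in the context of its own process, but in certain situations (for example, while a new process is being created) it can attach to another process. For this reason every thread object also holds two pointers to KAPC_STATE structures: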

    PKAPC_STATE ApcStatePointer[2];
union {
KAPC_STATE SavedApcState;
struct {
UCHAR SavedApcStateFill[KAPC_STATE_ACTUAL_LENGTH];
CCHAR FreezeCount;
CCHAR SuspendCount;
UCHAR UserIdealProcessor;
UCHAR CalloutActive;
#if defined(_AMD64_)
BOOLEAN CodePatchInProgress;
#elif defined(_X86_)
UCHAR Iopl;
#else
UCHAR OtherPlatformFill;
#endif

};
};

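Normally, ApcStatePointer[0] points to ApcState (the environment of the thread's own process) and ApcStatePointer[1] points to SavedApcState (the backup environment).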

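When APCs are delivered

After a user-mode APC is queued, the thread does not call it until the thread is in an alertable state. A thread enters an alertable state when it calls SleepEx, SignalObjectAndWait, MsgWaitForMultipleObjectsEx, WaitForMultipleObjectsEx, or WaitForSingleObjectEx with the alertable flag set. Loosely speaking, the thread is parked in a wait, much as if it were suspended. If the wait is satisfied before the APC is queued, the thread is no longer alertable and the APC function is not run at that point. The APC stays in the queue, however, and runs the next time the thread performs an alertable wait.

PS: The completion callbacks of ReadFileEx, SetWaitableTimer, SetWaitableTimerEx, and WriteFileEx are implemented with APCs.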
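
The delivery rule described above can be modelled in portable C++ without touching any Windows API. The following is only a toy sketch under stated assumptions: ToyThread, queue_user_apc and wait are hypothetical names, and the real kernel does this very differently. It shows that queued APCs sit in the queue until an alertable wait, then run in FIFO order, each exactly once.

```cpp
#include <deque>
#include <string>
#include <vector>

// Toy model (hypothetical names): a thread owns a FIFO of user-mode APCs
// that is drained only when the thread performs an alertable wait.
struct ToyThread {
    std::deque<std::string> user_apcs;  // names of pending user APCs
    std::vector<std::string> log;       // APCs that actually ran, in order

    // Stands in for QueueUserAPC: the APC is only queued, never run here.
    void queue_user_apc(const std::string& name) {
        user_apcs.push_back(name);
    }

    // Stands in for SleepEx(..., alertable).
    // Returns true if at least one APC was delivered during the wait.
    bool wait(bool alertable) {
        if (!alertable || user_apcs.empty()) {
            return false;  // non-alertable waits never deliver user APCs
        }
        while (!user_apcs.empty()) {
            std::string apc = user_apcs.front();
            user_apcs.pop_front();  // the APC leaves the queue before it runs
            log.push_back(apc);     // "running" the APC
        }
        return true;
    }
};
```

Queuing two APCs, calling wait(false) and then wait(true) leaves the log empty after the first call and filled in queue order after the second.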

An APC can be queued with the following native function:

NTSTATUS NTAPI NtQueueApcThread(
    _In_ HANDLE ThreadHandle,
    _In_ PVOID ApcRoutine,
    _In_opt_ PVOID ApcArgument1,
    _In_opt_ PVOID ApcArgument2,
    _In_opt_ PVOID ApcArgument3
);

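This function queues a user-mode APC to the thread identified by ThreadHandle. The APC runs once that thread enters an alertable state.

The following example queues user-mode APCs to the current thread: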

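#include <windows.h>
#include <iostream>

// Prototype of the ntdll export; it is not declared in the SDK headers.
typedef LONG (NTAPI *PFN_NtQueueApcThread)(HANDLE, PVOID, PVOID, PVOID, PVOID);

// An APC routine receives the three arguments given to NtQueueApcThread.
VOID NTAPI ApcTest(PVOID, PVOID, PVOID) {
    std::cout << "ApcTest1" << std::endl;
}

VOID NTAPI ApcTest2(PVOID, PVOID, PVOID) {
    std::cout << "ApcTest2" << std::endl;
}

int main() {
    HMODULE hNtdll = GetModuleHandleW(L"ntdll.dll");
    if (hNtdll == NULL) {
        std::cout << "Could not get ntdll" << std::endl;
        return 1;
    }
    // Resolve NtQueueApcThread at run time.
    PFN_NtQueueApcThread NtQueueApcThread =
        (PFN_NtQueueApcThread)GetProcAddress(hNtdll, "NtQueueApcThread");
    if (NtQueueApcThread == NULL) {
        std::cout << "Could not get NtQueueApcThread" << std::endl;
        return 1;
    }
    HANDLE hThread = GetCurrentThread();
    NtQueueApcThread(hThread, (PVOID)&ApcTest, 0, 0, 0);
    std::cout << "Add APC1" << std::endl;
    NtQueueApcThread(hThread, (PVOID)&ApcTest2, 0, 0, 0);
    std::cout << "Add APC2" << std::endl;
    SleepEx(3, TRUE);  // alertable wait: the queued APCs run here
    std::cout << "Check the APC" << std::endl;
    return 0;
}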

The program prints:

Add APC1
Add APC2
ApcTest1
ApcTest2
Check the APC

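When SleepEx is called, the thread enters an alertable state. The kernel then uses KiUserApcDispatcher in ntdll to call the functions we queued, walking the queue in an interrupt-like fashion until every pending user-mode APC has been delivered.
PS: It is not visible in this example, but a user-mode APC is called only once: it is removed from the queue when it is delivered.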

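Internal synchronization (how the official documentation describes APC completion)

When an I/O request is issued, a structure called an I/O request packet (IRP) is allocated. With synchronous I/O, the thread builds the IRP, sends it to the device stack, and waits in the kernel for the IRP to complete. With asynchronous I/O, the thread builds the IRP and sends it to the device stack. The stack might complete the request immediately, or it might return a pending status to indicate that the request is in progress. In that case the IRP is still associated with the thread, so it is cancelled if the thread terminates or calls a function such as CancelIo. Meanwhile, the thread can continue with other work while the device stack processes the IRP.

The system can notify the thread that the IRP has completed in several ways:

  • Update the overlapped structure with the result of the operation, so that the thread can poll for completion.
  • Signal the event in the overlapped structure, so that a thread can wait on the event and be woken when the operation completes.
  • Queue the IRP to the thread's pending APC queue, so that the thread runs the APC routine when it enters an alertable wait and returns from the wait with a status indicating that one or more APC routines ran.
  • Queue the IRP to an I/O completion port, where it is picked up by the next thread that waits on the port.

Threads that wait on an I/O completion port do not wait in an alertable state. If such a thread issues IRPs that are set to complete as APCs, those APC completions will not occur in a timely manner; they occur only when the thread picks up a request from the completion port and then happens to enter an alertable wait.
Note that an I/O completion port is a separate completion mechanism from APCs.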

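How it works

How an APC is queued

User-mode APC calls and Microsoft's documentation alone do not give a clear picture of the whole APC path, so here we use the WRK (Windows Research Kernel) source to analyze it.
We start with the following structure: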

//
//
// Asynchronous Procedure Call (APC) object
//
// N.B. The size of this structure cannot change since it has been exported.
//

typedef struct _KAPC {
UCHAR Type;
UCHAR SpareByte0;
UCHAR Size;
UCHAR SpareByte1;
ULONG SpareLong0;
struct _KTHREAD *Thread;
LIST_ENTRY ApcListEntry;
PKKERNEL_ROUTINE KernelRoutine;
PKRUNDOWN_ROUTINE RundownRoutine;
PKNORMAL_ROUTINE NormalRoutine;
PVOID NormalContext;

//
// N.B. The following two members MUST be together.
//

PVOID SystemArgument1;
PVOID SystemArgument2;
CCHAR ApcStateIndex;
KPROCESSOR_MODE ApcMode;
BOOLEAN Inserted;
} KAPC, *PKAPC, *PRKAPC;

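This is the APC structure from the WRK. The members we care about are:

  • ApcListEntry: the doubly linked list entry that links this APC into an APC queue.
  • KernelRoutine: the function that is called in kernel mode for this APC. Its address is in kernel space.
  • RundownRoutine: the function that is called when the APC is discarded without being delivered (for example, when the thread exits). It also lives in kernel memory.
  • NormalRoutine: the asynchronous function itself. For a user-mode APC its address is in user space; often it is only an entry stub, as discussed below.
  • NormalContext: the context of the APC request. When NormalRoutine is a stub, this is typically the address of the real callback.
  • SystemArgument*: the arguments passed along when the APC is called.
  • ApcStateIndex: the APC environment the APC belongs to, discussed below.
  • ApcMode: the processor mode of the APC, one of the following: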
typedef enum _MODE {
KernelMode,
UserMode,
MaximumMode
} MODE;

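From these members we can see that an APC object has the following basic properties:

  • It has kernel-mode, normal, and rundown callbacks.
  • It stores the context of a request.
  • It records the mode and environment of the request.

By recording a context and the functions to call, an APC lets the operating system remember which code is to be called asynchronously. Given something to call, there must also be a structure that tracks the state of these pending calls and so decides whether a call should happen. That structure is KAPC_STATE: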

// APC state
//
// N.B. The user APC pending field must be the last member of this structure.
//

typedef struct _KAPC_STATE {
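LIST_ENTRY ApcListHead[MaximumMode]; // list heads of the kernel-mode and user-mode APC queues
struct _KPROCESS *Process; // process this APC state belongs to
BOOLEAN KernelApcInProgress; // a normal kernel APC is currently running
BOOLEAN KernelApcPending; // a kernel-mode APC is waiting to be delivered
BOOLEAN UserApcPending; // a user-mode APC is waiting to be delivered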
} KAPC_STATE, *PKAPC_STATE, *PRKAPC_STATE;

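The ApcListHead array of this object holds the APCs queued to the thread, one list per processor mode.
Why does LIST_ENTRY not name the type it links? It is a generic kernel structure that is embedded in the owning object; with the matching macro (CONTAINING_RECORD) a doubly linked list can hold objects of any type.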
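
To make the intrusive-list idea concrete, here is a minimal portable sketch. The helper names mirror the kernel ones, but this is an illustrative re-creation rather than the WRK headers; ToyApc and CONTAINING_RECORD_SKETCH are hypothetical names introduced for the example.

```cpp
#include <cstddef>

// A link that knows nothing about the object it is embedded in.
struct ListEntry {
    ListEntry* Flink;  // next entry
    ListEntry* Blink;  // previous entry
};

// Recover the enclosing object from the address of its embedded link.
#define CONTAINING_RECORD_SKETCH(address, type, field) \
    (reinterpret_cast<type*>(reinterpret_cast<char*>(address) - offsetof(type, field)))

inline void InitializeListHead(ListEntry* head) {
    head->Flink = head->Blink = head;  // an empty list points at itself
}

inline bool IsListEmpty(const ListEntry* head) {
    return head->Flink == head;
}

inline void InsertTailList(ListEntry* head, ListEntry* entry) {
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

inline ListEntry* RemoveHeadList(ListEntry* head) {
    ListEntry* entry = head->Flink;
    head->Flink = entry->Flink;
    entry->Flink->Blink = head;
    return entry;
}

// Hypothetical payload that embeds the link, the way KAPC embeds ApcListEntry.
struct ToyApc {
    int id;
    ListEntry ApcListEntry;
};
```

Because the list only ever sees ListEntry pointers, the same four helpers work for any structure that embeds a ListEntry; the macro is the one place where the payload type is named.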

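As mentioned earlier, a thread can in some situations attach to another process, and an APC request then behaves differently depending on the context. The ApcStateIndex member of KAPC records which environment the APC targets: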

typedef enum _KAPC_ENVIRONMENT {
OriginalApcEnvironment,
AttachedApcEnvironment,
CurrentApcEnvironment,
InsertApcEnvironment
} KAPC_ENVIRONMENT;

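Normally the ApcStateIndex of an APC is one of the first two values. OriginalApcEnvironment means the thread's own process (the one that created it), and AttachedApcEnvironment means the process the thread is attached to. As noted earlier, the _KTHREAD structure contains the following member: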

PKAPC_STATE ApcStatePointer[2];

This array is really indexed as

ApcStatePointer[OriginalApcEnvironment]
ApcStatePointer[AttachedApcEnvironment]

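The two pointers hold the APC state for the thread's own process and for the attached process, and it is through them that the thread keeps the basic information about the process it is in. For example, the API PsGetCurrentProcess, which returns the current process, is implemented as follows: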

#define _PsGetCurrentProcess() (CONTAINING_RECORD(((KeGetCurrentThread())->ApcState.Process),EPROCESS,Pcb))
PEPROCESS
PsGetCurrentProcess(
VOID
)
{
return _PsGetCurrentProcess();
}

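The process is ultimately found through the Process field of the thread's ApcState. When the thread is not attached, ApcStatePointer[OriginalApcEnvironment] points to this ApcState; after an attach, ApcStatePointer[AttachedApcEnvironment] points to it.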
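
The pointer swap can be illustrated with a small model. This is a simplified sketch based on my reading of the attach and detach paths, not a copy of the kernel code: the field names follow KTHREAD, but ToyThread, ToyApcState, attach and detach are hypothetical, the APC lists are omitted, and a process is represented by a plain integer.

```cpp
enum ApcEnvironment { OriginalApcEnvironment = 0, AttachedApcEnvironment = 1 };

struct ToyApcState {
    int Process;  // stand-in for the PKPROCESS pointer
};

struct ToyThread {
    ToyApcState ApcState;       // always describes the process currently in use
    ToyApcState SavedApcState;  // holds the original state while attached
    ToyApcState* ApcStatePointer[2];
    int ApcStateIndex;

    explicit ToyThread(int owner)
        : ApcState{owner},
          SavedApcState{0},
          ApcStatePointer{&ApcState, &SavedApcState},
          ApcStateIndex(OriginalApcEnvironment) {}

    // The object holds pointers into itself, so copying is disabled.
    ToyThread(const ToyThread&) = delete;
    ToyThread& operator=(const ToyThread&) = delete;

    void attach(int target) {
        SavedApcState = ApcState;   // back up the original environment
        ApcState.Process = target;  // ApcState now describes the target process
        ApcStatePointer[OriginalApcEnvironment] = &SavedApcState;
        ApcStatePointer[AttachedApcEnvironment] = &ApcState;
        ApcStateIndex = AttachedApcEnvironment;
    }

    void detach() {
        ApcState = SavedApcState;   // restore the original environment
        ApcStatePointer[OriginalApcEnvironment] = &ApcState;
        ApcStatePointer[AttachedApcEnvironment] = &SavedApcState;
        ApcStateIndex = OriginalApcEnvironment;
    }

    // Same idea as PsGetCurrentProcess: always read ApcState.Process.
    int current_process() const { return ApcState.Process; }
};
```

The point of the design is that code asking for the current process never needs to know whether the thread is attached, while code that queues an APC for a specific environment goes through ApcStatePointer.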

That covers the basic structures. So how is an APC actually queued? We can start from NtQueueApcThread, mentioned above:

NTSYSAPI
NTSTATUS
NTAPI
NtQueueApcThread(
__in HANDLE ThreadHandle,
__in PPS_APC_ROUTINE ApcRoutine,
__in_opt PVOID ApcArgument1,
__in_opt PVOID ApcArgument2,
__in_opt PVOID ApcArgument3
)

{
PETHREAD Thread;
NTSTATUS st;
KPROCESSOR_MODE Mode;
PKAPC Apc;

PAGED_CODE();

Mode = KeGetPreviousMode ();

st = ObReferenceObjectByHandle (ThreadHandle,
THREAD_SET_CONTEXT,
PsThreadType,
Mode,
&Thread,
NULL);
if (NT_SUCCESS (st)) {
st = STATUS_SUCCESS;
// A system (kernel) thread is not allowed to have user-mode APCs
if (IS_SYSTEM_THREAD (Thread)) {
st = STATUS_INVALID_HANDLE;
} else {
Apc = ExAllocatePoolWithQuotaTag (NonPagedPool | POOL_QUOTA_FAIL_INSTEAD_OF_RAISE,
sizeof(*Apc),
'pasP');

if (Apc == NULL) {
st = STATUS_NO_MEMORY;
} else {
// Initialize the APC for the target thread: KernelRoutine = PspQueueApcSpecialApc, NormalRoutine = ApcRoutine (user mode), NormalContext = ApcArgument1
KeInitializeApc (Apc,
&Thread->Tcb,
OriginalApcEnvironment,
PspQueueApcSpecialApc,
NULL,
(PKNORMAL_ROUTINE)ApcRoutine,
UserMode,
ApcArgument1);
// Then insert the APC into the thread's queue
if (!KeInsertQueueApc (Apc, ApcArgument2, ApcArgument3, 0)) {
ExFreePool (Apc);
st = STATUS_UNSUCCESSFUL;
}
}
}
ObDereferenceObject (Thread);
}

return st;
}

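The code above shows the whole flow of queuing an APC:

  • ObReferenceObjectByHandle resolves the thread handle to the thread object that will receive the APC.
  • A block of nonpaged pool is allocated to hold the KAPC.
  • KeInitializeApc sets the basic properties of the APC: the target thread, the environment, KernelRoutine, NormalRoutine, and so on.
  • NormalRoutine is ApcRoutine and NormalContext is ApcArgument1. As far as I can tell, when the APC is queued through QueueUserAPC, ApcRoutine is a dispatch stub and ApcArgument1 carries the real callback; when NtQueueApcThread is called directly, as in the example above, ApcRoutine is the callback itself.
  • KeInsertQueueApc puts the initialized APC into the thread's queue.
  • ObDereferenceObject drops the reference on the thread object so that it can be reclaimed in time.

Note that the KernelRoutine set here is PspQueueApcSpecialApc, whose only job is to free the APC object: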

VOID
PspQueueApcSpecialApc(
IN PKAPC Apc,
IN PKNORMAL_ROUTINE *NormalRoutine,
IN PVOID *NormalContext,
IN PVOID *SystemArgument1,
IN PVOID *SystemArgument2
)
{
PAGED_CODE();

UNREFERENCED_PARAMETER (NormalRoutine);
UNREFERENCED_PARAMETER (NormalContext);
UNREFERENCED_PARAMETER (SystemArgument1);
UNREFERENCED_PARAMETER (SystemArgument2);

ExFreePool(Apc);
}

To avoid a memory leak, the KernelRoutine has to free the memory that was allocated earlier. Next, let us see where the APC ends up:

BOOLEAN
KeInsertQueueApc (
__inout PRKAPC Apc,
__in_opt PVOID SystemArgument1,
__in_opt PVOID SystemArgument2,
__in KPRIORITY Increment
)
{

// ........
//
// If APC queuing is disabled or the APC is already inserted, then set
// inserted to FALSE. Otherwise, set the system parameter values in the
// APC object, insert the APC in the thread APC queue, and set inserted to
// true.
//

// Fail if the thread does not accept APCs or this APC is already in a queue
if ((Thread->ApcQueueable == FALSE) ||
(Apc->Inserted == TRUE)) {
Inserted = FALSE;

} else {
Apc->Inserted = TRUE;
Apc->SystemArgument1 = SystemArgument1;
Apc->SystemArgument2 = SystemArgument2;
KiInsertQueueApc(Apc, Increment);
Inserted = TRUE;
}

//
// Unlock the thread APC queue lock, exit the scheduler, and return
// whether the APC was inserted.
//

KeReleaseInStackQueuedSpinLockFromDpcLevel(&LockHandle);
KiExitDispatcher(LockHandle.OldIrql);
return Inserted;
}

VOID
FASTCALL
KiInsertQueueApc (
IN PKAPC Apc,
IN KPRIORITY Increment
)
{
// .....
Thread = Apc->Thread;
if (Apc->ApcStateIndex == InsertApcEnvironment) {
Apc->ApcStateIndex = Thread->ApcStateIndex;
}
// Pick the APC state of the thread that matches the APC's environment
ApcState = Thread->ApcStatePointer[Apc->ApcStateIndex];
if (Apc->NormalRoutine != NULL) {
// NormalRoutine != NULL: a normal kernel APC or a user APC; check ApcMode
if ((ApcMode != KernelMode) && (Apc->KernelRoutine == PsExitSpecialApc)) {
Thread->ApcState.UserApcPending = TRUE;
// User-mode thread-exit APC: put it at the head of the queue
InsertHeadList(&ApcState->ApcListHead[ApcMode],
&Apc->ApcListEntry);

} else {
// Every other APC with a NormalRoutine goes to the tail
InsertTailList(&ApcState->ApcListHead[ApcMode],
&Apc->ApcListEntry);
}

} else {
ListEntry = ApcState->ApcListHead[ApcMode].Blink;
while (ListEntry != &ApcState->ApcListHead[ApcMode]) {
ApcEntry = CONTAINING_RECORD(ListEntry, KAPC, ApcListEntry);
if (ApcEntry->NormalRoutine == NULL) {
break;
}

ListEntry = ListEntry->Blink;
}
// Special kernel APC (no NormalRoutine): insert after the last APC that has no NormalRoutine
InsertHeadList(ListEntry, &Apc->ApcListEntry);
}
}

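On insertion, KiInsertQueueApc first checks NormalRoutine. If it is not NULL, it checks ApcMode: an APC that is not KernelMode and whose KernelRoutine is PsExitSpecialApc is the user-mode thread-exit APC, so it is inserted at the head of the queue and UserApcPending is set; every other APC with a NormalRoutine (normal kernel APCs and ordinary user APCs) is appended to the tail. If NormalRoutine is NULL, the APC is a special kernel APC and is inserted after the last APC in the queue that also has no NormalRoutine, that is, behind the other special APCs and ahead of all normal ones.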
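
The three placement rules can be checked with a small portable model. This sketch uses std::list instead of LIST_ENTRY, and ToyApc, ToyApcState and insert_queue_apc are hypothetical names; the is_thread_exit flag stands in for the comparison KernelRoutine == PsExitSpecialApc.

```cpp
#include <iterator>
#include <list>
#include <string>

enum Mode { KernelMode = 0, UserMode = 1 };

struct ToyApc {
    std::string name;
    Mode mode;
    bool has_normal_routine;  // models NormalRoutine != NULL
    bool is_thread_exit;      // models KernelRoutine == PsExitSpecialApc
};

// One queue per processor mode, like ApcListHead[MaximumMode].
struct ToyApcState {
    std::list<ToyApc> queue[2];
    bool UserApcPending = false;
};

void insert_queue_apc(ToyApcState& state, const ToyApc& apc) {
    std::list<ToyApc>& q = state.queue[apc.mode];
    if (apc.has_normal_routine) {
        if (apc.mode != KernelMode && apc.is_thread_exit) {
            state.UserApcPending = true;
            q.push_front(apc);  // user-mode thread-exit APC jumps the queue
        } else {
            q.push_back(apc);   // normal kernel APCs and ordinary user APCs
        }
    } else {
        // Special APC: walk backwards to the last special APC and insert
        // right after it, which is ahead of every normal APC.
        std::list<ToyApc>::iterator pos = q.end();
        while (pos != q.begin()) {
            std::list<ToyApc>::iterator prev = std::prev(pos);
            if (!prev->has_normal_routine) {
                break;
            }
            pos = prev;
        }
        q.insert(pos, apc);
    }
}
```

Inserting a normal, a special, another normal and another special kernel APC in that order leaves the kernel queue with both special APCs first (in arrival order), followed by the two normal ones.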

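When APCs are dispatched

In Windows, APCs are delivered when the IRQL drops below APC_LEVEL, or when a system call, interrupt, or exception handler finishes and is about to return.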

_KiServiceExit:

cli ; disable interrupts
DISPATCH_USER_APC ebp, ReturnCurrentEax

;
; Exit from SystemService
;

EXIT_ALL NoRestoreSegs, NoRestoreVolatile

Every system call, interrupt, and exception return passes through this point and therefore through DISPATCH_USER_APC. The macro is defined as follows:

;++
;
; Macro Description:
;
; This macro is called before returning to user mode. It dispatches
; any pending user mode APCs.
;
; Arguments:
;
; TFrame - TrapFrame
; interrupts disabled
;
; Return Value:
;
;--

DISPATCH_USER_APC macro TFrame, ReturnCurrentEax
local a, b, c
c:
.errnz (EFLAGS_V86_MASK AND 0FF00FFFFh)

test byte ptr [TFrame]+TsEflags+2, EFLAGS_V86_MASK/010000h ; is previous mode v86?
jnz short b ; if nz, yes, go check for APC
test byte ptr [TFrame]+TsSegCs,MODE_MASK ; is previous mode user mode?
jz a ; No, previousmode=Kernel, jump out
b: mov ebx, PCR[PcPrcbData+PbCurrentThread]; get addr of current thread
mov byte ptr [ebx]+ThAlerted, 0 ; clear kernel mode alerted
cmp byte ptr [ebx]+ThApcState.AsUserApcPending, 0
je a ; if eq, no user APC pending

mov ebx, TFrame
ifnb <ReturnCurrentEax>
mov [ebx].TsEax, eax ; Store return code in trap frame
mov dword ptr [ebx]+TsSegFs, KGDT_R3_TEB OR RPL_MASK
mov dword ptr [ebx]+TsSegDs, KGDT_R3_DATA OR RPL_MASK
mov dword ptr [ebx]+TsSegEs, KGDT_R3_DATA OR RPL_MASK
mov dword ptr [ebx]+TsSegGs, 0
endif

;
; Save previous IRQL and set new priority level
;
RaiseIrql APC_LEVEL
push eax ; Save OldIrql

sti ; Allow higher priority ints

;
; call the APC delivery routine.
;
; ebx - Trap frame
; 0 - Null exception frame
; 1 - Previous mode
;
; call APC deliver routine
;

stdCall _KiDeliverApc, <1, 0, ebx>

pop ecx ; (ecx) = OldIrql
LowerIrql ecx

ifnb <ReturnCurrentEax>
mov eax, [ebx].TsEax ; Restore eax, just in case
endif

cli
jmp b

ALIGN 4
a:
endm

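The DISPATCH_USER_APC macro first checks for V86 mode. If the previous mode is not V86, it checks whether the thread is really returning to user mode (the previous mode recorded in the trap frame's CS); if not, it skips APC delivery.
It then checks whether the thread has a user-mode APC waiting (the UserApcPending flag), and exits if there is none.
Once delivery is confirmed, it saves the return value and the segment selectors in the trap frame, raises the IRQL to APC_LEVEL, and calls KiDeliverApc.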

Note the following check in the macro:

AsUserApcPending

It means that on this path KiDeliverApc is called only when a user-mode APC is pending.
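
The decision made at the top of the macro can be restated in a few lines of C++. This is a sketch of the logic only: ToyTrapFrame and should_deliver_user_apc are hypothetical names, and the two mask values are assumptions taken from the x86 architecture (the VM flag is bit 17 of EFLAGS, and the low bit of the CS selector separates ring 3 from ring 0).

```cpp
#include <cstdint>

// Assumed values, see the note above.
constexpr std::uint32_t EFLAGS_V86_MASK = 0x00020000;
constexpr std::uint32_t MODE_MASK = 0x1;

// Only the two trap frame fields the macro inspects.
struct ToyTrapFrame {
    std::uint32_t EFlags;
    std::uint32_t SegCs;
};

// Mirrors the tests at the top of DISPATCH_USER_APC: delivery is attempted
// only when returning to V86 or user mode with a user APC pending.
bool should_deliver_user_apc(const ToyTrapFrame& tf, bool user_apc_pending) {
    const bool v86 = (tf.EFlags & EFLAGS_V86_MASK) != 0;
    const bool from_user = (tf.SegCs & MODE_MASK) != 0;
    if (!v86 && !from_user) {
        return false;  // returning to kernel mode: never deliver user APCs
    }
    return user_apc_pending;
}
```

In the real macro this test sits inside a loop: after KiDeliverApc returns, the flag is examined again, so delivery repeats for as long as UserApcPending is set.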

Once these checks pass, APC delivery proper begins.

Control then reaches the function that delivers APCs:

VOID
KiDeliverApc (
IN KPROCESSOR_MODE PreviousMode,
IN PKEXCEPTION_FRAME ExceptionFrame,
IN PKTRAP_FRAME TrapFrame
)

/*++

Routine Description:

This function is called from the APC interrupt code and when one or
more of the APC pending flags are set at system exit and the previous
IRQL is zero. All special kernel APC's are delivered first, followed
by normal kernel APC's if one is not already in progress, and finally
if the user APC queue is not empty, the user APC pending flag is set,
and the previous mode is user, then a user APC is delivered. On entry
to this routine IRQL is set to APC_LEVEL.

N.B. The exception frame and trap frame addresses are only guaranteed
to be valid if, and only if, the previous mode is user.

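Note the TrapFrame parameter. A trap frame is the structure Windows uses to save the interrupted context when an interrupt, exception, or system call enters the kernel.
The function is very long, so we go through it step by step.

Checks before and after APC delivery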

    //
// If the thread was interrupted in the middle of the SLIST pop code,
// then back up the PC to the start of the SLIST pop.
//

if (TrapFrame != NULL) {
KiCheckForSListAddress(TrapFrame);
}

//
// Save the current thread trap frame address and set the thread trap
// frame address to the new trap frame. This will prevent a user mode
// exception from being raised within an APC routine.
//

Thread = KeGetCurrentThread();
OldTrapFrame = Thread->TrapFrame;
Thread->TrapFrame = TrapFrame;

//
// If special APC are not disabled, then attempt to deliver one or more
// APCs.
//

Process = Thread->ApcState.Process;
Thread->ApcState.KernelApcPending = FALSE;
if (Thread->SpecialApcDisable == 0) {

//
// If the kernel APC queue is not empty, then attempt to deliver a
// kernel APC.

// See the analysis below for details
}
CheckProcess:
if (Thread->ApcState.Process != Process) {
KeBugCheckEx(INVALID_PROCESS_ATTACH_ATTEMPT,
(ULONG_PTR)Process,
(ULONG_PTR)Thread->ApcState.Process,
(ULONG)Thread->ApcStateIndex,
(ULONG)KeIsExecutingDpc());
}

//
// Restore the previous thread trap frame address.
//

Thread->TrapFrame = OldTrapFrame;
return;

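The function first records the process of the current thread and sets KernelApcPending in its ApcState to FALSE, meaning that no kernel APC is left waiting (because all deliverable kernel APCs are about to be processed). If special APCs are disabled for the thread (SpecialApcDisable is nonzero), it goes straight to the epilogue: it checks that the thread is still attached to the same process as on entry (bugchecking otherwise) and restores the thread's previous trap frame pointer.

Walking the APC queue: kernel-mode APC delivery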

        KeMemoryBarrier();
while (IsListEmpty(&Thread->ApcState.ApcListHead[KernelMode]) == FALSE) {

//
// Raise IRQL to dispatcher level, lock the APC queue, and check
// if any kernel mode APC's can be delivered.
//
// If the kernel APC queue is now empty because of the removal of
// one or more entries, then release the APC lock, and attempt to
// deliver a user APC.
//

KeAcquireInStackQueuedSpinLock(&Thread->ApcQueueLock, &LockHandle);
NextEntry = Thread->ApcState.ApcListHead[KernelMode].Flink;
if (NextEntry == &Thread->ApcState.ApcListHead[KernelMode]) {
KeReleaseInStackQueuedSpinLock(&LockHandle);
break;
}

//
// Clear kernel APC pending, get the address of the APC object,
// and determine the type of APC.
//
// N.B. Kernel APC pending must be cleared each time the kernel
// APC queue is found to be non-empty.
//

Thread->ApcState.KernelApcPending = FALSE;
Apc = CONTAINING_RECORD(NextEntry, KAPC, ApcListEntry);
ReadForWriteAccess(Apc);
KernelRoutine = Apc->KernelRoutine;
NormalRoutine = Apc->NormalRoutine;
NormalContext = Apc->NormalContext;
SystemArgument1 = Apc->SystemArgument1;
SystemArgument2 = Apc->SystemArgument2;
if (NormalRoutine == (PKNORMAL_ROUTINE)NULL) {

//
// First entry in the kernel APC queue is a special kernel APC.
// Remove the entry from the APC queue, set its inserted state
// to FALSE, release dispatcher database lock, and call the kernel
// routine. On return raise IRQL to dispatcher level and lock
// dispatcher database lock.
//

RemoveEntryList(NextEntry);
Apc->Inserted = FALSE;
KeReleaseInStackQueuedSpinLock(&LockHandle);
(KernelRoutine)(Apc,
&NormalRoutine,
&NormalContext,
&SystemArgument1,
&SystemArgument2);

#if DBG

if (KeGetCurrentIrql() != LockHandle.OldIrql) {
KeBugCheckEx(IRQL_UNEXPECTED_VALUE,
KeGetCurrentIrql() << 16 | LockHandle.OldIrql << 8,
(ULONG_PTR)KernelRoutine,
(ULONG_PTR)Apc,
(ULONG_PTR)NormalRoutine);
}
#endif

} else {

//
// First entry in the kernel APC queue is a normal kernel APC.
// If there is not a normal kernel APC in progress and kernel
// APC's are not disabled, then remove the entry from the APC
// queue, set its inserted state to FALSE, release the APC queue
// lock, call the specified kernel routine, set kernel APC in
// progress, lower the IRQL to zero, and call the normal kernel
// APC routine. On return raise IRQL to dispatcher level, lock
// the APC queue, and clear kernel APC in progress.
//

if ((Thread->ApcState.KernelApcInProgress == FALSE) &&
(Thread->KernelApcDisable == 0)) {

RemoveEntryList(NextEntry);
Apc->Inserted = FALSE;
KeReleaseInStackQueuedSpinLock(&LockHandle);
(KernelRoutine)(Apc,
&NormalRoutine,
&NormalContext,
&SystemArgument1,
&SystemArgument2);

#if DBG

if (KeGetCurrentIrql() != LockHandle.OldIrql) {
KeBugCheckEx(IRQL_UNEXPECTED_VALUE,
KeGetCurrentIrql() << 16 | LockHandle.OldIrql << 8 | 1,
(ULONG_PTR)KernelRoutine,
(ULONG_PTR)Apc,
(ULONG_PTR)NormalRoutine);
}

#endif

if (NormalRoutine != (PKNORMAL_ROUTINE)NULL) {
Thread->ApcState.KernelApcInProgress = TRUE;
KeLowerIrql(0);
(NormalRoutine)(NormalContext,
SystemArgument1,
SystemArgument2);

KeRaiseIrql(APC_LEVEL, &LockHandle.OldIrql);
}

Thread->ApcState.KernelApcInProgress = FALSE;

} else {
KeReleaseInStackQueuedSpinLock(&LockHandle);
goto CheckProcess;
}
}

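The loop walks the kernel-mode list in the thread's ApcState.ApcListHead. While the list is not empty, it takes the APC queue lock, re-checks that the list is still non-empty, and takes the APC at the head of the queue. It then checks whether that APC has a NormalRoutine (these are kernel-mode APCs, but NormalRoutine still has to be checked):

  • If there is no NormalRoutine, this is a special kernel APC: it is removed from the queue and its KernelRoutine is called directly.
  • If there is a NormalRoutine, this is a normal kernel APC. It is delivered only if KernelApcInProgress is FALSE and KernelApcDisable is 0. In that case the APC is removed from the queue and marked as not inserted, and KernelRoutine is called first. If NormalRoutine is still non-NULL afterwards, KernelApcInProgress is set to TRUE, the IRQL is lowered to 0, NormalRoutine is called, and the IRQL is raised back to APC_LEVEL. Finally KernelApcInProgress is set back to FALSE.

Repeating this drains the deliverable kernel-mode APCs of the thread.
To summarize, kernel-mode APC delivery happens under these conditions:

  • It takes place on the way out of a system call, exception, or interrupt, or when the IRQL drops below APC_LEVEL.

  • The kernel-mode APC queue is not empty.

  • The KernelRoutine of every delivered APC is called.

  • Before a NormalRoutine is called, the following are checked:

    • KernelApcInProgress is FALSE, that is, no NormalRoutine is currently running.
    • KernelApcDisable is 0, that is, normal kernel APCs are not disabled.

Note: normal kernel APCs do not nest. While one NormalRoutine is running, KernelApcInProgress keeps a nested KiDeliverApc from starting another; the remaining ones run one after another once it returns.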
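
The conditions summarized above can be exercised with a small model of the kernel-mode loop. This is a hedged sketch: locking, IRQL changes and the KernelRoutine's ability to clear NormalRoutine are left out, and ToyThread, ToyKernelApc and deliver_kernel_apcs are hypothetical names.

```cpp
#include <deque>
#include <string>
#include <vector>

struct ToyKernelApc {
    std::string name;
    bool has_normal_routine;  // false means a special kernel APC
};

struct ToyThread {
    std::deque<ToyKernelApc> kernel_queue;  // models ApcListHead[KernelMode]
    bool KernelApcInProgress = false;
    int KernelApcDisable = 0;
    int SpecialApcDisable = 0;
};

// Returns the routines that were called, in order.
std::vector<std::string> deliver_kernel_apcs(ToyThread& t) {
    std::vector<std::string> log;
    if (t.SpecialApcDisable != 0) {
        return log;  // nothing is delivered while special APCs are disabled
    }
    while (!t.kernel_queue.empty()) {
        ToyKernelApc apc = t.kernel_queue.front();
        if (apc.has_normal_routine &&
            (t.KernelApcInProgress || t.KernelApcDisable != 0)) {
            break;  // a blocked normal APC at the head stops delivery
        }
        t.kernel_queue.pop_front();  // removed before any routine runs
        log.push_back(apc.name + ":KernelRoutine");
        if (apc.has_normal_routine) {
            t.KernelApcInProgress = true;   // no nested normal APC from here
            log.push_back(apc.name + ":NormalRoutine");
            t.KernelApcInProgress = false;
        }
    }
    return log;
}
```

With KernelApcDisable nonzero, a queue holding one special and one normal APC delivers only the special one; once the counter returns to 0, a second pass runs the normal APC's KernelRoutine and then its NormalRoutine.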

User-mode APC delivery
//
// Kernel APC queue is empty. If the previous mode is user, user APC
// pending is set, and the user APC queue is not empty, then remove
// the first entry from the user APC queue, set its inserted state to
// FALSE, clear user APC pending, release the dispatcher database lock,
// and call the specified kernel routine. If the normal routine address
// is not NULL on return from the kernel routine, then initialize the
// user mode APC context and return. Otherwise, check to determine if
// another user mode APC can be processed.
//
// N.B. There is no race condition associated with checking the APC
// queue outside the APC lock. User APCs are always delivered at
// system exit and never interrupt the execution of the thread
// in the kernel.
//
if ((PreviousMode == UserMode) &&
(IsListEmpty(&Thread->ApcState.ApcListHead[UserMode]) == FALSE) &&
(Thread->ApcState.UserApcPending != FALSE)) {

//
// Raise IRQL to dispatcher level, lock the APC queue, and deliver
// a user mode APC.
//

KeAcquireInStackQueuedSpinLock(&Thread->ApcQueueLock, &LockHandle);

//
// If the user APC queue is now empty because of the removal of
// one or more entries, then release the APC lock and exit.
//

Thread->ApcState.UserApcPending = FALSE;
NextEntry = Thread->ApcState.ApcListHead[UserMode].Flink;
if (NextEntry == &Thread->ApcState.ApcListHead[UserMode]) {
KeReleaseInStackQueuedSpinLock(&LockHandle);
goto CheckProcess;
}

Apc = CONTAINING_RECORD(NextEntry, KAPC, ApcListEntry);
ReadForWriteAccess(Apc);
KernelRoutine = Apc->KernelRoutine;
NormalRoutine = Apc->NormalRoutine;
NormalContext = Apc->NormalContext;
SystemArgument1 = Apc->SystemArgument1;
SystemArgument2 = Apc->SystemArgument2;
RemoveEntryList(NextEntry);
Apc->Inserted = FALSE;
KeReleaseInStackQueuedSpinLock(&LockHandle);
(KernelRoutine)(Apc,
&NormalRoutine,
&NormalContext,
&SystemArgument1,
&SystemArgument2);

if (NormalRoutine == (PKNORMAL_ROUTINE)NULL) {
KeTestAlertThread(UserMode);

} else {
KiInitializeUserApc(ExceptionFrame,
TrapFrame,
NormalRoutine,
NormalContext,
SystemArgument1,
SystemArgument2);
}
}

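The biggest difference from kernel-mode delivery is that user-mode delivery is not wrapped in a while loop, so only one user-mode APC is set up per pass. Like the NormalRoutine of a kernel-mode APC, it is subject to some conditions:

  • The previous mode is UserMode.
  • The UserMode APC queue is not empty.
  • UserApcPending is not FALSE, that is, a user-mode APC is waiting to be delivered.

As in the kernel-mode case, KernelRoutine is called first and NormalRoutine is then checked. If it is NULL, KeTestAlertThread(UserMode) is called, which sets UserApcPending to TRUE again if more user-mode APCs are queued, so that they can be delivered later. If it is not NULL, KiInitializeUserApc is called: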

VOID
KiInitializeUserApc (
IN PKEXCEPTION_FRAME ExceptionFrame,
IN PKTRAP_FRAME TrapFrame,
IN PKNORMAL_ROUTINE NormalRoutine,
IN PVOID NormalContext,
IN PVOID SystemArgument1,
IN PVOID SystemArgument2
)
{

PCONTEXT ContextRecord;
EXCEPTION_RECORD ExceptionRecord;
PMACHINE_FRAME MachineFrame;

//
// Transfer the context information to the user stack, initialize the
// APC routine parameters, and modify the trap frame so execution will
// continue in user mode at the user mode APC dispatch routine.
//

try {

//
// If the exception frame address is NULL, then the context copy
// has been bypassed and the context is already on the user stack.
//

if (ExceptionFrame == NULL) {

//
// The address of the already copied context record and the real
// exception frame address are passed via unused fields in the
// trap frame.
//
// N.B. The context record has been probed for write access.
//

ContextRecord = (PCONTEXT)TrapFrame->ContextRecord;
ExceptionFrame = (PKEXCEPTION_FRAME)TrapFrame->ExceptionFrame;

} else {

//
// Compute address of aligned machine frame, compute address of
// context record, and probe user stack for writeability.
//

// MACHINE_FRAME holds the Rip/Rsp taken from the trap frame
MachineFrame =
(PMACHINE_FRAME)((TrapFrame->Rsp - sizeof(MACHINE_FRAME)) & ~STACK_ROUND);

ContextRecord = (PCONTEXT)((ULONG64)MachineFrame - sizeof(CONTEXT));
ProbeForWriteSmallStructure(ContextRecord,
sizeof(MACHINE_FRAME) + CONTEXT_LENGTH,
STACK_ALIGN);

//
// Move machine state from trap and exception frames to the context
// record on the user stack.
//

ContextRecord->ContextFlags = CONTEXT_FULL | CONTEXT_DEBUG_REGISTERS;
KeContextFromKframes(TrapFrame, ExceptionFrame, ContextRecord); // copies the frames into the CONTEXT record on the user stack

//
// Fill in machine frame information.
// MachineFrame now holds the interrupted user-mode Rsp/Rip

MachineFrame->Rsp = ContextRecord->Rsp;
MachineFrame->Rip = ContextRecord->Rip;
}

//
// Initialize the user APC parameters.
//

ContextRecord->P1Home = (ULONG64)NormalContext;
ContextRecord->P2Home = (ULONG64)SystemArgument1;
ContextRecord->P3Home = (ULONG64)SystemArgument2;
ContextRecord->P4Home = (ULONG64)NormalRoutine;

//
// Set the address new stack pointer in the current trap frame and
// the continuation address so control will be transferred to the user
// APC dispatcher.
//

TrapFrame->Rsp = (ULONG64)ContextRecord;
TrapFrame->Rip = (ULONG64)KeUserApcDispatcher;

} except (KiCopyInformation(&ExceptionRecord,
(GetExceptionInformation())->ExceptionRecord)) {

//
// Lower the IRQL to PASSIVE_LEVEL, set the exception address to
// the current program address, and raise an exception by calling
// the exception dispatcher.
//
// N.B. The IRQL is lowered to PASSIVE_LEVEL to allow APC interrupts
// during the dispatching of the exception. The current thread
// will be terminated during the dispatching of the exception,
// but lowering of the IRQL is required to enable the debugger
// to obtain the context of the current thread.
//

KeLowerIrql(PASSIVE_LEVEL);
ExceptionRecord.ExceptionAddress = (PVOID)(TrapFrame->Rip);
KiDispatchException(&ExceptionRecord,
ExceptionFrame,
TrapFrame,
UserMode,
TRUE);
}

return;
}

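The function copies the interrupted user context into a CONTEXT record on the user stack, stores the APC parameters in its P1Home to P4Home fields, and rewrites Rsp and Rip in the trap frame.

After that, control eventually returns to _KiServiceExit. The function does not change any return value, so how does the APC get to run? The key is the TrapFrame structure.

TrapFrame

This structure is central to entering and leaving the Windows kernel. Depending on why the kernel was entered, it describes an interrupt, an exception, or a system service, and it is built in each of those cases; its type is KTRAP_FRAME (pointer type PKTRAP_FRAME). It stores the registers as they were before entering the kernel, that is, the interrupted execution context.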

typedef struct _KTRAP_FRAME {

//
// Home address for the parameter registers.
//

ULONG64 P1Home;
ULONG64 P2Home;
ULONG64 P3Home;
ULONG64 P4Home;
ULONG64 P5;

//
// Previous processor mode (system services only) and previous IRQL
// (interrupts only).
//

KPROCESSOR_MODE PreviousMode;
KIRQL PreviousIrql;

//
// Page fault load/store indicator.
//

UCHAR FaultIndicator;

//
// Exception active indicator.
//
// 0 - interrupt frame.
// 1 - exception frame.
// 2 - service frame.
//

UCHAR ExceptionActive;

//
// Floating point state.
//

ULONG MxCsr;

//
// Volatile registers.
//
// N.B. These registers are only saved on exceptions and interrupts. They
// are not saved for system calls.
//

ULONG64 Rax;
ULONG64 Rcx;
ULONG64 Rdx;
ULONG64 R8;
ULONG64 R9;
ULONG64 R10;
ULONG64 R11;

//
// Gsbase is only used if the previous mode was kernel.
//
// GsSwap is only used if the previous mode was user.
//

union {
ULONG64 GsBase;
ULONG64 GsSwap;
};

//
// Volatile floating registers.
//
// N.B. These registers are only saved on exceptions and interrupts. They
// are not saved for system calls.
//

M128A Xmm0;
M128A Xmm1;
M128A Xmm2;
M128A Xmm3;
M128A Xmm4;
M128A Xmm5;

//
// Page fault address or context record address if user APC bypass.
//

union {
ULONG64 FaultAddress;
ULONG64 ContextRecord;
ULONG64 TimeStamp;
};

//
// Debug registers.
//

ULONG64 Dr0;
ULONG64 Dr1;
ULONG64 Dr2;
ULONG64 Dr3;
ULONG64 Dr6;
ULONG64 Dr7;

//
// Special debug registers.
//
// N.B. Either AMD64 or EM64T information is stored in the following locations.

union {
struct {
ULONG64 DebugControl;
ULONG64 LastBranchToRip;
ULONG64 LastBranchFromRip;
ULONG64 LastExceptionToRip;
ULONG64 LastExceptionFromRip;
};

struct {
ULONG64 LastBranchControl;
ULONG LastBranchMSR;
};
};

//
// Segment registers
//

USHORT SegDs;
USHORT SegEs;
USHORT SegFs;
USHORT SegGs;
//
// Previous trap frame address.
//
ULONG64 TrapFrame;
//
// Saved nonvolatile registers RBX, RDI and RSI. These registers are only
// saved in system service trap frames.
//
ULONG64 Rbx;
ULONG64 Rdi;
ULONG64 Rsi;
//
// Saved nonvolatile register RBP. This register is used as a frame
// pointer during trap processing and is saved in all trap frames.
//
ULONG64 Rbp;
//
// Information pushed by hardware.
//
// N.B. The error code is not always pushed by hardware. For those cases
// where it is not pushed by hardware a dummy error code is allocated
// on the stack.
//
union {
ULONG64 ErrorCode;
ULONG64 ExceptionFrame;
};

ULONG64 Rip;
USHORT SegCs;
USHORT Fill1[3];
ULONG EFlags;
ULONG Fill2;
ULONG64 Rsp;
USHORT SegSs;
USHORT Fill3[1];

//
// Copy of the global patch cycle at the time of the fault. Filled in by the
// invalid opcode and general protection fault routines.
//
LONG CodePatchCycle;
} KTRAP_FRAME, *PKTRAP_FRAME;

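Once KiDeliverApc has finished, the _Ki exit routine completes and the thread leaves the system call and returns to user mode. Normally TrapFrame->Rip would point at the original return address, but it has been changed to KeUserApcDispatcher (which refers to KiUserApcDispatcher in ntdll), so the user-mode APC gets its chance to run: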

/* Function body taken from ReactOS */
PUBLIC KiUserApcDispatcher
.PROC KiUserApcDispatcher
.endprolog
/* We enter with a 16 byte aligned stack */

mov rcx, [rsp + CONTEXT_P1Home] /* NormalContext */
mov rdx, [rsp + CONTEXT_P2Home] /* SystemArgument1 */
mov r8, [rsp + CONTEXT_P3Home] /* SystemArgument2 */
lea r9, [rsp] /* Context */
call qword ptr [rsp + CONTEXT_P4Home] /* NormalRoutine */

/* NtContinue(Context, TRUE); */
lea rcx, [rsp]
mov dl, 1
call NtContinue

nop
int 3
.ENDP

In other words, this calls

NormalRoutine(NormalContext, SystemArgument1, SystemArgument2, Context);
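
The redirect-and-resume trick can be modelled without any real registers. In this toy sketch the instruction pointer is just a label, and ToyTrapFrame, ToyContext, ToyUserApc, initialize_user_apc and user_apc_dispatcher are hypothetical names; it only illustrates the idea of saving the interrupted location, returning into the dispatcher, and then continuing from the saved context.

```cpp
#include <functional>
#include <string>

// "Rip" is a label naming where user mode will resume.
struct ToyContext { std::string Rip; };    // saved user context
struct ToyTrapFrame { std::string Rip; };  // where the kernel returns to

struct ToyUserApc {
    std::function<void(int)> NormalRoutine;
    int NormalContext;
};

// Models KiInitializeUserApc: remember where the thread was, then point the
// return address at the user-mode dispatcher.
ToyContext initialize_user_apc(ToyTrapFrame& tf) {
    ToyContext saved{tf.Rip};
    tf.Rip = "KiUserApcDispatcher";
    return saved;
}

// Models KiUserApcDispatcher followed by NtContinue: run the routine, then
// resume at the saved context.
void user_apc_dispatcher(ToyTrapFrame& tf, const ToyUserApc& apc,
                         const ToyContext& saved) {
    apc.NormalRoutine(apc.NormalContext);
    tf.Rip = saved.Rip;  // continuing restores the interrupted location
}
```

The interrupted code never sees the detour: from its point of view the system call simply returned, with the APC having run somewhere in between.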

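The APC delivery loop and the final exit

By modifying the trap frame, the kernel neatly makes the return to user mode land in the user-mode APC dispatcher. When the APC routine returns, the dispatcher calls NtContinue, which looks like this: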

;++
;
; NTSTATUS
; NtContinue (
; IN PCONTEXT ContextRecord,
; IN BOOLEAN TestAlert
; )
;
; Routine Description:
;
; This routine is called as a system service to continue execution after
; an exception has occurred. Its function is to transfer information from
; the specified context record into the trap frame that was built when the
; system service was executed, and then exit the system as if an exception
; had occurred.
;
; WARNING - Do not call this routine directly, always call it as
; ZwContinue!!! This is required because it needs the
; trapframe built by KiSystemService.
;
; Arguments:
;
; KTrapFrame (ebp+0: after setup) -> base of KTrapFrame
;
; ContextRecord (ebp+8: after setup) = Supplies a pointer to a context rec.
;
; TestAlert (esp+12: after setup) = Supplies a boolean value that specifies
; whether alert should be tested for the previous processor mode.
;
; Return Value:
;
; Normally there is no return from this routine. However, if the specified
; context record is misaligned or is not accessible, then the appropriate
; status code is returned.
;
;--

NcTrapFrame equ [ebp + 0]
NcContextRecord equ [ebp + 8]
NcTestAlert equ [ebp + 12]

align dword
cPublicProc _NtContinue ,2

push ebp

;
; Restore old trap frame address since this service exits directly rather
; than returning.
;

mov ebx, PCR[PcPrcbData+PbCurrentThread] ; get current thread address
mov edx, [ebp].TsEdx ; restore old trap frame address
mov [ebx].ThTrapFrame, edx ;

;
; Call KiContinue to load ContextRecord into TrapFrame. On x86 TrapFrame
; is an atomic entity, so we don't need to allocate any other space here.
;
; KiContinue(NcContextRecord, 0, NcTrapFrame)
;

mov ebp,esp
mov eax, NcTrapFrame
mov ecx, NcContextRecord
stdCall _KiContinue, <ecx, 0, eax>
or eax,eax ; return value 0?
jnz short Nc20 ; KiContinue failed, go report error

;
; Check to determine if alert should be tested for the previous processor mode.
;

cmp byte ptr NcTestAlert,0 ; Check test alert flag
je short Nc10 ; if z, don't test alert, go Nc10
mov al,byte ptr [ebx]+ThPreviousMode ; No need to xor eax, eax.
stdCall _KeTestAlertThread, <eax> ; test alert for current thread
Nc10: pop ebp ; (ebp) -> TrapFrame
mov esp,ebp ; (esp) = (ebp) -> trapframe
jmp _KiServiceExit2 ; common exit

Nc20: pop ebp ; (ebp) -> TrapFrame
mov esp,ebp ; (esp) = (ebp) -> trapframe
jmp _KiServiceExit ; common exit

stdENDP _NtContinue

NTSTATUS
KiContinue (
IN PCONTEXT ContextRecord,
IN PKEXCEPTION_FRAME ExceptionFrame,
IN PKTRAP_FRAME TrapFrame
)
/*++
Routine Description:

This function is called to copy the specified context frame to the
specified exception and trap frames for the continue system service.

Arguments:
ContextRecord - Supplies a pointer to a context record.
ExceptionFrame - Supplies a pointer to an exception frame.
TrapFrame - Supplies a pointer to a trap frame.
Return Value:

STATUS_ACCESS_VIOLATION is returned if the context record is not readable
from user mode.
STATUS_DATATYPE_MISALIGNMENT is returned if the context record is not
properly aligned.
STATUS_SUCCESS is returned if the context frame is copied successfully
to the specified exception and trap frames.
--*/
{

KIRQL OldIrql;
NTSTATUS Status;
//
// Synchronize with other context operations.
//
// If the current IRQL is less than APC_LEVEL, then raise IRQL to APC level.
//
OldIrql = KeGetCurrentIrql();
if (OldIrql < APC_LEVEL) {
KfRaiseIrql(APC_LEVEL);
}
//
// If the previous mode was not kernel mode, then use wrapper function
// to copy context to kernel frames. Otherwise, copy context to kernel
// frames directly.
//
Status = STATUS_SUCCESS;
if (KeGetPreviousMode() != KernelMode) { // the system call was issued from user mode
try {
KiContinuePreviousModeUser(ContextRecord,
ExceptionFrame,
TrapFrame);

} except(EXCEPTION_EXECUTE_HANDLER) {
Status = GetExceptionCode();
}

} else {
KeContextToKframes(TrapFrame,
ExceptionFrame,
ContextRecord,
ContextRecord->ContextFlags,
KernelMode);
}
//
// If the old IRQL was less than APC level, then lower the IRQL to its
// previous value.
//
if (OldIrql < APC_LEVEL) {
KeLowerIrql(OldIrql);
}
return Status;
}

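As the listings show, NtContinue does three things:

  • Copies the supplied ContextRecord into the TrapFrame
  • Tests the current thread for alerts (KeTestAlertThread), which puts the thread back into a state where user-mode APCs can be delivered
  • Jumps to KiServiceExit2, the system-call exit path.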
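The first of these steps, together with the previous-mode check in the KiContinue listing, can be sketched in portable C++. This is only a rough model under my own assumptions: Context and TrapFrame are cut down to three registers, and continue_sketch and context_to_frame are hypothetical names, not kernel routines. The one idea I wanted to capture is that a context coming from user mode is validated and captured first, while a kernel-mode context is used directly.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

enum class Mode { Kernel, User };
enum class Status { Success, DatatypeMisalignment };

// Hypothetical, heavily reduced stand-ins for CONTEXT and KTRAP_FRAME.
struct Context   { std::uint64_t Rip, Rsp, Rbp; };
struct TrapFrame { std::uint64_t Rip, Rsp, Rbp; };

// Plays the role of KeContextToKframes: load the register image into the frame.
void context_to_frame(TrapFrame& frame, const Context& ctx) {
    frame.Rip = ctx.Rip;
    frame.Rsp = ctx.Rsp;
    frame.Rbp = ctx.Rbp;
}

Status continue_sketch(const void* record, Mode previousMode, TrapFrame& frame) {
    assert(record != nullptr);
    if (previousMode == Mode::User) {
        // A user-supplied pointer is not trusted: check its alignment, then
        // take a private copy before anything is written to the frame.
        if (reinterpret_cast<std::uintptr_t>(record) % alignof(Context) != 0) {
            return Status::DatatypeMisalignment;
        }
        Context captured;
        std::memcpy(&captured, record, sizeof(captured));
        context_to_frame(frame, captured);
    } else {
        // A kernel-mode caller is trusted, so the record is used in place.
        context_to_frame(frame, *static_cast<const Context*>(record));
    }
    return Status::Success;
}
```

A misaligned pointer is rejected before the frame is touched, which is the behaviour the STATUS_DATATYPE_MISALIGNMENT note in the KiContinue header describes.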

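NtContinue first restores the old trap frame address in the thread object and then calls KiContinue. KiContinue checks whether the call came from user mode; if it did, it calls KiContinuePreviousModeUser, which copies the user-supplied context into kernel space, and then KeContextToKframes writes the ContextRecord into the TrapFrame, which in effect restores the real calling context. Execution then enters KiServiceExit2: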

;++
;
; _KiServiceExit2 - same as _KiServiceExit BUT the full trap_frame
; context is restored
;
;--
public _KiServiceExit2
_KiServiceExit2:

cli ; disable interrupts
DISPATCH_USER_APC ebp

;
; Exit from SystemService
;

EXIT_ALL ; RestoreAll

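This routine does essentially the same thing as KiServiceExit, except that the final exit from the system service no longer passes NoRestoreSegs and NoRestoreVolatile, because the full Context was already written into the trap frame by NtContinue. It also invokes DISPATCH_USER_APC, which means any user-mode APCs that have not run yet keep being delivered until the APC queue is empty.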
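The whole round trip (redirect Rip, run the APC, NtContinue, exit path again) can be modelled as a small loop. What follows is a toy model under my own assumptions, not kernel code: ThreadModel, deliver_one, drain and the kDispatcher address are invented for illustration, and each APC is reduced to a plain callable.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>

// Hypothetical address standing in for the user-mode APC dispatcher.
constexpr std::uint64_t kDispatcher = 0xD15;

struct ThreadModel {
    std::uint64_t trapRip = 0;    // where the thread resumes in user mode
    bool userApcPending = false;  // models ApcState.UserApcPending
    std::deque<std::function<void()>> userApcs;
};

// One trip through the exit path: returns false when nothing was delivered.
bool deliver_one(ThreadModel& t) {
    if (!t.userApcPending || t.userApcs.empty()) return false;
    std::function<void()> routine = t.userApcs.front();
    t.userApcs.pop_front();
    t.userApcPending = false;

    const std::uint64_t savedRip = t.trapRip;  // kept in the CONTEXT on the user stack
    t.trapRip = kDispatcher;                   // the return lands in the dispatcher
    assert(static_cast<bool>(routine));
    routine();                                 // the dispatcher calls NormalRoutine

    t.trapRip = savedRip;                      // NtContinue copies the context back
    t.userApcPending = !t.userApcs.empty();    // TestAlert = TRUE re-arms delivery
    return true;
}

// Keep exiting through the APC-aware path until the queue is drained.
int drain(ThreadModel& t) {
    int delivered = 0;
    while (deliver_one(t)) ++delivered;
    return delivered;
}
```

With three APCs queued and userApcPending set, drain runs them in FIFO order and the thread ends up with its original trapRip, which is the "until the APC queue is empty" behaviour described above.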

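APC invocation summary

APC structures

APCs come in two kinds: kernel-mode APCs and user-mode APCs.
Kernel-mode APCs again come in two kinds: those with a NormalRoutine and those without one.
Every APC, whichever kind it is, has a KernelRoutine.
In memory, APCs are kept on doubly linked lists.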

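Each thread holds an ApcState and a SavedApcState. Normally ApcState records the APC requests for the thread's own process; while the thread is attached to another process, ApcState serves the attached process and the original process's APC requests are kept in SavedApcState.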
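As a rough picture of that layout, here is a small sketch of an APC state with one doubly linked list per processor mode. The names ending in Sketch and the lower-case helpers are mine, and the structures keep only the fields mentioned in this article. One deliberate simplification: the list entry is the first member of ApcSketch so that a cast can recover the APC from its entry, whereas real kernel code uses CONTAINING_RECORD.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as LIST_ENTRY: a circular doubly linked list node.
struct ListEntry { ListEntry* Flink; ListEntry* Blink; };

void init_list_head(ListEntry* head) { head->Flink = head; head->Blink = head; }

void insert_tail(ListEntry* head, ListEntry* entry) {
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

enum Mode { KernelMode = 0, UserMode = 1, MaximumMode = 2 };

struct ApcSketch {
    ListEntry ApcListEntry;  // kept first so the cast in first_apc is valid
    Mode ApcMode;
    int Tag;                 // demo payload, not a real KAPC field
};

struct ApcStateSketch {
    ListEntry ApcListHead[MaximumMode];  // [KernelMode] and [UserMode] queues
    bool KernelApcInProgress;
    bool UserApcPending;
};

void init_apc_state(ApcStateSketch* s) {
    for (int m = 0; m < MaximumMode; ++m) init_list_head(&s->ApcListHead[m]);
    s->KernelApcInProgress = false;
    s->UserApcPending = false;
}

// Append the APC to the queue that matches its mode.
void queue_apc(ApcStateSketch* s, ApcSketch* apc) {
    assert(apc->ApcMode == KernelMode || apc->ApcMode == UserMode);
    insert_tail(&s->ApcListHead[apc->ApcMode], &apc->ApcListEntry);
}

std::size_t count_apcs(const ApcStateSketch* s, Mode mode) {
    std::size_t n = 0;
    const ListEntry* head = &s->ApcListHead[mode];
    for (const ListEntry* e = head->Flink; e != head; e = e->Flink) ++n;
    return n;
}

ApcSketch* first_apc(ApcStateSketch* s, Mode mode) {
    ListEntry* head = &s->ApcListHead[mode];
    if (head->Flink == head) return nullptr;  // empty queue
    return reinterpret_cast<ApcSketch*>(head->Flink);
}
```

Queueing two user-mode APCs and one kernel-mode APC leaves two entries on the user list and one on the kernel list, with the first user APC still at the head.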

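APC invocation flow

Timing

General condition: a system call, exception, or interrupt occurs, and a user-mode APC is present at that moment.

NormalRoutine of a kernel-mode APC: ApcState->KernelApcInProgress == FALSE, that is, no other kernel-mode APC routine is currently running.

User-mode APC as a whole: UserApcPending is not FALSE, meaning a user-mode APC is pending delivery, which is the case once the thread has been in an alertable state.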

Invocation order

kernel-mode APC without NormalRoutine -> kernel-mode APC with NormalRoutine -> user-mode APC

Within a single APC, KernelRoutine is called first, then NormalRoutine.
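The timing conditions and the ordering above fit into one small simulation. This is a simplified model of my own, not the WRK delivery routine: Apc, ThreadState and deliver are hypothetical names, the queue is a single vector instead of per-mode lists, and "calling" a routine only appends to a log. At most one user-mode APC is handed out per pass, matching the loop described earlier.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Apc {
    std::string name;
    bool userMode;          // true for a user-mode APC
    bool hasNormalRoutine;  // false marks a kernel APC without NormalRoutine
};

struct ThreadState {
    std::vector<Apc> queue;
    bool kernelApcInProgress = false;
    bool userApcPending = false;
};

// Delivers what may run right now and returns the call log in order.
std::vector<std::string> deliver(ThreadState& t, bool returningToUser) {
    std::vector<std::string> log;
    std::vector<Apc> remaining;

    // Pass 1: kernel APCs without NormalRoutine only run their KernelRoutine.
    for (const Apc& a : t.queue) {
        assert(!a.userMode || a.hasNormalRoutine);  // user APCs need a NormalRoutine
        if (!a.userMode && !a.hasNormalRoutine) log.push_back(a.name + ":KernelRoutine");
        else remaining.push_back(a);
    }
    t.queue.swap(remaining);
    remaining.clear();

    // Pass 2: kernel APCs with NormalRoutine, unless one is already in progress.
    for (const Apc& a : t.queue) {
        if (!a.userMode && !t.kernelApcInProgress) {
            log.push_back(a.name + ":KernelRoutine");
            log.push_back(a.name + ":NormalRoutine");
        } else {
            remaining.push_back(a);
        }
    }
    t.queue.swap(remaining);
    remaining.clear();

    // Pass 3: one user APC, only on the way back to user mode and only if pending.
    bool delivered = false;
    for (const Apc& a : t.queue) {
        if (a.userMode && returningToUser && t.userApcPending && !delivered) {
            log.push_back(a.name + ":KernelRoutine");
            log.push_back(a.name + ":NormalRoutine");
            delivered = true;
        } else {
            remaining.push_back(a);
        }
    }
    t.queue.swap(remaining);
    if (delivered) t.userApcPending = false;
    return log;
}
```

Queue order does not matter here: even if the user-mode APC was queued first, the log starts with the kernel APC that has no NormalRoutine and ends with the user-mode one.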

Usage

User-mode APC

Queueing the APC

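// NtQueueApcThread is an undocumented ntdll export; all three APC arguments are pointer-sized.
NtQueueApcThread = (NTSTATUS(NTAPI *)(HANDLE, PVOID, PVOID, PVOID, PVOID)) GetProcAddress(hNtdll, "NtQueueApcThread");
if (NtQueueApcThread == NULL) {
    std::cout << "Could not get NtQueueApcThread" << std::endl;
} else {
    // Queue ApcTest on the current thread; it only runs once the thread becomes alertable.
    HANDLE hThread = GetCurrentThread();
    NtQueueApcThread(hThread, (PVOID)&ApcTest, NULL, NULL, NULL);
}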

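When the APC runs: once the thread becomes alertable.

Kernel-mode APC

A kernel-mode APC can only be queued from kernel mode, so we need a driver to do it for us. The flow looks roughly like this: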

// First, obtain the target thread object from its handle
st = ObReferenceObjectByHandle (ThreadHandle,
THREAD_SET_CONTEXT,
PsThreadType,
Mode,
&Thread,
NULL);
if (NT_SUCCESS (st)) {
st = STATUS_SUCCESS;
if (IS_SYSTEM_THREAD (Thread)) {
st = STATUS_INVALID_HANDLE;
} else {
// Allocate nonpaged pool for the APC object
Apc = ExAllocatePoolWithQuotaTag (NonPagedPool | POOL_QUOTA_FAIL_INSTEAD_OF_RAISE,
sizeof(*Apc),
'pasP');

if (Apc == NULL) {
st = STATUS_NO_MEMORY;
} else {
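// Initialize the APC object for the target thread: PspQueueApcSpecialApc is the KernelRoutine (runs in kernel mode), ApcRoutine is the user-mode NormalRoutine, and ApcArgument1 becomes the NormalContext (for example, the real user-mode function to call)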
KeInitializeApc (Apc,
&Thread->Tcb,
OriginalApcEnvironment,
PspQueueApcSpecialApc,
NULL,
(PKNORMAL_ROUTINE)ApcRoutine,
UserMode,
ApcArgument1);
// Then insert the APC into the thread's APC queue
if (!KeInsertQueueApc (Apc, ApcArgument2, ApcArgument3, 0)) {
ExFreePool (Apc);
st = STATUS_UNSUCCESSFUL;
}
}
}
ObDereferenceObject (Thread);
}

After that, just wait for the thread to trap into the kernel through a system call, an interrupt, an exception, or the like.

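Afterword

I started this article before graduating and kept writing it well after. The deadline slid from "finish before graduation" to "finish before September" to "finish before Chinese New Year", and I finally wrapped it up before March, which was no small feat. Along the way I consulted all sorts of books and code (for some reason the WRK I was reading did not quite match 《Windows 内核情景分析》, so I had to grit my teeth and follow my own reading of the code). This was my first serious attempt at studying the kernel from the source side, and I gave up many times. After working for half a year I had picked up some understanding of the Windows kernel and gradually noticed that several things tied into this APC call path, so I picked the article up again, and this time I finally got through it.
At my current level the analysis is probably incomplete. Once I know the kernel better I expect to come back and revise it (hopefully without flaking again?).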

References

Windows内核情景分析 (Windows Kernel Scenario Analysis), section 5.8: the Windows APC mechanism
http://www.weixianmanbu.com/article/33.html