This blog describes our attempts, as part of the Payatu Research Team, to fuzz the Windows Kernel and find vulnerabilities in it.
We start by describing why we needed a new fuzzer, the considerations around reusing existing open-source code, the process of modifying that code to suit our needs, and the complexity of the overall setup once the modifications were complete.
Overall, we intend the series to be an illustration of the work done.
Necessity is the mother of invention, and we were in need of a fuzzer. Specifically, we wanted to fuzz the Windows Kernel.
We already had a proprietary general-purpose fuzzer, so the initial thought was to use that framework to target Windows. On further investigation, however, we ran into issues using it to target the Windows Kernel.
The existing fuzzer was designed with user-mode applications in mind and as such was not equipped to handle all the complexities that come with fuzzing any operating system kernel.
When fuzzing, a piece of software is needed to monitor the target for crashes. Monitoring an OS kernel differs from monitoring a user-mode application. For a kernel, the monitor needs to sit completely outside the kernel's environment, whereas for a user-mode app the monitor can run alongside the target in the same environment.
Also, the fuzzer needs to be able to restart the target after a crash. This was comparatively easier for user-mode apps in the existing framework. For kernels, on the other hand, it would have required some major changes to the framework.
So, the choice was between modifying our existing general-purpose fuzzing framework and deploying a completely different fuzzer that targets only the Windows Kernel.
It seemed simpler to deploy a separate Windows Kernel fuzzer. Besides, from a strategy standpoint, it could be better to have separate fuzzing engines with their own data generators and monitors. This would reduce our reliance on a single engine.
The Choice: To or Not to Re-use
Having decided on a separate fuzzing engine for the Windows Kernel, we had two options: re-use an existing, well-regarded open-source fuzzer, or build another engine from scratch.
We decided to re-use an open-source fuzzer, because writing one from scratch seemed a waste of resources. At the time, one such engine, Syzkaller, was well regarded, and researchers at Check Point had managed to deploy it for fuzzing the Windows Kernel. While they had not released their modified code, they had published Bugs on the Windshield, a write-up detailing their efforts to modify the original Syzkaller.
It became my task to deploy our own version of the modified Syzkaller, following the Bugs on the Windshield guide.
The modified Syzkaller combined parts of two distinct fuzzers – the original Syzkaller and kAFL.
The parts taken from kAFL were used mainly to add Windows Kernel coverage capabilities to the final modified version.
We could have used kAFL directly without any major modifications. There is a caveat though – kAFL is not structure-aware and, as such, can only target limited parts of a kernel, such as file system drivers.
There were four important parts of the modified Syzkaller –
1. The Linux Host Kernel (KVM) –
* VMX operations (hypercalls) for kernel crash monitoring, driver information etc.
* Intel Processor Trace for coverage-related information.
2. Qemu Virtual Machines
3. Windows Program Executor and Logger
4. Windows Crash Symbolizer
The Linux Host Kernel (KVM)
The modification to the Linux kernel was straightforward: I simply had to apply the KVM patches provided in the kAFL GitHub repository.
The challenge, however, was that these KVM patches target Linux kernel v4.6.2. By the time we were building our modified version, the current Linux kernel had moved well beyond that release.
The choice, then, was either to port the changes in these patches to a much newer kernel or to install the older v4.6.2 kernel. Both approaches posed their own obstacles. Porting the changes required a deeper understanding of the Linux kernel code than I had. Installing the older kernel on a newer Linux distribution, on the other hand, meant that some features might not work and some hardware might not be available.
Considering these challenges, I opted to build the older kernel for the newer Linux (Ubuntu) distro. I had some trouble figuring out the exact steps needed to build and deploy the kernel, as there are many articles describing different methods for building the Linux kernel. After much trial and error, I finally found a method that worked for our configuration.
As anticipated, using the older kernel version did cost us some features, the most notable being the Ethernet port. The Ethernet device no longer worked, and since we wanted the fuzzing setup to be reachable over our internal network, I had to find another way to connect the machine. On checking which devices were available, I found that the machine still recognized its USB ports and USB devices. I therefore gave the machine a network connection through a USB-to-Ethernet dongle.
These modifications also rely on Intel Processor Trace (IPT) technology to gather coverage information. As the Windows kernel is closed source, it is difficult to add instrumentation to monitor coverage. IPT helps compute coverage information by providing a trace of the addresses/instructions that the Intel CPU executes. A bitmap of addresses hit/not hit is then built and used by the program generator to generate programs that increase coverage.
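The bitmap idea can be illustrated with a minimal sketch. It assumes the processor trace has already been decoded into a list of executed addresses. The names `BITMAP_SIZE`, `edge_index` and `update_bitmap`, as well as the AFL-style hash, are illustrative assumptions and not the actual kAFL or Syzkaller code.

```python
BITMAP_SIZE = 1 << 16  # assumed bitmap size, chosen only for illustration


def edge_index(prev_addr: int, cur_addr: int) -> int:
    """Map a (previous, current) address pair to a bitmap slot (AFL-style hash)."""
    return ((prev_addr >> 1) ^ cur_addr) % BITMAP_SIZE


def update_bitmap(bitmap: bytearray, trace: list[int]) -> int:
    """Mark every edge seen in `trace` and return how many slots were newly hit."""
    new_hits = 0
    prev = 0
    for addr in trace:
        idx = edge_index(prev, addr)
        if bitmap[idx] == 0:
            bitmap[idx] = 1
            new_hits += 1
        prev = addr
    return new_hits


if __name__ == "__main__":
    bitmap = bytearray(BITMAP_SIZE)
    # Hypothetical decoded addresses from one program execution.
    trace = [0x1000, 0x1010, 0x1020]
    print(update_bitmap(bitmap, trace))  # new slots hit by the first run
    print(update_bitmap(bitmap, trace))  # replaying the same trace adds nothing
```

A program generator could keep a program in its corpus only when `update_bitmap` returns a non-zero count, since that means the program reached code not seen before.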
As IPT is supported only by specific Intel CPUs, this port of Syzkaller could only be run on bare-metal systems with the necessary Intel hardware.
Qemu Virtual Machines
The Syzkaller modification done by the Check Point researchers combined parts of kAFL with the original Syzkaller. These parts include the use of Qemu virtualization.
I decided to keep using Qemu in our port, mainly because the modifications were already available in the kAFL GitHub repo in the form of patches. Qemu also gives a lot more control over virtualization than other virtualization software such as VirtualBox and VMware. Chief among these benefits is the ability to make custom hypercalls (VMX operations) from inside the guest OS. These hypercalls can then be monitored and handled by Qemu itself.
As much as hypercalls helped, getting the Qemu machines’ networking to work was a headache. Firstly, the default ‘user’ backend used by Qemu does not allow communication from host to guest or between distinct guests. For example, ssh or ping from the host into a Qemu guest does not work when using the ‘user’ network backend. This was a serious obstacle, as Syzkaller relies on ssh to transfer binaries and data to and from the guest machine.
To solve this, I used the host port forwarding available in Qemu, which connects a port on the host to a port on the Qemu guest. As a result, I could configure our port of Syzkaller to ssh into localhost on the host port that was forwarded to guest port 22.
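As a rough illustration of this workaround, the sketch below builds the Qemu arguments for a ‘user’ backend with one forwarded port, plus the matching ssh invocation. The host port 10022, the guest user name `fuzz`, the `e1000` device model and the helper names are assumptions made for the example. Only the `-netdev user,...,hostfwd=` option itself is standard Qemu syntax.

```python
def qemu_user_net_args(host_port: int, guest_port: int = 22) -> list[str]:
    """Build Qemu arguments for a 'user' backend with one forwarded TCP port."""
    # hostfwd=tcp:<hostaddr>:<hostport>-<guestaddr>:<guestport>
    forward = f"hostfwd=tcp:127.0.0.1:{host_port}-:{guest_port}"
    return [
        "-netdev", f"user,id=net0,{forward}",
        "-device", "e1000,netdev=net0",  # assumed NIC model
    ]


def ssh_command(host_port: int, user: str = "fuzz") -> list[str]:
    """Build the ssh invocation that reaches the guest via the forwarded port."""
    return ["ssh", "-p", str(host_port), f"{user}@127.0.0.1"]


if __name__ == "__main__":
    # Hypothetical host port; each guest would need its own distinct port.
    print(" ".join(qemu_user_net_args(10022)))
    print(" ".join(ssh_command(10022)))
```

With this scheme every guest needs its own host port, and the guests still cannot reach one another directly.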
As will be seen later, I had to abandon this workaround in order to use the symbolizer. In short, network communication between multiple Qemu guests was needed, and the host port forwarding approach seemed like unnecessary overhead. I then switched to TUN/TAP networking, where port forwarding is not required; instead, one virtual TAP interface is created on the host for each guest. In this networking mode, host-to-guest and guest-to-guest communication worked without any additional overhead.
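A minimal sketch of the per-guest TAP arguments could look like the following. It assumes the TAP interfaces (`tap0`, `tap1`, ...) already exist on the host and are attached to a common bridge. The interface naming, the MAC address scheme, the `e1000` device model and the helper name are assumptions for illustration, not the configuration actually used.

```python
def qemu_tap_net_args(guest_index: int) -> list[str]:
    """Build Qemu arguments that attach one guest to its own host TAP interface."""
    if not 0 <= guest_index <= 0xFF:
        raise ValueError("guest_index must fit in one byte of the MAC address")
    ifname = f"tap{guest_index}"  # assumed naming: one TAP device per guest
    # Each guest needs a unique MAC; 52:54:00 is the prefix conventionally used by Qemu.
    mac = f"52:54:00:00:00:{guest_index:02x}"
    return [
        # script=no/downscript=no: the host interface is assumed to be set up already.
        "-netdev", f"tap,id=net0,ifname={ifname},script=no,downscript=no",
        "-device", f"e1000,netdev=net0,mac={mac}",
    ]


if __name__ == "__main__":
    # Hypothetical setup: guest 0 is the fuzzing target, guest 1 the Debug Server.
    for index in (0, 1):
        print(" ".join(qemu_tap_net_args(index)))
```

Because every guest gets its own interface on the same bridge, the guests can address each other and the host by IP, with no per-port forwarding rules.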
Windows Program Executor and Logger
While Syzkaller can execute the generated programs on most target OSes, this was not true for Windows, chiefly because the original executor uses some APIs that are local to UNIX/Linux-based OSes and, as such, are not available on Windows.
To solve this, I modified most of the executor code to run on Windows and wrote wrappers for UNIX/Linux APIs such as mmap.
Any fuzzing engine needs the ability to log the generated data and the computed output. In our port of Syzkaller, the original Syzkaller logger was not properly logging the executed program. Therefore, I configured the executor itself to log the generated program. In the event of a crash, this log is automatically extracted using a simple scp command.
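The extraction step might be sketched as follows. The guest address, the user name `fuzz`, the log file name `executor.log` and the helper names are all hypothetical, and the way a crash is detected is left out. The sketch only shows an scp command being built and run when a crash has been reported.

```python
import subprocess


def scp_fetch_command(guest_ip: str, remote_path: str, local_dir: str,
                      user: str = "fuzz", port: int = 22) -> list[str]:
    """Build the scp command that copies the executor log out of the guest."""
    return ["scp", "-P", str(port), f"{user}@{guest_ip}:{remote_path}", local_dir]


def fetch_log_on_crash(crashed: bool, command: list[str], run=subprocess.run) -> bool:
    """Run `command` only if a crash was reported; return True when the copy succeeded."""
    if not crashed:
        return False
    result = run(command, check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical guest address and log location, for illustration only.
    cmd = scp_fetch_command("192.168.100.2", "executor.log", "crashes/")
    print(" ".join(cmd))
```

Passing the runner in as a parameter keeps the sketch testable without a live guest. In a real setup the default `subprocess.run` would be used.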
Windows Crash Symbolizer
Crashes in certain applications can be symbolized with the LLVM symbolizer, but there was no way to directly symbolize a crash found by our port of Syzkaller.
One approach was to use another Windows Qemu guest, the Debug Server, to run a remote kernel debugger and obtain a symbolized crash log from it. This approach failed, as the remote WinDbg kernel debugging session randomly got stuck and could not retrieve the crash details from the target.
To get around this, I reverse engineered the code in WinDbg's DBGHELP.DLL and DBGENG.DLL. Using the knowledge obtained, I created a simple, lightweight Windows kernel debugger binary which, when executed on the Debug Server, was able to create symbolized crash logs without randomly getting stuck.
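To make concrete what symbolizing a crash address involves, here is a minimal sketch of the address-to-name step only. It assumes the module base and a sorted symbol table are already known. The module name `demo`, the symbol names and the function `symbolize` are hypothetical, and this is not how the debugger binary described above is implemented.

```python
import bisect


def symbolize(addr: int, module_name: str, module_base: int,
              symbols: list[tuple[int, str]]) -> str:
    """Resolve `addr` to 'module!symbol+offset' using a table of (rva, name) pairs.

    `symbols` must be sorted by RVA (offset from the module base).
    """
    rva = addr - module_base
    rvas = [sym_rva for sym_rva, _ in symbols]
    # Find the last symbol that starts at or before the address.
    i = bisect.bisect_right(rvas, rva) - 1
    if rva < 0 or i < 0:
        return f"{addr:#x}"  # address lies outside the known symbols
    sym_rva, name = symbols[i]
    return f"{module_name}!{name}+{rva - sym_rva:#x}"


if __name__ == "__main__":
    # Hypothetical module layout, for illustration only.
    base = 0xFFFFF80000000000
    table = [(0x1000, "FuncA"), (0x1400, "FuncB")]
    print(symbolize(base + 0x1410, "demo", base, table))  # demo!FuncB+0x10
```

Applying this lookup to every return address in a crash stack yields the kind of readable trace that a symbolized crash log contains.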
Complexity, Lessons Learnt
The final Syzkaller port, though it works, is an extraordinarily complex engine that still needs more work. The result of such a complex framework is that the bulk of the effort goes into development rather than into maintaining and monitoring the data generation and fuzzing runs. In a test run, we were able to catch a well-known Windows 10 bug, CVE-2021-1732, but it took too long because of all the complexity involved.
From a management perspective, this is an interesting turn of events. A hefty part of the total effort went into modifying and deploying our port of Syzkaller, yet the result was found to be complex and slow. The choices available to us were to continue using this port or to abandon it entirely and work on something else.
We opted to continue using this Syzkaller port and, as an alternative, also developed a far simpler Windows Kernel (WINK) fuzzer. To reduce complexity and effort, WINK was designed to use Python and to be independent of virtualization environments. WINK is by no means complete, although it was able to reproduce known bugs in a shorter period.
We, the Payatu Research Team, felt a need to diversify into Windows Kernel fuzzing. While we do have a proprietary general-purpose fuzzing engine, it works better for user-mode applications. Therefore, to obtain a Windows Kernel fuzzing engine, we created a port of the well-known Syzkaller. During this process, we encountered several obstacles and challenges.
We overcame these challenges with some creativity and finally have a working Syzkaller port that can fuzz the Windows Kernel. Although it works, this port is highly complex. We also have a less complex WINK fuzzer, which runs on Python and is independent of virtualization environments.