Monday, 7 March 2016

How MAC Address Randomization Works on Windows 10

When Apple announced its devices would use random MAC addresses when searching for Wi-Fi networks, it received extensive media attention. And rightly so. It prevents companies from tracking your movements, and Apple was the first major player to start doing this. Windows and Android are quietly trying to catch up. As a result, some devices running Windows now support MAC address randomization, and we will discuss how it's implemented, and where it fails. This information is a small selection from the recent paper Why MAC Address Randomization is not Enough: An Analysis of Wi-Fi Network Discovery Mechanisms.

Update: we have contacted Microsoft, and they are in the process of addressing the vulnerabilities we discovered.

How it works

Microsoft first added support for MAC address randomization in Windows 10. Unfortunately, it's only available if you have a WiFi card and driver that support it. For example, the Intel 7265 AC, when using the latest driver, supports randomization [1]. You can see if your hardware supports MAC address randomization by going through the following menus:


If your hardware supports MAC address randomization, you will see the following option at the top of the window:



As you can see, I have it enabled on my laptop. So far it's been working quite well. What's very interesting about Microsoft's approach is that it also uses random MAC addresses when connecting to a wireless networks. In contrast, Apple only uses random addresses when searching for nearby networks, and it falls back to its original address when connecting to a network. In this aspect Windows 10 offers better privacy than Apple.

Using a random MAC address to connect to a network can cause problems if users are authenticated (i.e., recognized) based on their MAC address [2]. Interestingly, Windows avoids this issue by always using the same random address every time it connect to a specific network. For example, let's say you want to pay for Wi-Fi access, and they authenticate you based on your MAC address. Then this is not a problem. The first time you connect, Windows will generate a random MAC address. And if you reconnect to this network at a later point in time, Windows will reuse the previously generated address. Therefore the system can still recognize you, and you don't have to pay twice. There's one downside to this approach: since you always use the same address when connecting to a particular network, an adversary can learn when certain devices connect to specific networks. Nevertheless, compared to the old situation where you'd always use the original MAC address, it improves your privacy.

Technically, the random MAC address that is used to connect to a network is calculated as [5]:
address = SHA-256(SSID, real MAC address, connectionId, secret)[:6]
Here SSID is the name of the network you are connecting to, real MAC address the original address of your network interface, and connectionId is a value that changes every time the user removes (and re-adds) the network (i.e., this value is updated if you "forget" the network under Windows 10). The secret parameter is a 256-bit cryptographic random number, generated during system initialization, and kept the same across reboots. Every interface has a different value of the secret parameter, to assure each interface gets different random MAC address. Finally, bits in the most significant byte of address are set so it becomes a locally administered, unicast address. While the presentation by Huitema partly described this process, our paper is the first to describe this formula in full detail.

It's also possible to disable randomization for certain networks. In this case Windows will use the original address when connecting to a network. You can configure this through the following settings when you are currently connected to the network:

Notice that the user has three options for each specific network:
  • On: the same random MAC address is always used when connecting to this network.
  • Off: the original MAC address is used.
  • Change daily: every day a new random MAC address is used.
Remark that if randomization is enabled, independent of the above options, Windows 10 will always use random MAC addresses when scanning for nearby networks. This "scanning" address changes every time you connect (and disconnect) from a network, and when you restart your device [3]. Hence it doesn't change that frequently, but it's still sufficient to prevent tracking over extended periods of time. In contrast, Apple changes the scanning address roughly every few minutes, which provides more privacy.

Basic Security Analysis

Randomization as implemented in Windows 10 significantly improves your privacy. So enable it! Unfortunately, it's not perfect, because there are still some ways to defeat or bypass it.

The first weakness is that the sequence number contained in WiFi frames is not reset when changing the (random) MAC address. This sequence number, which is present in most Wi-Fi frames, is used to detected retransmissions, and is incremented by one after successfully transmitting a frame. As shown in the picture below, when the MAC address changes because the user connects to a network, the sequence counter is not reset:


The last frame from ea:69:0a:* has the sequence number of 92, and the other address 7c:5c:f8:* has the sequence number 94. Based on this an adversary can reasonably conclude that both frames are sent by the same device. In other words, he learns that the same device was using both addresses, defeating the purpose of address randomization.

The second problem is that Windows 10 reveals its real MAC address when interacting with Hotspot 2.0 networks. But what's Hotspot 2.0? Simply put, Hotspot 2.0 is a new standard to automatically and securely roam between WiFi networks. No manual interaction is needed. Your device automatically determines whether you have the appropriate credentials (passwords) to connect to a network. Think of this like the cellular network: when you get off the plane, your phone automatically finds and connects to a foreign cellular network. Hotspot 2.0 provides a similar experience for WiFi.

In order to accomplish automatic roaming, Hotspot 2.0 sends ANQP queries to the Access Point before connecting to it. These ANQP queries request detailed information about the wireless network. This information includes the credentials that are needed to connect with the hotspot, whether the hotspot provides internet access or only local network access, etc. Unfortunately, Windows 10 sends these ANQP queries using the real (original) MAC address:


In the first probe request it uses the random MAC address 2a:b3:e6:*. These probe requests are used to detect the presence of networks. If there's a Hotspot 2.0 network nearby, Windows will send ANQP requests using the real MAC address, in this case 7c:5c:f8:*. Therefore an attacker can obtain your real MAC address by advertising a Hotspot 2.0 network. Thankfully, Windows 10 only sends ANQP queries if at least one Hotspot 2.0 is configured. Since this standard is not yet widely deployed, few users will have such a network configured [4].

Detailed Security Analysis

Want to know all flaws that are present in existing implementations of MAC address randomization? And this specifically for Android, Apple, Linux, and Windows? Then read my paper Why MAC Address Randomization is not Enough: An Analysis of Wi-Fi Network Discovery Mechanisms [5]! It has everything explained in technical detail.


References and Footnotes

[1] If you have an Intel 7260 AC, you can also force Windows 10 to use the drivers of the Intel 7265 AC. Your device will still work, and will support MAC address randomization.
[2] Even though authentication based on the MAC address is utterly insecure (an adversary can easily spoof a MAC address), it's still used by many systems.
[3] C. Huitema. Personal communication, Nov. 2015.
[4] One notable exception is the Passpoint configuration provided by Boingo. Essentially Passpoint is a synonym of Hotspot 2.0. If you have this configuration installed, you have a Hotspot 2.0 capable device, and the Boingo configuration will use Hotspot 2.0. This means Windows will send ANQP queries to nearby Hotspot 2.0 networks.
[5] M. Vanhoef, C. Matte, M. Cunche, L. S. Cardoso, and F. Piessens. Why MAC Address Randomization is not Enough: An Analysis of Wi-Fi Network Discovery Mechanisms (AsiaCCS 2016).

Thursday, 15 October 2015

Advanced WiFi Attacks Using Commodity Hardware

The WiFi protocol assumes all clients behaves fairly. This means a station will give others a chance to transmit packets, before it starts transmitting itself. It's known that with a software defined radio such as a USRP, a user can implement the WiFi protocol themselves and not follow these rules.

 A desktop computer, antenna, two USRPs on top of each other, and a signal analyzer.

An attacker can also use this equipment to create a constant jammer, which continuously transmits noise, and makes the channel completely unusable. In principle an attacker could also turn on a badly shielded microwave to jam the channel :-) However, that doesn't give the attacker control over which frequency is jammed, bandwidth of the emitted noise, nor the emitted noise pattern.

It's even possible to implement a more sophisticated selective jammer. Such a jammer is capable of only jamming specific packets (e.g. only packets sent by a certain device). While it's already known this is possible using expensive hardware such as USRPs, we found that even cheap WiFi dongles can be used to implement all these attacks. It's especially surprising that a selective jammer can be implemented on a cheap WiFi dongle, since it must be fast enough to detect and subsequently jam a packet.

Our attacks were first published at ACSAC 2014 and subsequently demonstrated at BruCON 2015. Additionally our code is available on github! You can also view my presentation at BruCON online.

Selfish Behavior

When a station wants to transmit a packet, but the channel is already in use by another device, it first waits until this transmission is done. Then it waits a random period (called the backoff period), and if no one else starts transmitting during this period, the station may send the packet. We found that it's easy to disable the random backoff period on commodity devices, meaning it will instantly transmit packets. More interestingly, we found that even the original driver and firmware for our device incorrectly calculated the backoff period, giving itself an unfair advantage over other stations! It turns out that many devices do this to gain an unfair advantage:
Over six Wi-Fi cards, neither one performs as expected. In some cases, implementation issues seem to affect the proper card operation. In other cases, manufacturers rely on backoff parameters different from the standard specification, this perhaps being done on purpose to provide an unfair advantage. [Minimized conclusion from this paper]
This raises the question what would happen if there are two stations that both behave selfishly by disabling the backoff period. In other words, what happens if two selfish stations instantly transmit all packets they have queued? You may think that the packets of both selfish stations will collide, and as a result both are lost in the collision. It turns out this is not the case! Due to the capture effect, the packet having the highest signal quality and lowest bitrate will get decoded properly. You can compare this to receiving two radio stations on the same frequency, where generally one station will "win the collision" (see this demo video). This means selfish stations will abuse the capture effect, and reduce their bitrate, in order to win the collision (and have their packet decoded correctly by the receiver). Surprisingly, we now get that selfish clients wanting to maximize their throughput, will reduce their bitrate!

One of our WiFi dongles attached to a Raspberry Pi

There is a research prototype called DOMINO which can detect and punish this type of selfish behavior. Unfortunately we discovered a critical flaw in this system. The problem is that this system bases some of its decisions on unauthenticated packets. As an attacker we can abuse this by forging packets which appear to be sent by a different station. Hence we can make DOMINO believe this client is acting selfishly, meaning it will be punished and thrown off the network. Moral of the story: only base your decisions on authenticated or hard-to-forge data.

Constant Jammer

Our WiFi dongles can also be used to implement a constant jammer. The idea is straightforward: make the radio instantly transmit packets (even if someone else is transmitting), and then send an infinite amount of packets without interruption. The second part is tedious, since we can't queue many packets due to the limited amount of memory available in the WiFi dongle. However, we can simulate an infinite amount of packets. The packets queued for transmission are stored in a linked list. By turning this queue into a circular list we can simulate an infinite amount of packets.


What's interesting here is that in principle the jammer is constantly transmitting valid WiFi packets. However, because they are sent so fast after one another, other WiFi devices are unable to detect these packets. In other words, other WiFi devices operating in monitor mode only see noise, and will not detect/show and show any WiFi traffic.

Selective Jammer

Arguably the most impressive result is that our cheap dongle can be used to implement a selective (also called reactive) jammer. Such a jammer decodes the header of a packet still in the air, and based on information in the header, decides whether to jam the remaining content or not. This is not an easy feat to accomplish. The selective jammer must be fast enough to decode the header, make the decision to jam the packet, put the antenna in transmit mode, and finally jam the packet. All this needs to be done in just a few microseconds (an average WiFi packet takes ~200 microseconds to transmit)!

Jamming the end of the packet is easy, simply inject a packet like we did for the continuous jammer. But there is no support or API to be notified when a packet is in the progress of being received. How do we get around this? The important realization is that there are two chips inside our WiFi device. The first one is the radio which processes the incoming physical signal, and uses Direct Memory Access (DMA) to write the packet to memory. The second chip is the main CPU which is responsible for communicating with the host over USB and controlling the radio chip. Hence we can use the main CPU to constantly monitor the memory where the packet will be saved. Once we detect that the radio chip is writing bytes to this memory location, we know a frame is being received:

When the memory is modified, we know a frame is being received.

With this clever trick we can detect when a frame is being received. Our jammer then reads the MAC address(es) in the header, and compares it to the MAC address of the station we are targeting. If they match, the remaining content of the packet is jammed. This will cause the CRC (called the ICV in WiFi) of the packet to be wrong, meaning the packet will be dropped by the receiver.



The code of the reactive jammer is public, feel free to play around with it (against a test network).

Channel-Based Man-in-the-Middle Attack

As an example, we show how these low-layer attacks can be used to reliably manipulate encrypted traffic. Note that our goal is not to decrypt traffic. Instead, we want to be able to reliably drop, modify, and inject packets. This ability is typically required in order to launch actual (cryptographic) attacks against an higher-layer protocol. Previously when targeting wireless traffic, it was not clear how to do this: there were no known methods to obtain a (reliable) man-in-the-middle position between a client and access point if encryption is used.

In order to intercept (not decrypt) all traffic of an encrypted wireless network, we cannot simply create a rogue Access Point (AP) with a different MAC address. When a client connects to the access point, not only are the credentials verified (e.g. shared password), but also the MAC addresses of the client and access point. Hence the client and AP will detect that an attacker was forwarding packets under different MAC addresses. Setting up a rouge AP with the same mac address as the real AP is futile: the client will simply directly communicate with the real AP (unless they are out of range of each other).

Our solution is to clone the access point on a different channel, but with the same MAC address as the real access point. We forward all frames to the real AP. In other words, we forward packets between both channels. Using the constant jammer we force clients to switch to the channel where our rogue AP is located. Since we did not modify the MAC address of the AP, and also didn't modify the MAC address of the client, the client will successfully connect to the (rogue) AP. We now have a (channel-based) man-in-the-middle position, allowing reliable manipulation of encrypted traffic.

Demonstration of the Channel-Based MiTM attack against WPA-TKIP

The channel-based man-in-the-middle attack can be used to break WPA-TKIP (people commonly, but incorrectly, refer to TKIP as WPA, and EAS-CCMP as WPA2). The TKIP protocol was supposed to be a temporarily replacement of WEP, designed to run on old hardware. Unfortunately, for backwards compatibility, many networks still support both TKIP and the newer EAS-CCMP. If both these protocols are supported, the older TKIP protocol is used to encrypt all broadcast packets. Without going into detail, the channel-based man-in-the-middle allows us to apply existing TKIP attacks to broadcast packets. Hence you should configure your network so only (EAS-)CCMP is enabled!

Wednesday, 23 September 2015

Solving CSAW challenge Wyvern (rev-500) using afl-fuzz

We're given a Linux binary which asks for input:

 +-----------------------+
 |    Welcome Hero       |
 +-----------------------+

 [!] Quest: there is a dragon prowling the domain.
     brute strength and magic is our only hope. Test your skill.

 Enter the dragon's secret: xxxx_find_secret

 [-] You have failed. The dragon's power, speed and intelligence

     was greater.

Opening the binary in IDA shows that the main() function prints the above output. The input is processed by start_quest(std::string). If this function returns 0x1337 the level is solved.

Looking at the start_quest function, it first pushes 28 integers in a vector, does some "weird" if-tests, and finally calls sanitize_input. Interestingly, this function again contains a lot of similar and weird if-tests of the form:

    if ( y4 < 10 || (((_BYTE)x3 - 1) * (_BYTE)x3 & 1) == 0 )
        break;

It appears this is some kind of obfuscation. At this point I guessed (and hoped) that the input would be checked character by character. This means that if we guess a character correctly, the program will execute a (new) different path in the program. This is ideal for afl-fuzz, a fuzzer that detects "interesting fuzzing candidates" based on which path a program executes.

Compiling afl-fuzz with afl-qemu-trace support allows us to fuzz binaries without access to source code:

    $ mkdir indir
    $ echo "test" > indir/test1.bin
    $ afl-fuzz -Q -i indir -o sync_dir -M fuzzer01 ./wyvern
    $ afl-fuzz -Q -i indir -o sync_dir -S fuzzer02 ./wyvern
    ...
    $ afl-fuzz -Q -i indir -o sync_dir -S fuzzer07 ./wyvern  

While the fuzzer will not find crashes, it does find inputs that trigger new execution paths in the program! It took roughly half an hour of fuzzing on 7 cores to get some interesting files:

$ xxd id\:000006\,src\:000005\,op\:flip1\,pos\:4
00: 6472 3467 3034 3434 3434 3434 34cd 617c  dr4g044444444.a|
10: 6166 e803 666c 167c 6170 6116 47         af..fl.|apa.G

All generated files are 28 bytes long. That's the number of integers pushed into the vector! And it seems it must start with dr4g0n. The integers being put into the vector are:

    array = [0x64, 0xD6, 0x10A, 0x171, 0x1A1, 0x20F, 0x26E,
       0x2DD, 0x34F, 0x3AE, 0x41E, 0x452, 0x4C6, 0x538,
       0x5A1, 0x604, 0x635, 0x696, 0x704, 0x763, 0x7CC,
       0x840, 0x875, 0x8D4, 0x920, 0x96C, 0x9C2, 0xA0F]

What's interesting is that 0x64 is the ascii value of "d". Does this vector somehow encode the flag? We know the second letter is "r" with value 0x72. How can we get this value? By subtracting the first integer, 0x64, from the subsequent one: 0xD6 - 0x64 = 0x72! The solution is:

    flag = ""
    base = 0
    for num in array:
        flag += chr(num - base)
        base = num

    print flag


Flag: dr4g0n_or_p4tric1an_it5_LLVM

In hindsight it would've also been possible to use something like Intel PIN to count the number of executed instructions. The advantage of afl-fuzz is that it requires zero configuration: just give it some random start value and it works.