# The Rule Of 2

When you write code to parse, evaluate, or otherwise handle untrustworthy inputs
from the Internet (which is almost everything we do in a web browser!), we like
to follow a simple rule to make sure it's safe enough to do so. The Rule Of 2
is: Pick no more than 2 of

* untrustworthy inputs;
* unsafe implementation language; and
* high privilege.



(drawing source
[here](https://docs.google.com/drawings/d/12WoPI7-E5NAINHUZqEPGn38aZBYBxq20BgVBjZIvgCQ/edit?usp=sharing))

## Why?

When code that handles untrustworthy inputs at high privilege has bugs, the
resulting vulnerabilities are typically of Critical or High severity. (See our
[Severity Guidelines](severity-guidelines.md).) We'd love to reduce the severity
of such bugs by reducing the amount of damage they can do (lowering their
privilege), avoiding the various types of memory corruption bugs (using a safe
language), or reducing the likelihood that the input is malicious (asserting the
trustworthiness of the source).

For the purposes of this document, our main concern is reducing (and hopefully,
ultimately eliminating) bugs that arise due to _memory unsafety_. [A 2019
study by Matt Miller from Microsoft
Security](https://github.com/Microsoft/MSRC-Security-Research/blob/master/presentations/2019_02_BlueHatIL/2019_01%20-%20BlueHatIL%20-%20Trends%2C%20challenge%2C%20and%20shifts%20in%20software%20vulnerability%20mitigation.pdf)
states that "~70% of the vulnerabilities addressed through a security update
each year continue to be memory safety issues". A trip through Chromium's bug
tracker will show many, many vulnerabilities whose root cause is memory
unsafety. (As of March 2019, only about 5 of 130 [public Critical-severity
bugs](https://bugs.chromium.org/p/chromium/issues/list?can=1&q=Type%3DBug-Security+Security_Severity%3DCritical+-status%3AWontFix+-status%3ADuplicate&sort=&groupby=&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&mode=&cells=ids&num=)
are not obviously due to memory corruption.)

Security engineers in general, very much including Chrome Security Team, would
like to advance the state of engineering to where memory safety issues are much
rarer. Then, we could focus more attention on the application-semantic
vulnerabilities. 😊 That would be a big improvement.

## What?

Some definitions are in order.

### Untrustworthy Inputs

_Untrustworthy inputs_ are inputs that

* have non-trivial grammars; and/or
* come from untrustworthy sources.

If there were an input type so simple that it were straightforward to write a
memory-safe handler for it, we wouldn't need to worry much about where it came
from **for the purposes of memory safety**, because we'd be sure we could handle
it. We would still need to treat the input as untrustworthy after
parsing, of course.

Unfortunately, it is very rare to find a grammar trivial enough that we can
trust ourselves to parse it successfully or fail safely. (But see
[Normalization](#normalization) for a potential example.) Therefore, we do need
to concern ourselves with the provenance of such inputs.

Any arbitrary peer on the Internet is an untrustworthy source, unless we get
some evidence of its trustworthiness (which includes at least [a strong
assertion of the source's
identity](#verifying-the-trustworthiness-of-a-source)). When we can know with
certainty that an input is coming from the same source as the application itself
(e.g. Google in the case of Chrome, or Mozilla in the case of Firefox), and that
the transport is integrity-protected (such as with HTTPS), then it can be
acceptable to parse even complex inputs from that source. It's still ideal,
where feasible, to reduce our degree of trust in the source, such as by parsing
the input in a sandbox.

### Unsafe Implementation Languages

_Unsafe implementation languages_ are languages that lack [memory
safety](https://en.wikipedia.org/wiki/Memory_safety), including at least C, C++,
and assembly language. Memory-safe languages include Go, Rust, Python, Java,
JavaScript, Kotlin, and Swift. (Note that the safe subsets of these languages
are safe by design, but of course implementation quality is a different story.)

#### Unsafe Code in Safe Languages

Some memory-safe languages provide a backdoor to unsafety, such as the `unsafe`
keyword in Rust. This functions as a separate unsafe language subset inside the
memory-safe one.

The presence of unsafe code does not negate the memory-safety properties of the
memory-safe language around it as a whole, but _how_ unsafe code is used is
critical. Poor use of an unsafe language subset is not meaningfully different
from any other unsafe implementation language.

In order for a library with unsafe code to be safe for the purposes of the Rule
of 2, all unsafe usage must be reviewable and verifiable by humans using
simple local reasoning. To achieve this, we expect all unsafe usage to be:
* Small: The minimal possible amount of code to perform the required task.
* Encapsulated: All access to the unsafe code is through a safe API.
* Documented: All preconditions of an unsafe block (e.g. a call to an unsafe
  function) are spelled out in comments, along with explanations of how they are
  satisfied.

Because unsafe code reaches outside the normal expectations of a memory-safe
language, it must follow strict rules to avoid undefined behaviour and
memory-safety violations, and these are not always easy to verify. A careful
review by one or more experts in the unsafe language subset is required.

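As a minimal sketch of what small, encapsulated, and documented unsafe usage
can look like in Rust, consider the hypothetical helper below. The name
`read_u32_le` and its signature are assumptions made for illustration and are
not Chromium code. In practice the safe `u32::from_le_bytes` would make the
`unsafe` block unnecessary; the example only shows the shape a reviewer would
hope to see: one unsafe operation, reachable only through a safe function, with
its preconditions stated and discharged in a comment.

```rust
/// Reads a little-endian `u32` starting at `offset`, or returns `None` if the
/// read would run past the end of `buf`. Callers only ever see this safe API.
pub fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // Reject offsets whose end position overflows or exceeds the buffer.
    let end = offset.checked_add(4)?;
    if end > buf.len() {
        return None;
    }
    // SAFETY: `offset + 4 <= buf.len()` was checked above, so the four bytes
    // starting at `buf.as_ptr().add(offset)` lie inside the live allocation
    // borrowed by `buf`. `read_unaligned` places no alignment requirement on
    // the pointer, and every bit pattern is a valid `u32`.
    let raw = unsafe { std::ptr::read_unaligned(buf.as_ptr().add(offset) as *const u32) };
    // The bytes were copied in memory order; interpret them as little-endian.
    Some(u32::from_le(raw))
}
```
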
It should be safe to use any code in a memory-safe language in a high-privilege
context. As such, the requirements on a memory-safe language implementation are
higher: All code in a memory-safe language must be capable of satisfying the
Rule of 2 in a high-privilege context (including any unsafe code) in order to be
used or admitted anywhere in the project.

### High Privilege

_High privilege_ is a relative term. The very highest-privilege programs are the
computer's firmware, the bootloader, the kernel, any hypervisor or virtual
machine monitor, and so on. Below that are processes that run as an OS-level
account representing a person; this includes the Chrome Browser process and GPU
process. We consider such processes to have high privilege. (After all, they
can do anything the person can do, with any and all of the person's valuable
data and accounts.)

Processes with slightly reduced privilege will (hopefully soon) include the
network process. These are still pretty high-privilege processes. We are always
looking for ways to reduce their privilege without breaking them.

Low-privilege processes include sandboxed utility processes and renderer
processes with [Site Isolation](
https://www.chromium.org/Home/chromium-security/site-isolation) (very good) or
[origin isolation](
https://cloud.google.com/docs/chrome-enterprise/policies/?policy=IsolateOrigins)
(even better).

### Processing, Parsing, And Deserializing

Turning a stream of bytes into a structured object is hard to do correctly and
safely. For example, turning a stream of bytes into a sequence of Unicode code
points, and from there into an HTML DOM tree with all its elements, attributes,
and metadata, is very error-prone. The same is true of QUIC packets, video
frames, and so on.

Whenever the code branches on the byte values it's processing, the risk
increases that an attacker can influence control flow and exploit bugs in the
implementation.

Although we are all human and mistakes are always possible, a function that does
not branch on input values has a better chance of being free of vulnerabilities.
(Consider an arithmetic function, such as SHA-256, for example.)

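To make the distinction concrete, here is a small Rust sketch contrasting the
two kinds of function. Both functions and their names are hypothetical, chosen
only for illustration. `rotate_xor_checksum` is a toy (it is not SHA-256 and
has no cryptographic value) whose control flow depends only on the input's
length, never on its byte values. `read_length_prefixed` lets the first input
byte decide how much data to hand back, which is exactly the kind of
attacker-influenced decision that needs careful bounds handling.

```rust
/// Toy arithmetic function: every input of a given length executes the same
/// sequence of operations, whatever the byte values are.
pub fn rotate_xor_checksum(bytes: &[u8]) -> u32 {
    let mut acc: u32 = 0;
    for &b in bytes {
        acc = acc.rotate_left(5) ^ u32::from(b);
    }
    acc
}

/// Parser-style function: the first byte is an attacker-controlled length, so
/// the result depends on input values. Checked slicing makes a lying length
/// fail safely with `None` instead of reading out of bounds.
pub fn read_length_prefixed(buf: &[u8]) -> Option<&[u8]> {
    let (&len, rest) = buf.split_first()?;
    rest.get(..usize::from(len))
}
```
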
## Solutions To This Puzzle

Chrome Security Team will generally not approve landing a CL or new feature
that involves all 3 of untrustworthy inputs, unsafe language, and high
privilege. To solve this problem, you need to get rid of at least 1 of those 3
things. Here are some ways to do that.

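The rule itself is simple enough to write down as a predicate. The sketch
below is only an illustration of the counting logic; the `RiskProfile` type and
the `satisfies_rule_of_2` function are hypothetical names, not a tool or API
that exists in Chromium, and deciding whether each property actually holds for
a given piece of code still takes human judgment.

```rust
/// The three risk properties of a piece of code, as named by the rule.
#[derive(Clone, Copy, Debug)]
pub struct RiskProfile {
    pub untrustworthy_input: bool,
    pub unsafe_language: bool,
    pub high_privilege: bool,
}

/// Returns true when no more than 2 of the 3 properties are present.
pub fn satisfies_rule_of_2(profile: RiskProfile) -> bool {
    let present = [
        profile.untrustworthy_input,
        profile.unsafe_language,
        profile.high_privilege,
    ]
    .iter()
    .filter(|&&flag| flag)
    .count();
    present <= 2
}
```
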
### Safe Languages

Where possible, it's great to use a memory-safe language. The following
memory-safe languages are approved for use in Chromium:
* Java (on Android only)
* Swift (on iOS only)
* [Rust](../rust.md) (for [third-party use](
  ../adding_to_third_party.md#Rust))
* JavaScript or WebAssembly (although we don't currently use them in
  high-privilege processes like the browser/GPU process)

One can imagine Kotlin on Android, too, although it is not currently
used in Chromium.

For an example of image processing, we have the pure-Java class
[BaseGifImage](https://cs.chromium.org/chromium/src/third_party/gif_player/src/jp/tomorrowkey/android/gifplayer/BaseGifImage.java?rcl=27febd503d1bab047d73df26db83184fff8d6620&l=27).
On Android, where we can use Java and also face a particularly high cost for
creating new processes (necessary for sandboxing), using Java to decode tricky
formats can be a great approach. Before switching to a Rust-based parser, we
used a Java [JsonSanitizer](https://cs.chromium.org/chromium/src/services/data_decoder/public/cpp/android/java/src/org/chromium/services/data_decoder/JsonSanitizer.java)
to 'vet' incoming JSON in a memory-safe way before passing the input to the C++
JSON implementation.

On Android, many system APIs that are exposed via Java are not actually
implemented in a safe language, and are instead just facades around an unsafe
implementation. A canonical example of this is the
[BitmapFactory](https://developer.android.com/reference/android/graphics/BitmapFactory)
class, which is a Java wrapper [around C++
Skia](https://cs.android.com/android/platform/superproject/+/master:frameworks/base/libs/hwui/jni/BitmapFactory.cpp;l=586;drc=864d304156d1ef8985ee39c3c1858349b133b365).
These APIs are therefore not considered memory-safe under the rule.

The [QR code generator](
https://source.chromium.org/chromium/chromium/src/+/main:components/qr_code_generator/;l=1;drc=b185db5d502d4995627e09d62c6934590031a5f2)
is an example of a cross-platform memory-safe Rust library in use in Chromium.

### Privilege Reduction

Also known as [_sandboxing_](https://cs.chromium.org/chromium/src/sandbox/),
privilege reduction means running the code in a process that has had some or
many of its privileges revoked.

When appropriate, try to handle the inputs in a renderer process that is Site
Isolated to the same site as the inputs come from. Take care to validate the
parsed (processed) inputs in the browser, since only the browser can trust
itself to validate and act on the meaning of an object.

Equivalently, you can launch a sandboxed utility process to handle the data, and
return a well-formed response back to the caller in an IPC message. See [Safe
Browsing's ZIP
analyzer](https://cs.chromium.org/chromium/src/chrome/common/safe_browsing/zip_analyzer.h)
for an example. The [Data Decoder Service](https://source.chromium.org/chromium/chromium/src/+/main:services/data_decoder/public/cpp/data_decoder.h)
facilitates this safe decoding process for several common data formats.

### Verifying The Trustworthiness Of A Source

If you can be sure that the input comes from a trustworthy source, it can be OK
to parse/evaluate it at high privilege in an unsafe language. A "trustworthy
source" means that Chromium can cryptographically prove that the data comes
from a business entity that you can or do trust (e.g.
for Chrome, an [Alphabet](https://abc.xyz) company).

Such cryptographic proof can potentially be obtained by:

* Component Updater;
* the variations framework; or
* pinned TLS (see below).

Pinned TLS needs to meet all these criteria to be effective:

* communication happens via validly-authenticated TLS, HTTPS, or QUIC;
* the peer's keys are [pinned in Chrome](https://cs.chromium.org/chromium/src/net/http/transport_security_state_static.json?sq=package:chromium&g=0); and
* pinning is active on all platforms where the feature will launch.
  (Currently, pinning is not enabled on iOS or in Android WebView.)

It is generally preferred to use Component Updater if possible because pinning
may be disabled by locally installed root certificates.

One common pattern is to deliver a cryptographic hash of some content via such
a trustworthy channel, but deliver the content itself via an untrustworthy
channel. So long as the hash is properly verified, that's fine.

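The hash pattern can be sketched as a gate that releases the content only after
the digest check passes. This is a hedged illustration, not Chromium code: the
function name `accept_if_hash_matches` is hypothetical, the 32-byte digest size
assumes something like SHA-256, and the digest function is passed in as a
parameter because the Rust standard library does not provide one. A real
caller would supply a vetted cryptographic hash implementation.

```rust
/// Returns the content only if its digest equals the digest that arrived over
/// the trustworthy channel. `digest_fn` is an assumption of this sketch: it
/// stands in for a real cryptographic hash such as SHA-256.
pub fn accept_if_hash_matches<'a, F>(
    content: &'a [u8],
    trusted_digest: &[u8; 32],
    digest_fn: F,
) -> Option<&'a [u8]>
where
    F: Fn(&[u8]) -> [u8; 32],
{
    // Hash the complete untrusted content before any other code looks at it.
    let actual = digest_fn(content);
    // Compare the full digest; a mismatch means the content is discarded.
    if &actual == trusted_digest {
        Some(content)
    } else {
        None
    }
}
```
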
### Normalization {#normalization}

You can 'defang' a potentially-malicious input by transforming it into a
_normal_ or _minimal_ form, usually by first transforming it into a format with
a simpler grammar. We say that all data, file, and wire formats are defined by a
_grammar_, even if that grammar is implicit or only partially-specified (as is
so often the case). A data format with a particularly simple grammar is
[`SkPixmap`](https://source.chromium.org/chromium/chromium/src/+/3df9ac8e76132c586e888d1ddc7d2217574f17b0:third_party/skia/include/core/SkPixmap.h;l=712).
(The 'grammar' is represented by the private data fields: a region of raw pixel
data, the size of that region, and simple metadata (`SkImageInfo`) about how to
interpret the pixels.)

It's rare to find such a simple grammar for input formats, however.

For example, consider the PNG image format, which is complex and whose [C
implementation has suffered from memory corruption bugs in the
past](https://www.cvedetails.com/vulnerability-list/vendor_id-7294/Libpng.html).
An attacker could craft a malicious PNG to trigger such a bug. But if you
transform the image into a format that doesn't have PNG's complexity (in a
low-privilege process, of course), the malicious nature of the PNG 'should' be
eliminated, and the result should then be safe to parse at a higher privilege
level. Even if the attacker manages to compromise the low-privilege process
with a malicious PNG, the high-privilege process will only parse the
compromised process's output with a simple, plausibly-safe parser. If that
parse is successful, the higher-privilege process can then optionally further
transform it into a normalized, minimal form (such as to save space).
Otherwise, the parse can fail safely, without memory corruption.

The trick of this technique lies in finding a sufficiently-trivial grammar, and
committing to its limitations.

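As a rough sketch of what a 'sufficiently-trivial grammar' might look like,
the Rust code below parses a made-up raw bitmap layout: a width, a height, and
exactly `width * height * 4` bytes of pixel data. The wire layout, the
`SimpleBitmap` type, the `MAX_DIMENSION` limit, and the function name are all
assumptions for illustration; this is not Skia's or Mojo's actual format. The
point is that the whole grammar fits in a few checked steps, and any input
that does not match is rejected rather than partially interpreted.

```rust
/// A decoded image in a deliberately minimal form: dimensions plus raw pixels.
#[derive(Debug, PartialEq)]
pub struct SimpleBitmap {
    pub width: u32,
    pub height: u32,
    /// Exactly `width * height * 4` bytes (4 bytes per pixel).
    pub pixels: Vec<u8>,
}

/// Arbitrary upper bound chosen for this sketch to keep sizes sane.
const MAX_DIMENSION: u32 = 16_384;

/// Parses `[width: u32 LE][height: u32 LE][pixels...]` or fails safely.
pub fn parse_simple_bitmap(input: &[u8]) -> Option<SimpleBitmap> {
    // Header: two little-endian 32-bit integers.
    let h = input.get(..8)?;
    let width = u32::from_le_bytes([h[0], h[1], h[2], h[3]]);
    let height = u32::from_le_bytes([h[4], h[5], h[6], h[7]]);
    if width == 0 || height == 0 || width > MAX_DIMENSION || height > MAX_DIMENSION {
        return None;
    }
    // Overflow-checked size computation; never trust the sender's arithmetic.
    let expected = (width as usize)
        .checked_mul(height as usize)?
        .checked_mul(4)?;
    // The payload must match the declared size exactly: no slack, no trailer.
    let pixels = &input[8..];
    if pixels.len() != expected {
        return None;
    }
    Some(SimpleBitmap { width, height, pixels: pixels.to_vec() })
}
```
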
Another good approach is to

1. define a new Mojo message type for the information you want;
2. extract that information from a complex input object in a sandboxed
   process; and then
3. send the result to a higher-privileged process in a Mojo message using the
   new message type.

That way, the higher-privileged process need only process objects adhering to a
well-defined, generally low-complexity grammar. This is a big part of why [we
like for Mojo messages to use structured types](mojo.md#Use-structured-types).

For example, it should be safe enough to convert a PNG to an `SkBitmap` in a
sandboxed process, and then send the `SkBitmap` to a higher-privileged process
via IPC. Although there may be bugs in the IPC message deserialization code
and/or in Skia's `SkBitmap` handling code, we consider this safe enough for a
few reasons:

* we must accept the risk of bugs in Mojo deserialization; but thankfully
* Mojo deserialization is very amenable to fuzzing; and
* it's a big improvement to scope bugs to smaller areas, like IPC
  deserialization functions and very simple classes like `SkBitmap` and
  `SkPixmap`.

Ultimately this process results in parsing significantly simpler grammars. (PNG
→ Mojo + `SkBitmap` in this case.)

> (We have to accept the risk of memory safety bugs in Mojo deserialization
> because C++'s high performance is crucial in such a throughput- and
> latency-sensitive area. If we could change this code to be both in a safer
> language and still have such high performance, that'd be ideal. But that's
> unlikely to happen soon.)

### Exception: Protobuf

Although it is less preferable than Mojo, we similarly trust Protobuf for
deserializing messages at high privilege from potentially untrustworthy senders.
For example, Protobufs are sometimes embedded in Mojo IPC messages. It is
always preferable to use a Mojo message where possible, though sometimes
external constraints require the use of Protobuf.

Protobuf's threat model does not include parsing a protobuf from shared
memory. Always copy the proto buffer bytes from untrustworthy shared
memory regions before deserializing to a Message.

If you must pass protobuf bytes over Mojo, use
[mojo_base::ProtoWrapper](https://chromium.googlesource.com/chromium/src/+/main/mojo/public/cpp/base/proto_wrapper.h),
as this provides limited type safety for the top-level protobuf message and
ensures copies are taken before deserializing.

Note that this exception only applies to Protobuf as a container format;
complex data contained within a Protobuf must be handled according to this
rule as well.

### Exception: RE2

As another special case, we trust the
[RE2](https://cs.chromium.org/chromium/src/third_party/re2/README.chromium)
regular expression library to evaluate untrustworthy patterns over untrustworthy
input strings, because its grammar is sufficiently limited and hostile input is
part of the threat model against which it's been tested for years. It is **not**
the case, however, that text matched by an RE2 regular expression is necessarily
"sanitized" or "safe". That requires additional security judgment.

## Safe Types and Abstractions

As discussed above in [Normalization](#normalization), there are some types that
are considered "safe," even though they are deserialized from an untrustworthy
source, at high privilege, and in an unsafe language. These types are
fundamental for passing data between processes using IPC, tend to have simpler
grammar or structure, and/or have been audited or fuzzed heavily.

* `GURL` and `url::Origin`
* `SkBitmap` (in [N32 format](https://source.chromium.org/chromium/chromium/src/+/main:third_party/skia/include/core/SkColorType.h;l=54-58;drc=8d399817282e3c12ed54eb23ec42a5e418298ec6) only)
* `SkPixmap` (in [N32 format](https://source.chromium.org/chromium/chromium/src/+/main:third_party/skia/include/core/SkColorType.h;l=54-58;drc=8d399817282e3c12ed54eb23ec42a5e418298ec6) only)
* Protocol buffers (see above; this is not a preferred option and should be
  avoided where possible)

There are also classes in `//base` that internally hold simple values that
represent potentially complex data, such as:

* `base::FilePath`
* `base::Token` and `base::UnguessableToken`
* `base::Time` and `base::TimeDelta`

The deserialization of these is safe, though it is important to remember that
the value itself is still untrustworthy (e.g. a malicious path trying to escape
its parent using `../`).

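To illustrate the difference between safe deserialization and a trustworthy
value, here is a small Rust sketch of a check one might apply to a path after
it has been received. It uses Rust's standard `std::path` types rather than
Chromium's `base::FilePath`, and the function name
`is_contained_relative_path` is hypothetical. The policy is deliberately
strict: every component must be a plain name, so `..`, a leading `.`, absolute
paths, and empty paths are all rejected.

```rust
use std::path::{Component, Path};

/// Returns true only for non-empty relative paths made up entirely of plain
/// name components, so the path cannot climb out of its parent directory.
pub fn is_contained_relative_path(path: &Path) -> bool {
    let mut saw_component = false;
    for component in path.components() {
        match component {
            // An ordinary file or directory name is acceptable.
            Component::Normal(_) => saw_component = true,
            // Root, drive prefix, `.`, and `..` are all refused.
            _ => return false,
        }
    }
    saw_component
}
```
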
The JSON parser in `//base/json` is implemented in Rust and considered safe for
use at high privilege with untrusted data.

## Existing Code That Violates The Rule

We know there is code in Chromium that violates the Rule of 2. For example, the
networking process on Windows is written in C++ and handles plenty of
untrustworthy data, yet it is not (at present) sandboxed by default. There is
[ongoing work](https://bugs.chromium.org/p/chromium/issues/detail?id=841001) to
change that.

Our top priority is avoiding any *new* violations of the Rule of 2. We also try
to keep track of existing violations and mitigate them over time: for example,
some less-safe uses of JSON parsing in the privileged browser process were
defanged when we swapped out our C++ JSON parser for one written in Rust.