
Comparing changes

base repository: rust-lang/regex
base: 1.10.2
...
head repository: rust-lang/regex
compare: 1.10.4
  • 20 commits
  • 28 files changed
  • 4 contributors

Commits on Oct 20, 2023

  1. automata: fix panic in dense DFA deserialization

    This fixes a hole in the validation logic that accidentally permitted a
    dense DFA to contain a match state with zero pattern IDs. Since search
    code is permitted to assume that every match state has at least one
    corresponding pattern ID, this led to a panic.
    
    Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63391
    BurntSushi committed Oct 20, 2023
    20b5317

Commits on Oct 25, 2023

  1. syntax: add Hir::literal example for char

    The example shows a succinct way of creating an HIR literal from a
    `char` value by first encoding it to UTF-8.
    
    Closes #1114
    BurntSushi committed Oct 25, 2023
    6b72eec
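
The encoding step mentioned in the `Hir::literal` commit above can be sketched with the standard library alone. The helper name `char_to_utf8_bytes` is hypothetical and exists only for illustration. The commented-out `Hir::literal` call is an assumption about how the resulting bytes would be handed to `regex-syntax`, not a copy of the example added in the commit.

```rust
/// Encode a `char` as UTF-8 and return the bytes as an owned vector.
/// (Hypothetical helper, not part of regex-syntax.)
fn char_to_utf8_bytes(ch: char) -> Vec<u8> {
    // A `char` needs at most 4 bytes in UTF-8, so a small stack buffer is
    // enough to hold the encoded form.
    let mut buf = [0u8; 4];
    ch.encode_utf8(&mut buf).as_bytes().to_vec()
}

// Assumed usage with regex-syntax (not compiled here):
//
//     let hir = Hir::literal(ch.encode_utf8(&mut [0; 4]).as_bytes());
```

The stack buffer avoids allocating a `String` just to obtain the bytes of a single `char`.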

Commits on Nov 1, 2023

  1. cli: change --no-captures to --captures (all|implicit|none)

    When we added the WhichCaptures type, we didn't update the CLI to expose
    the full functionality. This change does that.
    BurntSushi committed Nov 1, 2023
    662a8b9
  2. regex-cli-0.2.0

    BurntSushi committed Nov 1, 2023
    837fd85

Commits on Dec 5, 2023

  1. 4f5992f

Commits on Dec 20, 2023

  1. doc: fix link in Index<&str> impl docs

    This referenced `Captures::get`, but it should reference
    `Captures::name`. This was likely a transcription error
    from the docs for the `Index<usize>` impl.
    kloune authored Dec 20, 2023
    a3d5975

Commits on Dec 29, 2023

  1. ci: small clean-ups

    The regex 1.10 release bumped the MSRV to Rust 1.65, so we no longer
    need to pin to an older memchr release.
    
    We also bump to `actions/checkout@v4`.
    BurntSushi committed Dec 29, 2023
    dc0a9d2

Commits on Jan 10, 2024

  1. cargo: set 'default-features = false' for memchr and aho-corasick

    I'm not sure how this one slipped by. Without this, I'd suppose that
    no-std support doesn't actually work? Or at least, one would have to
    disable the use of both memchr and aho-corasick entirely, since they
    depend on std by default. Not quite sure how to test this.
    
    Fixes #1147
    BurntSushi committed Jan 10, 2024
    027eebd

Commits on Jan 21, 2024

  1. safety: guard in Input::new against incorrect AsRef implementations

    Before this commit, Input::new calls haystack.as_ref() twice, once to
    get the actual haystack slice and the second time to get its length. It
    makes the assumption that the second call will return the same slice,
    but malicious implementations of AsRef can return different slices
    and thus different lengths. This is important because there's unsafe
    code relying on the Input's span being inbounds with respect to the
    haystack, but if the second call to .as_ref() returns a bigger slice
    this won't be true.
    
    For example, this snippet causes Miri to report UB on an unchecked
    slice access in find_fwd_imp (though it will also panic sometime later
    when run normally, but at that point the UB already happened):
    
        use regex_automata::{Input, meta::{Builder, Config}};
        use std::cell::Cell;
    
        struct Bad(Cell<bool>);
    
        impl AsRef<[u8]> for Bad {
            fn as_ref(&self) -> &[u8] {
                if self.0.replace(false) {
                    &[]
                } else {
                    &[0; 1000]
                }
            }
        }
    
        let bad = Bad(Cell::new(true));
        let input = Input::new(&bad);
        let regex = Builder::new()
            // Not setting this causes some checked access to occur before
            // the unchecked ones, avoiding the UB
            .configure(Config::new().auto_prefilter(false))
            .build("a+")
            .unwrap();
        regex.find(input);
    
    This commit fixes the problem by calling .as_ref() only once and using
    the length of the returned slice as the span's end value. A regression
    test has also been added.
    
    Closes #1154
    SkiFire13 authored and BurntSushi committed Jan 21, 2024
    fbd2537
  2. changelog: 1.10.3

    BurntSushi committed Jan 21, 2024
    1bc667d
  3. regex-automata-0.4.4

    BurntSushi committed Jan 21, 2024
    e7b5401
  4. 653bb59
  5. 1.10.3

    BurntSushi committed Jan 21, 2024
    0c09903
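
The fix described in the `Input::new` safety commit (fbd2537) comes down to deriving the span from the same slice that is stored. The sketch below illustrates that idea using only the standard library. `SketchInput` and `Flaky` are hypothetical stand-ins, not the real `regex_automata::Input` or its regression test.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a haystack plus the end of its search span.
/// This is not the real `regex_automata::Input`.
struct SketchInput<'h> {
    haystack: &'h [u8],
    end: usize,
}

impl<'h> SketchInput<'h> {
    fn new<H: ?Sized + AsRef<[u8]>>(haystack: &'h H) -> SketchInput<'h> {
        // Call `as_ref()` exactly once and compute the length from the slice
        // that is actually stored, so `end` can never exceed its bounds.
        let haystack = haystack.as_ref();
        SketchInput { haystack, end: haystack.len() }
    }
}

/// An `AsRef` impl that returns a different slice on its first call,
/// mirroring the `Bad` type from the commit message.
struct Flaky(Cell<bool>);

impl AsRef<[u8]> for Flaky {
    fn as_ref(&self) -> &[u8] {
        const EMPTY: &[u8] = &[];
        const BIG: &[u8] = &[0; 1000];
        // First call yields the empty slice, later calls yield 1000 bytes.
        if self.0.replace(false) { EMPTY } else { BIG }
    }
}
```

With this shape, a misbehaving `AsRef` implementation can still return surprising data, but the stored length always agrees with the stored slice.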

Commits on Jan 25, 2024

  1. automata: make additional prefilter metadata public

    This commit exposes `is_fast` and also adds `max_needle_len`
    to a prefilter. This is useful for engines implemented outside
    of `regex-automata`.
    
    PR #1156
    pascalkuthe authored Jan 25, 2024
    07ef7f1
  2. regex-automata-0.4.5

    BurntSushi committed Jan 25, 2024
    d7f9347

Commits on Feb 26, 2024

  1. style: clean up some recent lint violations

    It looks like `dead_code` got a little smarter, and more pervasively,
    some new lint that detects superfluous imports found a bunch of them.
    BurntSushi committed Feb 26, 2024
    10fe722

Commits on Mar 4, 2024

  1. automata: fix bug where reverse NFA lacked an unanchored prefix

    Previously, when compiling a Thompson NFA, we were omitting an
    unanchored prefix when the HIR contained a `^` in its prefix. We did
    this because an unanchored prefix in that case would never match because of
    the requirement imposed by `^`.
    
    The problem with that is it's incorrect when compiling a reverse
    automaton. For example, in the case of building a reverse NFA for `^Qu`,
    we should still include an unanchored prefix because the `^` in that
    case has no conflict with it. It would be like if we omitted an
    unanchored prefix for `Qu$` in a forward NFA, which is obviously wrong.
    
    The fix here is pretty simple: in the reverse case, check for `$` in the
    suffix of the HIR rather than a `^` in the prefix.
    
    Fixes #1169
    BurntSushi committed Mar 4, 2024
    9cf4a42
  2. regex-automata-0.4.6

    BurntSushi committed Mar 4, 2024
    a5ae351
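
The decision described in the reverse NFA commit (9cf4a42) can be sketched as a small predicate. Everything here is hypothetical: the `Anchors` struct and `wants_unanchored_prefix` are simplified illustrations, and the real compiler presumably inspects HIR properties rather than two booleans.

```rust
/// Hypothetical, simplified summary of where a pattern is anchored.
#[derive(Clone, Copy)]
struct Anchors {
    /// The pattern begins with `^`.
    starts_with_caret: bool,
    /// The pattern ends with `$`.
    ends_with_dollar: bool,
}

/// Returns true when an unanchored prefix should be compiled into the NFA.
fn wants_unanchored_prefix(anchors: Anchors, reverse: bool) -> bool {
    // An unanchored prefix is pointless only when the automaton's own
    // starting side is anchored. A reverse automaton starts matching at the
    // end of the pattern, so the relevant anchor is `$`, not `^`.
    let anchored_at_own_start = if reverse {
        anchors.ends_with_dollar
    } else {
        anchors.starts_with_caret
    };
    !anchored_at_own_start
}
```

Under these assumptions, `^Qu` skips the prefix only when compiled forward, and `Qu$` skips it only when compiled in reverse, which matches the symmetry argued for in the commit message.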

Commits on Mar 23, 2024

  1. api: add Cow guarantee to replace API

    This adds a guarantee to the API of the `replace`, `replace_all` and
    `replacen` routines that, when `Cow::Borrowed` is returned, it is
    guaranteed that it is equivalent to the `haystack` given.
    
    The implementation has always matched this behavior, but this elevates
    the implementation behavior to an API guarantee.
    
    There do exist implementations where this guarantee might not be upheld
    in every case. For example, if the final result were the empty string,
    we could return a `Cow::Borrowed`. Similarly, if the final result were a
    substring of `haystack`, then `Cow::Borrowed` could be returned in that
    case too. In practice, these sorts of optimizations are tricky to do
    and seem like niche corner cases that aren't important to optimize.
    
    Nevertheless, having this guarantee is useful because it can be used as
    a signal that the original input remains unchanged. This came up in
    discussions with @quicknir on Discord. Namely, in cases where one is
    doing a sequence of replacements and in most cases nothing is replaced,
    using a `Cow` is nice to be able to avoid copying the haystack over and
    over again. But to get this to work right, you have to know whether a
    `Cow::Borrowed` matches the input or not. If it doesn't, then you'd need
    to transform it into an owned string. For example, this code tries to do
    replacements on each of a sequence of `Cow<str>` values, where the
    common case is no replacement:
    
    ```rust
    use std::borrow::Cow;
    
    use regex::Regex;
    
    fn trim_strs(strs: &mut Vec<Cow<str>>) {
        strs
        .iter_mut()
        .for_each(|s| moo(s, &regex_replace));
    }
    
    fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
        let result = f(&c);
        match result {
            Cow::Owned(s) => *c = Cow::Owned(s),
            Cow::Borrowed(s) => {
                *c = Cow::Borrowed(s);
            }
        }
    }
    
    fn regex_replace(s: &str) -> Cow<str> {
        Regex::new(r"does-not-matter").unwrap().replace_all(s, "whatever")
    }
    ```
    
    But this doesn't pass `borrowck`. Instead, you could write `moo` like
    this:
    
    ```rust
    fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
        let result = f(&c);
        match result {
            Cow::Owned(s) => *c = Cow::Owned(s),
            Cow::Borrowed(s) => {
                if !std::ptr::eq(s, &**c) {
                    *c = Cow::Owned(s.to_owned())
                }
            }
        }
    }
    ```
    
    But the `std::ptr::eq` call here is a bit strange. Instead, after this PR
    and the new guarantee, one can write it like this:
    
    ```rust
    fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
        if let Cow::Owned(s) = f(&c) {
            *c = Cow::Owned(s);
        }
    }
    ```
    BurntSushi committed Mar 23, 2024
    088d7f3
  2. 1.10.4

    BurntSushi committed Mar 23, 2024
    aa2d8bd
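
The "sequence of replacements" use case from the `Cow` guarantee commit (088d7f3) can be tried without the regex crate. In the sketch below, `replace_literal` is a hypothetical stand-in for `Regex::replace_all`, built on `str::replace`. It is written to follow the same rule, returning `Cow::Borrowed` only when the haystack is unchanged. `replace_many` is likewise an illustrative name.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `Regex::replace_all`. It returns
/// `Cow::Borrowed(haystack)` exactly when nothing was replaced.
fn replace_literal<'h>(haystack: &'h str, from: &str, to: &str) -> Cow<'h, str> {
    if haystack.contains(from) {
        Cow::Owned(haystack.replace(from, to))
    } else {
        Cow::Borrowed(haystack)
    }
}

/// Apply a sequence of replacements, allocating only when one of them
/// actually changes the text.
fn replace_many<'h>(haystack: &'h str, rules: &[(&str, &str)]) -> Cow<'h, str> {
    let mut current = Cow::Borrowed(haystack);
    for &(from, to) in rules {
        // `Borrowed` means "unchanged", so only `Owned` results need to be
        // stored. No pointer comparison is required.
        if let Cow::Owned(s) = replace_literal(&current, from, to) {
            current = Cow::Owned(s);
        }
    }
    current
}
```

When none of the rules match, `replace_many` hands back the original borrow and never copies the haystack, which is the common case the commit message has in mind.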