Skip to content
/ git Public
forked from git/git

Commit 4f36b85

Browse files
pks-tgitster
authored andcommitted
reftable/stack: fix race in up-to-date check
In 6fdfaf1 (reftable/stack: use stat info to avoid re-reading stack list, 2024-01-11) we have introduced a new mechanism to avoid re-reading the table list in case stat(3P) figures out that the stack didn't change since the last time we read it. While this change significantly improved performance when writing many refs, it can unfortunately lead to false negatives in very specific scenarios. Given two processes A and B, there is a feasible sequence of events that cause us to accidentally treat the table list as up-to-date even though it changed: 1. A reads the reftable stack and caches its stat info. 2. B updates the stack, appending a new table to "tables.list". This will both use a new inode and result in a different file size, thus invalidating A's cache in theory. 3. B decides to auto-compact the stack and merges two tables. The file size now matches what A has cached again. Furthermore, the filesystem may decide to recycle the inode number of the file we have replaced in (2) because it is not in use anymore. 4. A reloads the reftable stack. Neither the inode number nor the file size changed. If the timestamps did not change either then we think the cached copy of our stack is up-to-date. In fact, the commit introduced three related issues: - Non-POSIX compliant systems may not report proper `st_dev` and `st_ino` values in stat(3P), which made us rely solely on the file's potentially coarse-grained mtime and ctime. - `stat_validity_check()` and friends may end up not comparing `st_dev` and `st_ino` depending on the "core.checkstat" config, again reducing the signal to the mtime and ctime. - `st_ino` can be recycled, rendering the check moot even on POSIX-compliant systems. Given that POSIX defines that "The st_ino and st_dev fields taken together uniquely identify the file within the system", these issues led to the most important signal to establish file identity to be ignored or become useless in some cases. Refactor the code to stop using `stat_validity_check()`. Instead, we manually stat(3P) the file descriptors to make relevant information available. On Windows and MSYS2 the result will have both `st_dev` and `st_ino` set to 0, which allows us to address the first issue by not using the stat-based cache in that case. It also allows us to make sure that we always compare `st_dev` and `st_ino`, addressing the second issue. The third issue of inode recycling can be addressed by keeping the file descriptor of "files.list" open during the lifetime of the reftable stack. As the file will still exist on disk even though it has been unlinked it is impossible for its inode to be recycled as long as the file descriptor is still open. This should address the race in a POSIX-compliant way. The only real downside is that this mechanism cannot be used on non-POSIX-compliant systems like Windows. But we at least have the second-level caching mechanism in place that compares contents of "files.list" with the currently loaded list of tables. This new mechanism performs roughly the same as the previous one that relied on `stat_validity_check()`: Benchmark 1: update-ref: create many refs (HEAD~) Time (mean ± σ): 4.754 s ± 0.026 s [User: 2.204 s, System: 2.549 s] Range (min … max): 4.694 s … 4.802 s 20 runs Benchmark 2: update-ref: create many refs (HEAD) Time (mean ± σ): 4.721 s ± 0.020 s [User: 2.194 s, System: 2.527 s] Range (min … max): 4.691 s … 4.753 s 20 runs Summary update-ref: create many refs (HEAD~) ran 1.01 ± 0.01 times faster than update-ref: create many refs (HEAD) Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 456333e commit 4f36b85

File tree

3 files changed

+95
-9
lines changed

3 files changed

+95
-9
lines changed

reftable/stack.c

+92-7
Original file line numberDiff line numberDiff line change
@@ -66,6 +66,7 @@ int reftable_new_stack(struct reftable_stack **dest, const char *dir,
6666
strbuf_addstr(&list_file_name, "/tables.list");
6767

6868
p->list_file = strbuf_detach(&list_file_name, NULL);
69+
p->list_fd = -1;
6970
p->reftable_dir = xstrdup(dir);
7071
p->config = config;
7172

@@ -175,7 +176,12 @@ void reftable_stack_destroy(struct reftable_stack *st)
175176
st->readers_len = 0;
176177
FREE_AND_NULL(st->readers);
177178
}
178-
stat_validity_clear(&st->list_validity);
179+
180+
if (st->list_fd >= 0) {
181+
close(st->list_fd);
182+
st->list_fd = -1;
183+
}
184+
179185
FREE_AND_NULL(st->list_file);
180186
FREE_AND_NULL(st->reftable_dir);
181187
reftable_free(st);
@@ -375,11 +381,59 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
375381
sleep_millisec(delay);
376382
}
377383

378-
stat_validity_update(&st->list_validity, fd);
379-
380384
out:
381-
if (err)
382-
stat_validity_clear(&st->list_validity);
385+
/*
386+
* Invalidate the stat cache. It is sufficient to only close the file
387+
* descriptor and keep the cached stat info because we never use the
388+
* latter when the former is negative.
389+
*/
390+
if (st->list_fd >= 0) {
391+
close(st->list_fd);
392+
st->list_fd = -1;
393+
}
394+
395+
/*
396+
* Cache stat information in case it provides a useful signal to us.
397+
* According to POSIX, "The st_ino and st_dev fields taken together
398+
* uniquely identify the file within the system." That being said,
399+
* Windows is not POSIX compliant and we do not have these fields
400+
* available. So the information we have there is insufficient to
401+
* determine whether two file descriptors point to the same file.
402+
*
403+
* While we could fall back to using other signals like the file's
404+
* mtime, those are not sufficient to avoid races. We thus refrain from
405+
* using the stat cache on such systems and fall back to the secondary
406+
* caching mechanism, which is to check whether contents of the file
407+
* have changed.
408+
*
409+
* On other systems which are POSIX compliant we must keep the file
410+
* descriptor open. This is to avoid a race condition where two
411+
* processes access the reftable stack at the same point in time:
412+
*
413+
* 1. A reads the reftable stack and caches its stat info.
414+
*
415+
* 2. B updates the stack, appending a new table to "tables.list".
416+
* This will both use a new inode and result in a different file
417+
* size, thus invalidating A's cache in theory.
418+
*
419+
* 3. B decides to auto-compact the stack and merges two tables. The
420+
* file size now matches what A has cached again. Furthermore, the
421+
* filesystem may decide to recycle the inode number of the file
422+
* we have replaced in (2) because it is not in use anymore.
423+
*
424+
* 4. A reloads the reftable stack. Neither the inode number nor the
425+
* file size changed. If the timestamps did not change either then
426+
* we think the cached copy of our stack is up-to-date.
427+
*
428+
* By keeping the file descriptor open the inode number cannot be
429+
* recycled, mitigating the race.
430+
*/
431+
if (!err && fd >= 0 && !fstat(fd, &st->list_st) &&
432+
st->list_st.st_dev && st->list_st.st_ino) {
433+
st->list_fd = fd;
434+
fd = -1;
435+
}
436+
383437
if (fd >= 0)
384438
close(fd);
385439
free_names(names);
@@ -396,8 +450,39 @@ static int stack_uptodate(struct reftable_stack *st)
396450
int err;
397451
int i = 0;
398452

399-
if (stat_validity_check(&st->list_validity, st->list_file))
400-
return 0;
453+
/*
454+
* When we have cached stat information available then we use it to
455+
* verify whether the file has been rewritten.
456+
*
457+
* Note that we explicitly do not want to use `stat_validity_check()`
458+
* and friends here because they may end up not comparing the `st_dev`
459+
* and `st_ino` fields. These functions thus cannot guarantee that we
460+
* indeed still have the same file.
461+
*/
462+
if (st->list_fd >= 0) {
463+
struct stat list_st;
464+
465+
if (stat(st->list_file, &list_st) < 0) {
466+
/*
467+
* It's fine for "tables.list" to not exist. In that
468+
* case, we have to refresh when the loaded stack has
469+
* any readers.
470+
*/
471+
if (errno == ENOENT)
472+
return !!st->readers_len;
473+
return REFTABLE_IO_ERROR;
474+
}
475+
476+
/*
477+
* When "tables.list" refers to the same file we can assume
478+
* that it didn't change. This is because we always use
479+
* rename(3P) to update the file and never write to it
480+
* directly.
481+
*/
482+
if (st->list_st.st_dev == list_st.st_dev &&
483+
st->list_st.st_ino == list_st.st_ino)
484+
return 0;
485+
}
401486

402487
err = read_lines(st->list_file, &names);
403488
if (err < 0)

reftable/stack.h

+3-1
Original file line numberDiff line numberDiff line change
@@ -14,8 +14,10 @@ license that can be found in the LICENSE file or at
1414
#include "reftable-stack.h"
1515

1616
struct reftable_stack {
17-
struct stat_validity list_validity;
17+
struct stat list_st;
1818
char *list_file;
19+
int list_fd;
20+
1921
char *reftable_dir;
2022
int disable_auto_compact;
2123

reftable/system.h

-1
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,6 @@ license that can be found in the LICENSE file or at
1212
/* This header glues the reftable library to the rest of Git */
1313

1414
#include "git-compat-util.h"
15-
#include "statinfo.h"
1615
#include "strbuf.h"
1716
#include "hash-ll.h" /* hash ID, sizes.*/
1817
#include "dir.h" /* remove_dir_recursively, for tests.*/

0 commit comments

Comments
 (0)