rn-122: include answers from various community members

sivaraam · sivaraam · commit 0f012acfb174 · 2025-05-02T10:46:23.000+05:30
diff --git a/rev_news/drafts/edition-122.md b/rev_news/drafts/edition-122.md
@@ -189,11 +189,343 @@ This edition covers what happened during the months of March and April 2025.
 ## Community interview
 
 _Editor note: For Git's 20th anniversary, we are doing an exclusive collaborative
-community interview and curating answers from various community members. Also,
-there's a short Q&A with our zealous, inclusive and tireless maintainer that
-follows below._
+community interview and curating answers from various community members. Also,
+there's a [short Q&A](#short-qa-with-our-maintainer-junio-c-hamano) with our
+zealous, inclusive and tireless maintainer that follows below._
+
+
+- **What's your favorite Git trick or workflow that you wish more people
+  knew about?**
+
+  [_Thalia Rose_][thalia]: For rebase-heavy workflows, `git range-diff` is incredibly
+  useful. To compare against upstream, use `git range-diff @{u}...@`,
+  and to compare against the previous HEAD, use `git range-diff @{1}...@`.
+
+  [_Lucas Seiki Oshiro_][seiki]: Everything related to code archaeology
+  (`git grep`, `git log -S/-G`, `git log -L` and `git bisect`). Those are
+  my primary debugging tools, and every time I explain them to other
+  people, they find them mind-blowing and useful.
+  And they also start loving them :-)
+
+  [_Elijah Newren_][elijah]: [`range-diff`][range-diff]. The ideas behind
+  it ought to be the basis for code review, IMO.  Commits should be the
+  unit of review (including commit messages as a fundamental and primary
+  thing to be reviewed), and a series of commits should be the unit of
+  merging.  I dislike most code review tools, because they get one or
+  both of those things wrong. Getting both of those things right naturally
+  leads to `range-diff` or something like it being a very important part
+  of the workflow, at a minimum for detecting which commits in a series
+  are unmodified and which have been updated and need to be further reviewed.
+
+
+- **What was your worst Git disaster, and how did you recover from it?**
+
+  [_Thalia Rose_][thalia]: When I was first starting with Git, I wanted to make a repo
+  to preserve my first coding project when I was twelve, a bunch of VBS scripts.
+  I had assumed that Git maintained file modification timestamps, so I deleted
+  the originals because they were now redundant. I now no longer know exactly
+  when I wrote them and have been careful about timestamps ever since.
+
+  [_Luca Milanesio_][luca]: I suspect I am one of the worst offenders :-) [ [ref](https://www.infoq.com/news/2013/11/use-the-force) ]
+
+  Thankfully I was using Gerrit Code Review and the replication plugin:
+  the refs were not lost but just rewound, and we could reset all of
+  them to the correct SHA1s.
+
+  [_Lucas Seiki Oshiro_][seiki]: I don't remember anything that I did myself,
+  but I remember a simple and curious disaster: our deploy workflows
+  stopped working, only leaving a message like "cannot fetch
+  ambiguous reference `master`". I decided to investigate what happened
+  and I found out that someone by mistake (I don't know how) created a
+  tag called `master` and pushed it to GitHub. At the time, we used the
+  `master` branch for deploys, and the workflows didn't know whether they
+  should use the `master` branch or the tag. GitHub didn't have a feature
+  for deleting tags through the web interface, so we thought
+  "what should we do?".
+
+  The solution was to run `git push origin :refs/tags/master`. Simple,
+  but not obvious. A classic case where it only required a screw to be
+  turned, but all the hard work was to find which screw should be turned.
+
+  [_Elijah Newren_][elijah]:
+  My worst Git-related disaster wasn't with Git directly but with our
+  Git hosting software we used at a prior job, Gerrit.  'twas a
+  "startup" that was still forming good practices.  We had both a
+  production and a staging instance.  The staging instance was seeded
+  with a copy of production data so we could do scale testing...but that
+  seeding process was a multi-step manual thing; it hadn't been
+  automated.  One step was, as best I recall, "drop database gerrit",
+  followed by loading the production copy of the mysql database (this
+  was long before [NoteDB][notedb] arrived).  And as many readers
+  probably have guessed by now, I was on the wrong host one day when
+  I ran that command.
+
+  The actual git repositories were still intact, but the review metadata
+  was toast.  Luckily, we had a backup from about 7 hours earlier, so we
+  could recover the older review metadata and with some hackery fix the
+  mysql metadata mismatch with the newer repository contents.  And since
+  Gerrit emailed folks comments from reviews as they were posted, we
+  could tell people to look at their emails for the pieces we couldn't
+  recover.
+
+  It was a really long night trying to fix things.  Some folks told me
+  they thought I was going to throw up just looking at me.  But I
+  learned how wonderful it was to be at a company with blameless
+  post-mortems, and I appreciated the many folks who reached out to tell
+  me stories of mistakes they had made.  They were more interested in
+  whether we learned our lesson and put processes into place to prevent
+  repeats, and I definitely did both.
+
+  I did, of course, also get some good-natured ribbing, such as people
+  saying I got to play the part of little Bobby Tables once (see
+  [this xkcd comic][bobby-tables] if you don't know that reference).
+  I kindly reminded them that I didn't drop a table -- I dropped the whole
+  database (plus, it wasn't injection, it was just running a command in
+  the wrong location).  Also, one of my colleagues helpfully modified
+  the prompt on production to be red and bold, "This is PROD Gerrit",
+  and the prompt on staging to be green, "This is staging Gerrit; it's
+  okay to drop database here!"  The prompts ended up not mattering since
+  I automated the process, and made sure the process just error'ed out
+  if run on prod instead of staging.  But the prompt persisted for many
+  years anyway, because I thought it was a hilarious way to poke fun at
+  my blunder.
+
+
+- **If you could go back in time and change one design decision in Git,
+  what would it be?**
+
+  [_Luca Milanesio_][luca]: Use SHA-256 straight away, as it was
+  published 24 years ago and already existed at the time Git was designed.
+
+  [_Lucas Seiki Oshiro_][seiki]: Perhaps writing a more abstract CLI. After
+  studying Git a little more deeply it makes sense to me, but I would group
+  the functionality into more high-level subcommands and would make the flags
+  and options more consistent across the subcommands.
+
+  For example, the Docker CLI has all the image operations under
+  `docker image` and all the network operations under `docker network`.
+  If I want to delete an image, I use `docker image rm`, if I want to
+  delete a network, I use `docker network rm`, and so on. I would make
+  the Git CLI work based on that idea, for example:
+
+    - `git branch add my_branch`
+    - `git branch delete my_branch`
+    - `git branch list`
+    - `git remote add my_remote ...`
+    - `git remote delete my_remote`
+    - `git remote list`
+    - `git tag add my_tag`
+    - `git tag delete my_tag`
+    - `git tag list`
+
+  With some shorter aliases, just like Docker has `docker rmi` and
+  `docker rm`.
+
+  [_Elijah Newren_][elijah]: The index.  For a few reasons.
+
+  1. Performance.
+     1. The index is pervasive throughout the codebase, and while it works
+     great for small repositories, it means that many operations are O(size
+     of repository) instead of O(size of changes).  [sparse indices][sparse-index]
+     help, but the code has to be carefully audited for sparse indices to
+     work with each codepath, and even then there tends to be a fallback of
+     just-load-everything-anyway because the data structure doesn't lend
+     itself nicely to just expanding a little more.
+
+     2. An under-appreciated part of the performance improvements that
+     came from our new merge strategy, [`merge-ort`][merge-ort], was due
+     to dispensing with the index as the primary data structure.  The index
+     had two problems:
+        1. First of all, it meant loading every path in the repository,
+        which would have prevented ort's optimization to avoid recursing into
+        subtrees when unnecessary (an optimization that often made merges e.g.
+        50x faster).  Sparse indices didn't exist back then, but even if they
+        had we would have had to complicate them significantly in order to
+        have their sparseness be determined by renames and the intersection of
+        modified paths on the two sides of history instead of having
+        sparseness determined by user-defined path rules; I think that'd have
+        been much more complicated than just dispensing with the index as the
+        data structure, but we didn't even have sparse indices back then
+        anyway.
+
+        2. Second, the use of the index as done in the old merge strategy,
+        `merge-recursive`, resulted in O(N^2) behavior since entries (including
+        conflicted higher order stages) had to be inserted in sorted order.
+        Deleting entries didn't have the same O(N^2) problem due to some
+        tricks to queue the deletion for later, but attempting to do the same
+        for insertions was far from straightforward and I believe would have
+        required making some other data structure primary and then forming the
+        index at the end. (Note that the primary data structure used, whatever
+        it is, cannot just have a list of things to insert, it also needs to
+        be checked for various properties intermingled with insertions...and
+        those sometimes relied on the fact that the index was sorted for quick
+        lookups.) <br/><br />
+        (Note that a tree-structured index rather than a linear index would
+        resolve these problems.  But retrofitting the entire codebase is
+        probably never going to happen...)
+
+    2. Cognitive Complexity. <br/>The funny thing is, although I say this,
+    I use the index all the time. I use `git add -p` a lot.  I very much
+    need to slice and dice my changes into different commits, and tend to
+    have dirty changes that I don't want pushed. <br /> <br />
+    But slicing and dicing before things are committed, as opposed to
+    being able to slice and dice after, is a choice that adds a lot of
+    complexity to the user interface and does so even for users who aren't
+    interested in slicing and dicing commits.  We don't have a
+    sufficiently flexible set of tooling for slicing and dicing commits
+    after-the-fact within git to switch to a post-commit-slice-and-dice
+    workflow even today, but I suspect that some of the ideas from [JJ][jujutsu]
+    would or could be much better than the methods I use today in git to
+    slice and dice commits.
+
+
+- **Which Git feature or improvement over the past 20 years do you think
+  had the biggest impact on your workflow?**
+
+  [_Lucas Seiki Oshiro_][seiki]: Sorry, but I can't answer. I am from a
+  generation that started programming when Git was already the de facto
+  VCS so I can't compare a world that has it with a world that doesn't.
+
+  [_Elijah Newren_][elijah]: Speed.
+
+  Being able to instantly switch branches (in smaller repos, sure, but
+  CVS and SVN couldn't pull it off even in small repos) was a game
+  changer.
+
+
+- **What Git problem that existed 10 years ago has been most
+  successfully solved?**
+
+  [_Lucas Seiki Oshiro_][seiki]: Sorry again, but 10 years ago I was only
+  starting to use Git, and when I started to use more complex features they
+  were already there.
+
+  [_Elijah Newren_][elijah]: Merging and rebasing with lots of renames
+  (and generally merging without a worktree or index).  I'm obviously
+  a bit biased on this point, but that doesn't mean I'm wrong.  ;-)
+  It used to be awful and works great now.
+
+  Relatedly, merging without a worktree or index was problematic; you
+  had to either use an alternative merge strategy with limited
+  capabilities, or use something other than git (e.g. [libgit2][libgit2]).
+  But now git handles it well with its default merge strategy.
+
+
+- **Which Git commands or workflows do you think are still misunderstood
+  or underutilized today?**
+
+  [_Lucas Seiki Oshiro_][seiki]: I think [squash merges][squash-merge] and
+  [submodules][submodule] are really misunderstood, yet they are the opposite
+  of being underutilized. Sadly I saw several people using them on a daily basis,
+  based on the wrong idea of what they are and then using them incorrectly.
+
+
+  What I think is underutilized is the full power of commits as
+  a good source of documentation and a good resource for, again, performing
+  code archaeology that may help in understanding what the code does and
+  debugging it. Several developers treat commits as just checkpoints.
+
+  [_Elijah Newren_][elijah]: `range-diff` is very under-utilized, but I
+  already discussed that above.
+
+
+- **What's one Git based project, tool, or extension you think deserves
+  more recognition from the community?**
+
+  [_Lucas Seiki Oshiro_][seiki]: Perhaps it would be better to leave this
+  question for other less known tools. But if you want an answer, I think:
+
+   - [Delta](https://github.com/dandavison/delta) is a really cool tool for
+   formatting diff-related output;
+
+   - [Kworkflow](https://kworkflow.org/) is a powerful tool for
+   contributing to the Linux kernel source code (I should also
+   try it for contributing to the Git source code);
+
+   - Merge drivers in general. `diff3` works in most cases, but it is
+   based only on pure diffs, without performing deeper operations based
+   on the format of the files being merged.
+
+
+- **What Git feature or capability surprised you most when you first
+  discovered it?**
+
+  [_Lucas Seiki Oshiro_][seiki]:  As you may have noticed, I'm really
+  a fan of Git archaeology :-), so I would say all that I mentioned
+  in the first answer (i.e., `git grep`, `git log -S/-G`, `git log -L`
+  and `git bisect`). But my favorite is still [bisect][bisect].
+  It's an egg of Columbus and everyone that I have shown it to
+  was equally amazed by it!
+
+
+- **What's your boldest prediction about how version control might look
+  in another 20 years?**
+
+  [_Lucas Seiki Oshiro_][seiki]: I still see Git as the dominant VCS
+  in the future, but I think more Git-based VCSs (like [Jujutsu][jujutsu])
+  will arise, just as today we have programming languages built on top
+  of the stack of other languages (e.g. Clojure, Kotlin and Scala on
+  the JVM, TypeScript on JS), networking protocols written on top of other
+  protocols (e.g. QUIC on UDP, gRPC on HTTP) and so on.
+
+  The Git core is simple, flexible, transparent and powerful and there's
+  still room for people using it directly in several creative ways. I once
+  saw [a project using it as a backend for a NoSQL database][git-backend-nosql];
+  who knows how many use cases we still have for it.
+
+  [_Elijah Newren_][elijah]: I'm more interested in what storms might be
+  brewing along that path, and what we might be able to do to avoid them.
+  In particular, some questions and observations in that area:
+
+  * With monorepos growing ever larger, do we have hard-to-workaround-or-fix
+    design decisions that pose scaling challenges?  e.g.
+      * the index data structure
+      * per-directory .gitignore files, per-directory .gitattributes files, etc.
+  * ...or do the prominent Git forges have hard-to-workaround-or-fix
+    design decisions that'll give Git a reputation for not scaling?  e.g.
+     * making refs/pull/NNN/merge a public ref and excessively
+       implicitly updating it
+  * Will we face a crisis of interest?  e.g.
+     * `git` is currently written in C.  Even if that's not a liability
+       already, coupled with "decades" I think it is.  Young developers
+       probably don't want to learn C, and older ones who already know C
+       may worry about C becoming a Fortran or Cobol.
+     * Companies employing Git developers think "git already won" and
+       redeploy those engineers on other problems
+  * Will the combination of issues above result in folks who want improvements
+    deciding their best bet is not improving Git but creating/funding
+    an alternative? Will that snowball?
+
+  <br />
+  To me, the entry of new projects like [JJ][jujutsu] and [sapling][sapling]
+  suggests the above are real concerns already rather than just theoretical.
+  Both projects have compelling things that git lacks.  I like the friendly
+  competition, and the JJ and sapling developers are awesome to talk to
+  at Git Merge conferences.  But there is a risk that this friendly
+  competition mirrors that of Git and Mercurial from years past, and
+  that Git at some future point down the road ends up on the other side
+  of that history and gets largely displaced by the alternatives.  I'd
+  rather not see that happen, but I sometimes wonder if we're taking
+  enough measures to avoid marching towards such an outcome.
+
+
+[thalia]: https://discord.com/channels/1042895022950994071/1361310935427584213/1361316878819131452
+[luca]: https://public-inbox.org/git/04A328E9-1146-4D4A-84E7-456FFEB66A5A@gmail.com/
+[seiki]: https://public-inbox.org/git/AE27429C-97B1-4226-8F30-5B635A050498@gmail.com/
+[elijah]: https://public-inbox.org/git/CABPp-BH2yH4iJ28Bo7Q=uryu68LLk7a0Tvb2SzAbAiHK8QpRug@mail.gmail.com/
+[squash-merge]: https://git-scm.com/docs/git-merge#Documentation/git-merge.txt---squash
+[submodule]: https://git-scm.com/docs/git-submodule
+[bisect]: https://git-scm.com/docs/git-bisect
+[range-diff]: https://git-scm.com/docs/git-range-diff
+[sparse-index]: https://git-scm.com/docs/sparse-index
+[merge-ort]: https://git-scm.com/docs/merge-strategies#Documentation/merge-strategies.txt-ort
+[jujutsu]: https://github.com/jj-vcs/jj?tab=readme-ov-file#introduction
+[git-backend-nosql]: https://www.kenneth-truyers.net/2016/10/13/git-nosql-database
+[notedb]: https://www.gerritcodereview.com/notedb.html
+[bobby-tables]: https://xkcd.com/327/
+[libgit2]: https://libgit2.org/
+[sapling]: https://sapling-scm.com/
 
-TODO
 
 ### Short Q&A with our maintainer, Junio C Hamano