path: root/sys/src/cmd/git/git.h
Age  Commit message  Author
2022-06-11  git/get: keep sending what we have until we get an ack  Ori Bernstein
Git9 was sloppy about telling git what commits we have. We would list the commits at the tip of the branch, but not walk down it, which means we would request too much data if our local branches were ahead of the remote. This patch changes that, sending the tips *and* the first 256 commits after them, so that git can produce a better pack for us, with fewer redundant commits.
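The "tips plus the first 256 commits after them" idea can be sketched as a bounded breadth-first walk over parents. This is only an illustration in portable C, not git9's actual code: the `Commit` struct, the `seen` flag, and `havelist` are hypothetical names, and the real implementation may order or select the extra commits differently.

```c
#include <assert.h>
#include <stddef.h>

enum { Maxparents = 2 };

typedef struct Commit Commit;
struct Commit {
	int id;                      /* stand-in for an object hash */
	Commit *parent[Maxparents];
	int nparent;
	int seen;                    /* set once the commit is in the list */
};

/*
 * Fill out[] with the tips followed by up to nextra of their
 * ancestors, breadth first. out[] doubles as the walk queue.
 * Returns the number of commits written.
 */
size_t
havelist(Commit **tips, size_t ntips, size_t nextra, Commit **out, size_t nout)
{
	size_t i, n, head, limit;
	int j;
	Commit *c, *p;

	assert(out != NULL || nout == 0);
	n = 0;
	for(i = 0; i < ntips && n < nout; i++){
		if(tips[i]->seen)
			continue;
		tips[i]->seen = 1;
		out[n++] = tips[i];
	}
	limit = n + nextra;
	if(limit > nout)
		limit = nout;
	for(head = 0; head < n && n < limit; head++){
		c = out[head];
		for(j = 0; j < c->nparent && n < limit; j++){
			p = c->parent[j];
			if(p->seen)
				continue;
			p->seen = 1;
			out[n++] = p;
		}
	}
	return n;
}
```

A caller would pass 256 as `nextra` to match the number in the commit message, then send each entry of `out[]` to the remote as a commit it already has.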
2022-05-28  git: performance enhancements  Ori Bernstein
Inspired by some changes made in game of trees, I've implemented a number of speedups in git9.

First, hashing the chunks during deltification with murmurhash instead of sha1 speeds up the delta search significantly. The stretch function was micro-optimized a bit as well, since that was taking a large portion of the time when chunking.

Finally, the full path is not stored. We only care about grouping files with the same name and path. We don't care about the ordering. Therefore, only the hash of the path xored with the hash of the directory is kept, which saves a bunch of mallocs and string munging.

This reduces the time spent repacking some test repos significantly.

9front:
    % time git/repack
    deltifying 97473 objects: 100%
    writing 97473 objects: 100%
    indexing 97473 objects: 100%
    58.85u 1.39s 61.82r git/repack
    % time /sys/src/cmd/git/6.repack
    deltifying 97473 objects: 100%
    writing 97473 objects: 100%
    indexing 97473 objects: 100%
    43.86u 1.29s 47.51r /sys/src/cmd/git/6.repack

openbsd:
    % time git/repack
    deltifying 2092325 objects: 100%
    writing 2092325 objects: 100%
    indexing 2092325 objects: 100%
    1589.48u 45.03s 1729.18r git/repack
    % time /sys/src/cmd/git/6.repack
    deltifying 2092325 objects: 100%
    writing 2092325 objects: 100%
    indexing 2092325 objects: 100%
    1238.68u 41.49s 1373.15r /sys/src/cmd/git/6.repack

go:
    % time git/repack
    deltifying 529507 objects: 100%
    writing 529507 objects: 100%
    indexing 529507 objects: 100%
    345.32u 7.71s 369.25r git/repack
    % time /sys/src/cmd/git/6.repack
    deltifying 529507 objects: 100%
    writing 529507 objects: 100%
    indexing 529507 objects: 100%
    248.07u 4.47s 257.59r /sys/src/cmd/git/6.repack
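To make the path-hash point concrete, here is a rough sketch of keeping a fixed-size grouping key instead of a path string. It is written in portable C and makes several assumptions: `Objmeta`, `strhash`, `pathkey`, and `groupbypath` are hypothetical names, FNV-1a is swapped in as a simple stand-in hash (the commit itself uses murmurhash for chunks and does not say which hash is used for paths), and combining the directory hash with the file-name hash is one reading of "the hash of the path xored with the hash of the directory".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Objmeta Objmeta;
struct Objmeta {
	int id;            /* hypothetical object identifier */
	uint32_t pathkey;  /* fixed-size grouping key; no path string is kept */
};

/* FNV-1a, used here only as a simple stand-in for the real hash. */
static uint32_t
strhash(const char *s)
{
	uint32_t h = 2166136261u;

	assert(s != NULL);
	for(; *s != '\0'; s++){
		h ^= (unsigned char)*s;
		h *= 16777619u;
	}
	return h;
}

/* Combine the two hashes; nothing is allocated and no strings are joined. */
static uint32_t
pathkey(const char *dir, const char *name)
{
	return strhash(dir) ^ strhash(name);
}

static int
keycmp(const void *a, const void *b)
{
	const Objmeta *x = a, *y = b;

	if(x->pathkey < y->pathkey)
		return -1;
	if(x->pathkey > y->pathkey)
		return 1;
	return 0;
}

/* Sort so that objects with equal keys end up adjacent. */
static void
groupbypath(Objmeta *m, size_t n)
{
	qsort(m, n, sizeof m[0], keycmp);
}
```

Because only grouping matters and not ordering, sorting by an otherwise meaningless integer is enough to bring same-path objects together as delta candidates. One trade-off of the xor in this sketch is that it is symmetric, so swapping the directory and file name yields the same key.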
2022-03-17  git: use commit date as traversal hint instead of author date  Michael Forney
Although git9 always uses the same commit date and author date, other implementations do make a distinction. Since commit date is more representative of the commit graph order, use this as a traversal hint instead of author date.
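As a small illustration of the difference, the sketch below orders a traversal queue by commit date, newest first, and ignores author date entirely. The `Commit` struct, its field names, and `ordertraversal` are hypothetical and chosen for this example; git9's own object layout and queue are not shown here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Commit Commit;
struct Commit {
	int id;                /* stand-in for an object hash */
	long long authordate;  /* when the change was written */
	long long commitdate;  /* when it was committed; used as the hint */
};

/* Compare two queue slots so that the newest commit date sorts first. */
static int
newestfirst(const void *a, const void *b)
{
	const Commit *x = *(Commit * const *)a;
	const Commit *y = *(Commit * const *)b;

	if(x->commitdate > y->commitdate)
		return -1;
	if(x->commitdate < y->commitdate)
		return 1;
	return 0;
}

/* Order pending commits for traversal using commit date only. */
static void
ordertraversal(Commit **q, size_t n)
{
	if(n == 0)
		return;
	assert(q != NULL);
	qsort(q, n, sizeof q[0], newestfirst);
}
```

A rebased or cherry-picked commit can carry an old author date while sitting near the tip of the graph, which is the kind of case where the two orderings disagree.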
2022-03-16  git/query: refactor graph painting algorithm (findtwixt, lca)  Michael Forney
We now keep track of 3 sets during traversal:

- keep: commits we've reached from head commits
- drop: commits we've reached from tail commits
- skip: ancestors of commits in both 'keep' and 'drop'

Commits in 'keep' and/or 'drop' may be added later to the 'skip' set if we discover later that they are part of a common subgraph of the head and tail commits.

From these sets we can calculate the commits we are interested in: lca commits are those in 'keep' and 'drop', but not in 'skip'. findtwixt commits are those in 'keep', but not in 'drop' or 'skip'.

The "LCA" commit returned is a common ancestor such that there are no other common ancestors that can reach that commit. Although there can be multiple commits that meet this criterion, where one is technically lower on the commit-graph than the other, these cases only happen in complex merge arrangements and any choice is likely a decent merge base.

Repainting is now done in paint() directly. When we find a boundary commit, we switch our paint color to 'skip'. 'skip' painting does not stop when it hits another color; we continue until we are left with only 'skip' commits on the queue.

This fixes several mishandled cases in the current algorithm:

1. If we hit the common subgraph from tail commits first (if the tail commit was newer than the head commit), we ended up traversing the entire commit graph. This is because we couldn't distinguish between 'drop' commits that were part of the common subgraph, and those that were still looking for it.
2. If we traversed through an initial part of the common subgraph from head commits before reaching it from tail commits, these commits were returned from findtwixt even though they were also reachable from tail commits.
3. In the same case as 2, we might end up choosing an incorrect commit as the LCA, which is an ancestor of the real LCA.
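The way results are read off the three sets can be sketched with one bit per set. This covers only the final classification step described above, not the traversal in paint(), and it assumes a bitmask representation; the `Keep`, `Drop`, and `Skip` constants and the two predicates are hypothetical names for illustration.

```c
#include <assert.h>

/* One bit per set; a commit may belong to several sets at once. */
enum {
	Keep = 1<<0,  /* reached from head commits */
	Drop = 1<<1,  /* reached from tail commits */
	Skip = 1<<2,  /* ancestor of a commit in both keep and drop */
};

/* lca commits: in 'keep' and 'drop', but not in 'skip'. */
static int
islca(unsigned color)
{
	return (color & (Keep|Drop|Skip)) == (Keep|Drop);
}

/* findtwixt commits: in 'keep', but not in 'drop' or 'skip'. */
static int
istwixt(unsigned color)
{
	return (color & (Keep|Drop|Skip)) == Keep;
}
```

Under this encoding, a commit that was painted 'keep' early and later found to lie in the common subgraph simply gains the `Skip` bit, which removes it from both result sets without any separate repainting pass.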
2022-01-02  git: size cache in bytes, not objects  Ori Bernstein
git used to track cache size in object count, rather than bytes. This had the unfortunate effect of making memory use depend on the size of objects -- repos with lots of large objects could cause out of memory deaths. Now, we track sizes in bytes, which should keep our memory usage flatter.
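A minimal sketch of a byte-budgeted cache follows, assuming a simple oldest-first eviction list. The `Entry` and `Cache` structs and `cacheadd` are hypothetical and written in portable C; git9's real cache may use a different eviction policy and also has to free what it evicts, which this sketch leaves to the caller.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Entry Entry;
struct Entry {
	int id;       /* stand-in for an object hash */
	size_t size;  /* bytes this object occupies */
	Entry *next;  /* link toward newer entries */
};

typedef struct Cache Cache;
struct Cache {
	Entry *oldest;
	Entry *newest;
	size_t used;   /* total bytes currently cached */
	size_t limit;  /* budget in bytes, not in objects */
	int count;
};

/*
 * Add e as the newest entry, then evict from the oldest end until
 * the byte total fits the budget. The entry just added is never
 * evicted, so one oversized object can exceed the limit on its own.
 * Returns the number of entries evicted.
 */
static int
cacheadd(Cache *c, Entry *e)
{
	int nevict = 0;
	Entry *old;

	assert(c != NULL && e != NULL);
	e->next = NULL;
	if(c->newest != NULL)
		c->newest->next = e;
	else
		c->oldest = e;
	c->newest = e;
	c->used += e->size;
	c->count++;
	while(c->used > c->limit && c->oldest != e){
		old = c->oldest;
		c->oldest = old->next;
		old->next = NULL;
		c->used -= old->size;
		c->count--;
		nevict++;
	}
	return nevict;
}
```

With an object-count limit, a handful of very large blobs and the same number of tiny commits would be treated identically; here the large ones push out proportionally more of the cache.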
2021-09-11  git/query: fix spurious merge requests  Ori Bernstein
Due to the way LCA is defined, using a strict LCA on a graph like this:

    <--a--b--c--d--e--f--g
        \               /
         +-----h-------

can lead to spurious requests to merge. This happens because 'lca(b, g)' would return 'a', since it can be reached in one step from 'b', and 2 steps from 'g', while reaching 'b' from 'a' would be a longer path.

As a result, we need to implement an lca variant that returns the starting node if one is reachable from the other, even if it's already found the technically correct least common ancestor.

This replaces our LCA algorithm with one based on the painting we do while finding a twixt, making it give the results we want.
2021-05-16  git: got git?  Ori Bernstein
Add a snapshot of git9 to 9front.