Commit Graph

3253 Commits

Author SHA1 Message Date
kylezhao 05e5e9907c GC: disable writing commit-graph for shallow repos
In shallow repos, GC writes to the commit-graph that shallow commits
do not have parents. This won't be true after a "git fetch --unshallow"
(and before another GC).

Do not write the commit-graph from shallow clones of a repo. The
commit-graph must have the real metadata of commits and that is not
available in a shallow view of the repo.

Change-Id: Ic9f2358ddaa607c74f4dbf289c9bf2a2f0af9ce0
Signed-off-by: kylezhao <kylezhao@tencent.com>
2023-01-06 13:13:13 -05:00
Matthias Sohn 70b436b1b2 Add TernarySearchTree
A ternary search tree is a type of tree where nodes are arranged in a
manner similar to a binary search tree, but with up to three children
rather than the binary tree's limit of two.

Each node of a ternary search tree stores a single character, a
reference to a value object and references to its three children named
equal kid, lo kid and hi kid. The lo kid pointer must point to a node
whose character value is less than the current node. The hi kid pointer
must point to a node whose character is greater than the current
node.[1] The equal kid points to the next character in the word. Each
node in a ternary search tree represents a prefix of the stored strings.
All strings in the middle subtree of a node start with that prefix.

Like other prefix trees, a ternary search tree can be used as an
associative map with the ability for incremental string search. Ternary
search trees are more space efficient compared to standard prefix trees,
at the cost of speed.

They allow efficient prefix search which is important to implement
searching refs by prefix in a RefDatabase.

Searching by prefix returns all keys if the prefix is an empty string.

Bug: 576165
Change-Id: If160df70151a8e1c1bd6716ee4968e4c45b2c7ac
2023-01-04 23:51:23 +01:00
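
The node layout and prefix search described in the commit message above can be sketched as follows. This is a minimal illustration, not JGit's TernarySearchTree API: the class name TstSketch and its methods put and keysWithPrefix are hypothetical, and the sketch assumes non-empty keys and non-null values.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal ternary search tree sketch (hypothetical names, not the JGit class). */
final class TstSketch<V> {
    private static final class Node<V> {
        final char c;       // the single character stored in this node
        V value;            // non-null only where a stored key ends
        Node<V> lo, eq, hi; // lo kid, equal kid, hi kid
        Node(char c) { this.c = c; }
    }

    private Node<V> root;

    /** Inserts or replaces the value for a non-empty key. */
    void put(String key, V value) {
        if (key.isEmpty()) throw new IllegalArgumentException("empty key");
        root = put(root, key, 0, value);
    }

    private Node<V> put(Node<V> n, String key, int i, V value) {
        char c = key.charAt(i);
        if (n == null) n = new Node<>(c);
        if (c < n.c) n.lo = put(n.lo, key, i, value);
        else if (c > n.c) n.hi = put(n.hi, key, i, value);
        else if (i < key.length() - 1) n.eq = put(n.eq, key, i + 1, value);
        else n.value = value;
        return n;
    }

    /** Returns all keys starting with prefix in sorted order; an empty prefix returns every key. */
    List<String> keysWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        if (prefix.isEmpty()) {
            collect(root, new StringBuilder(), out);
            return out;
        }
        Node<V> n = find(root, prefix);
        if (n == null) return out;
        if (n.value != null) out.add(prefix);
        // Every key below the equal kid shares the prefix.
        collect(n.eq, new StringBuilder(prefix), out);
        return out;
    }

    /** Finds the node holding the last character of key, or null. */
    private Node<V> find(Node<V> n, String key) {
        int i = 0;
        while (n != null) {
            char c = key.charAt(i);
            if (c < n.c) n = n.lo;
            else if (c > n.c) n = n.hi;
            else if (i < key.length() - 1) { n = n.eq; i++; }
            else return n;
        }
        return null;
    }

    /** In-order walk: lo subtree, this character and its equal subtree, then hi subtree. */
    private void collect(Node<V> n, StringBuilder sb, List<String> out) {
        if (n == null) return;
        collect(n.lo, sb, out);
        sb.append(n.c);
        if (n.value != null) out.add(sb.toString());
        collect(n.eq, sb, out);
        sb.setLength(sb.length() - 1);
        collect(n.hi, sb, out);
    }
}
```

With keys such as refs/heads/main, refs/heads/dev and refs/tags/v1, a search for the prefix refs/heads/ only walks the equal-kid subtree below the final "/", which is the property that makes prefix lookups of refs cheap.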
kylezhao 414bfe05ff CommitGraph: teach ObjectReader to get commit-graph
FileRepository's ObjectReader#getCommitGraph will return commit-graph
when it exists and core.commitGraph is true.

DfsRepository is not supported currently.

Change-Id: I992d43d104cf542797e6949470e95e56de025107
Signed-off-by: kylezhao <kylezhao@tencent.com>
2023-01-04 14:50:38 +08:00
Ivan Frade 93ac99b52a Merge "CommitGraph: add commit-graph for FileObjectDatabase" 2023-01-03 14:56:53 -05:00
Thomas Wolf 9a6d602488 PatchApplier: fix handling of last newline in text patch
If the last line came from the patch, use the patch to determine whether
or not there should be a trailing newline. Otherwise use the old text.

Add test cases for
- no newline at end, last line not in patch hunk
- no newline at end, last line in patch hunk
- patch removing the last newline
- patch adding a newline at the end of file not having one

all for core.autocrlf false, true, and input.

Add a test case where the "no newline" indicator line is not the last
line of the last hunk. This can happen if the patch ends with removals
at the file end.

Bug: 581234
Change-Id: I09d079b51479b89400ad300d0662c1dcb50deab6
Also-by: Yuriy Mitrofanov <a2terminator@mail.ru>
Signed-off-by: Thomas Wolf <twolf@apache.org>
2022-12-26 11:51:25 +01:00
kylezhao 8a7348df69 CommitGraph: add commit-graph for FileObjectDatabase
This change enables JGit to read the .git/objects/info/commit-graph
file and obtain a CommitGraph from it.

Loading a new commit-graph into memory requires additional time. In
testing, loading a copy of Linux's commit-graph (1039139 commits)
took under 50ms.

Bug: 574368
Change-Id: Iadfdd6ed437945d3cdfdbe988cf541198140a8bf
Signed-off-by: kylezhao <kylezhao@tencent.com>
2022-12-23 13:06:06 +08:00
Thomas Wolf aeb74f63d4 Reformat PatchApplier and PatchApplierTest
Some lines were too long, there were unnecessary fully qualified class
names, and an assertEquals(actual, expected) should have been
assertEquals(expected, actual).

Change-Id: I3b3c46c963afe2fb82a79c1e93970e73778877e5
Signed-off-by: Thomas Wolf <twolf@apache.org>
2022-12-22 23:30:02 +01:00
Anna Papitto 9b7c3ac11f IO#readFully: provide overload that fills the full array
IO#readFully is often called with the intent to fill the destination
array from beginning to end. The redundant arguments for where to start
and stop filling are opportunities for bugs if specified incorrectly or
if not changed to match a changed array length.

Provide an overloaded method for filling the full destination array.

Change-Id: I964f18f4a061189cce1ca00ff0258669277ff499
Signed-off-by: Anna Papitto <annapapitto@google.com>
2022-12-19 10:26:41 -08:00
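
The idea behind the overload can be sketched like this. The class ReadFullySketch is hypothetical and only mirrors the shape of the change described above; it is not the code of JGit's IO class.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Sketch of a "fill the whole array" overload (hypothetical class, not JGit's IO). */
final class ReadFullySketch {
    /** Reads exactly len bytes into dst starting at off, or throws EOFException. */
    static void readFully(InputStream in, byte[] dst, int off, int len) throws IOException {
        while (len > 0) {
            int n = in.read(dst, off, len);
            if (n <= 0) throw new EOFException("stream ended before the array was filled");
            off += n;
            len -= n;
        }
    }

    /** Overload without offset and length: always fills dst from beginning to end. */
    static void readFully(InputStream in, byte[] dst) throws IOException {
        readFully(in, dst, 0, dst.length);
    }
}
```

Callers of the two-argument form cannot pass a stale length after resizing the array, which is the class of bug the commit message points at.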
kylezhao b082c58e0f GC: Write commit-graph files when gc
If 'core.commitGraph' and 'gc.writeCommitGraph' are both true, then gc
will rewrite the commit-graph file when 'git gc' is run. Defaults to
false while the commit-graph feature matures.

Bug: 574368
Change-Id: Ic94cd69034c524285c938414610f2e152198e06e
Signed-off-by: kylezhao <kylezhao@tencent.com>
2022-12-16 11:11:45 -05:00
kylezhao 7016e2ddae CommitGraph: add core.commitGraph config
Change-Id: I3b5e735ebafba09ca18fd83da479c7950fa3ea8d
Signed-off-by: kylezhao <kylezhao@tencent.com>
2022-12-16 10:21:09 -05:00
Ivan Frade 6ea36794d1 Merge "Gc#deleteOrphans: avoid dependence on PackExt alphabetical ordering" 2022-12-16 08:20:24 -05:00
kylezhao 7b0f633b67 CommitGraph: implement commit-graph read
Git introduced a new file storing the topology and some metadata of
the commits in the repo (commitGraph). With this data, git can browse
commit history without parsing the pack, speeding up e.g.
reachability checks.

This change teaches JGit to read a commit-graph-format file, following
the upstream format ([1]).

JGit can read a commit-graph file from a buffered stream, which means
that we can provide this feature for both FileRepository and
DfsRepository.

[1] https://git-scm.com/docs/commit-graph-format/2.21.0

Bug: 574368
Change-Id: Ib5c0d6678cb242870a0f5841bd413ad3885e95f6
Signed-off-by: kylezhao <kylezhao@tencent.com>
2022-12-16 06:57:06 -05:00
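
Reading from a stream, as the commit message above describes, starts with the fixed 8-byte file header. The sketch below parses only that header and is not JGit's reader; the class CommitGraphHeaderSketch and its field names are hypothetical. It assumes the layout from the referenced format document: the 4-byte signature "CGPH", then one byte each for the file version, the hash version and the chunk count, followed by one byte that the 2.21.0 document marks as reserved.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Parses the 8-byte commit-graph header from a stream (hypothetical sketch). */
final class CommitGraphHeaderSketch {
    final int version;     // file format version
    final int hashVersion; // 1 means SHA-1 in the referenced format
    final int chunkCount;  // number of chunks listed in the chunk lookup table
    final int lastByte;    // reserved in the 2.21.0 format description

    private CommitGraphHeaderSketch(int version, int hashVersion, int chunkCount, int lastByte) {
        this.version = version;
        this.hashVersion = hashVersion;
        this.chunkCount = chunkCount;
        this.lastByte = lastByte;
    }

    static CommitGraphHeaderSketch read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] signature = new byte[4];
        data.readFully(signature); // throws EOFException if the stream is truncated
        if (!"CGPH".equals(new String(signature, StandardCharsets.US_ASCII))) {
            throw new IOException("not a commit-graph file");
        }
        int version = data.readUnsignedByte();
        int hashVersion = data.readUnsignedByte();
        int chunkCount = data.readUnsignedByte();
        int lastByte = data.readUnsignedByte();
        return new CommitGraphHeaderSketch(version, hashVersion, chunkCount, lastByte);
    }
}
```

Because the parser only needs an InputStream, the same code path works whether the bytes come from a local file or from another storage backend.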
Anna Papitto 5c6c374ff6 Gc#deleteOrphans: avoid dependence on PackExt alphabetical ordering
Deleting orphan files depends on .pack and .keep files being
reverse-sorted ahead of the corresponding index files that could be
orphans. The new reverse index file extension (.rev) will break that
fragile dependency.

Rewrite Gc#deleteOrphans to avoid that dependence by tracking which pack
names have a .pack or .keep file and then deleting any index files that
lack a corresponding one. This approach takes linear time instead of
the O(n log n) time needed for sorting.

Change-Id: If83c378ea070b8871d4b01ae008e7bf8270de763
Signed-off-by: Anna Papitto <annapapitto@google.com>
2022-12-15 11:54:11 -08:00
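
The two-pass approach described above can be sketched on plain file names. This is an illustration only, not the code of Gc#deleteOrphans: the class OrphanSketch is hypothetical, it returns the orphan names instead of deleting files, and treating .idx, .bitmap and .rev as the index-like extensions is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Order-independent orphan detection on file names (hypothetical sketch). */
final class OrphanSketch {
    /** Returns index-like files whose base name has neither a .pack nor a .keep file. */
    static List<String> findOrphans(List<String> fileNames) {
        // Pass 1: remember every base name that has a .pack or .keep file.
        Set<String> live = new HashSet<>();
        for (String name : fileNames) {
            if (name.endsWith(".pack") || name.endsWith(".keep")) {
                live.add(baseName(name));
            }
        }
        // Pass 2: an index-like file without such a base name is an orphan.
        List<String> orphans = new ArrayList<>();
        for (String name : fileNames) {
            boolean indexLike = name.endsWith(".idx")
                    || name.endsWith(".bitmap")
                    || name.endsWith(".rev");
            if (indexLike && !live.contains(baseName(name))) {
                orphans.add(name);
            }
        }
        return orphans;
    }

    private static String baseName(String name) {
        return name.substring(0, name.lastIndexOf('.'));
    }
}
```

Both passes are single scans with constant-time set operations on average, so the result does not depend on the order in which the directory listing returns the files.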
Matthias Sohn ebc1f7d65c commitgraph package: fix exports/imports, add @since tag for new API
Change-Id: I9175b1d796f91f5ba4e21d3418550ae451c054b0
2022-12-08 02:00:58 +01:00
kylezhao cf70e7cbe4 CommitGraph: implement commit-graph writer
Teach JGit to write a commit-graph formatted file by walking commit
graph from specified commit objects.

See: https://git-scm.com/docs/commit-graph-format/2.21.0

Bug: 574368
Change-Id: I34f9f28f8729080c275f86215ebf30b2d05af41d
Signed-off-by: kylezhao <kylezhao@tencent.com>
2022-12-06 20:34:46 +08:00
Han-Wen NIenhuys 1d5a6c77a6 Merge "Fix crashes on rare combination of file names" 2022-11-28 09:34:46 -05:00
Matthias Sohn 2e28f27c26 Prepare 6.5.0-SNAPSHOT builds
Change-Id: I4238b6181e96e22e540cf34802a332f868cb6dfb
2022-11-23 19:09:33 +01:00
Matthias Sohn bde09c185f UploadPackTest: ensure UploadPack is closed to fix resource leak
Change-Id: I4c8cf6041b4011934d338138d4531d190fdd6abb
2022-11-21 00:08:42 +01:00
Matthias Sohn cddff2b7fd Fix warnings in PatchApplierTest
- don't use final for method parameters
- fix hiding member warnings

Change-Id: I73c386f669918d3291ee3380024c018483aa3c97
2022-11-20 20:41:20 +01:00
Matthias Sohn 143188e831 Fix boxing warnings in TransportTest
Change-Id: I7e6dc845b89899cff262fab77c3977dbef5eea02
2022-11-20 20:34:57 +01:00
Matthias Sohn f849b8d0d9 Silence warnings about unclosed BasePackPushConnection
Change-Id: If52e8462e6222dd58d1004dd5ac174a27d96d098
2022-11-20 20:32:37 +01:00
Anna Papitto accacc27a1 DfsStreamKey: Replace ForReverseIndex to separate metrics.
Keys used for identifying reverse indexes in the DfsBlockCache use a
custom subclass ForReverseIndex because there was no PackExt for them.
This conflates BlockCacheMetrics for reverse indexes with those for
packs, since the key falls back onto 0 when there is no extension.

Replace the custom ForReverseIndex with a DfsStreamKey usage to bring
keys for the new REVERSE_INDEX extension in line with INDEX and BITMAP
and separate reverse index and pack BlockCacheMetrics.

Change-Id: I305e2c16d2a8cb2a824855ea92e0c9a9b188fce5
Signed-off-by: Anna Papitto <annapapitto@google.com>
2022-11-17 12:38:36 -05:00
Thomas Wolf 1c886d92f6 RawText.isBinary(): handle complete buffer correctly
Make sure we always get consistent results, whether or not we have the
full data in the buffer.

Change-Id: Ieb379a0c375ad3dd352e63ac2f23bda6ef16c215
Signed-off-by: Thomas Wolf <twolf@apache.org>
2022-11-16 15:17:19 -05:00
Matthias Sohn 0fb9d26eff Merge branch 'stable-6.3'
* stable-6.3:
  [benchmarks] Remove profiler configuration
  Add SHA1 benchmark
  [benchmarks] Set version of maven-compiler-plugin to 3.8.1
  Fix running JMH benchmarks
  Add option to allow using JDK's SHA1 implementation
  Fix API breakage caused by extracting WorkTreeUpdater
  Extract Exception -> HTTP status code mapping for reuse
  Don't handle internal git errors as an HTTP error
  Ignore IllegalStateException if JVM is already shutting down
  Allow to perform PackedBatchRefUpdate without locking loose refs

Change-Id: Ib58879be292c54a2a7f4936ac0986997985c822b
2022-11-16 10:15:30 +01:00
Matthias Sohn f3e0e9d5a3 Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
  [benchmarks] Remove profiler configuration
  Add SHA1 benchmark
  [benchmarks] Set version of maven-compiler-plugin to 3.8.1
  Fix running JMH benchmarks
  Add option to allow using JDK's SHA1 implementation
  Ignore IllegalStateException if JVM is already shutting down

Change-Id: I9c1576011c11b4ff8f453d18d9e786cee59860fa
2022-11-16 09:56:08 +01:00
Matthias Sohn d588c2c9ad Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
  [benchmarks] Remove profiler configuration
  Add SHA1 benchmark
  [benchmarks] Set version of maven-compiler-plugin to 3.8.1
  Fix running JMH benchmarks
  Add option to allow using JDK's SHA1 implementation
  Ignore IllegalStateException if JVM is already shutting down

Change-Id: Ie433c46a01a0f33848d54ecf99b30a44ca01e286
2022-11-16 09:55:22 +01:00
Matthias Sohn 7f36943d0c Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  [benchmarks] Remove profiler configuration
  Add SHA1 benchmark
  [benchmarks] Set version of maven-compiler-plugin to 3.8.1
  Fix running JMH benchmarks
  Add option to allow using JDK's SHA1 implementation
  Ignore IllegalStateException if JVM is already shutting down

Change-Id: I176419026c3f4fdd8ebd34c61468c1ec3482ff45
2022-11-16 09:54:28 +01:00
Matthias Sohn f1909615d3 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  [benchmarks] Remove profiler configuration
  Add SHA1 benchmark
  [benchmarks] Set version of maven-compiler-plugin to 3.8.1
  Fix running JMH benchmarks
  Add option to allow using JDK's SHA1 implementation
  Ignore IllegalStateException if JVM is already shutting down

Change-Id: I40105336f0b9e593a8a2c242a9557f854c274fdc
2022-11-16 00:15:17 +01:00
Matthias Sohn 59029aec30 Add option to allow using JDK's SHA1 implementation
The change If6da9833 moved the computation of SHA1 from the JVM's
JCE to a pure Java implementation with collision detection.
The extra security for public sites comes with a cost of slower
SHA1 processing compared to the native implementation in the JDK.

When JGit is used internally and not exposed to any traffic from
external or untrusted users, the extra cost of the pure Java SHA1
implementation can be avoided, falling back to the previous
native MessageDigest implementation.

Bug: 580310
Change-Id: Ic24c0ba1cb0fb6282b8ca3025ffbffa84035565e
2022-11-15 23:08:13 +01:00
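
For reference, the JDK path that the commit above allows falling back to looks roughly like the sketch below. It only shows hashing through java.security.MessageDigest; the class JdkSha1Sketch is hypothetical, and how JGit selects between the two implementations is not shown here because the commit message does not describe it.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** SHA-1 via the JDK's MessageDigest, without collision detection (hypothetical sketch). */
final class JdkSha1Sketch {
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so each byte prints as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available in this JVM", e);
        }
    }
}
```

As the commit message notes, this trade-off only makes sense when no untrusted input reaches the repository, since the JDK digest performs no collision detection.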
Matthias Sohn f288de7490 Update Orbit to S20221109014815
and update
- com.sun.jna to 5.12.1.v20221103-2317
- com.sun.jna.platform to 5.12.1.v20221103-2317
- org.bouncycastle.bcpg to 1.72.0.v20221013-1810
- org.bouncycastle.bcpkix to 1.72.0.v20221013-1810
- org.bouncycastle.bcprov to 1.72.0.v20221013-1810
- org.bouncycastle.bcutil to 1.72.0.v20221013-1810
- org.mockito.mockito-core to 4.8.1.v20221103-2317
- org.objenesis to 3.3.0.v20221103-2317

Change-Id: If00094d23e51d5f66928f83c1334aa6b18b98dfe
2022-11-14 17:12:20 +01:00
Dmitrii Filippov 1e04046a6d Fix crashes on rare combination of file names
The NameConflictTreeWalk class is used in merge for iterating over
entries in commits. The class uses a separate iterator for each
commit's tree. In rare cases it can incorrectly report the same entry
twice. As a result, duplicated entries are added to the merge result
and later JGit throws an exception when it tries to process the merge
result.

The problem appears only when there is a directory-file conflict for
the last item in trees. Example from the bug:
Commit 1:
* subtree - file
* subtree-0 - file
Commit 2:
* subtree - directory
* subtree-0 - file
Here the names are ordered like this:
"subtree" file < "subtree-0" file < "subtree" directory.

The NameConflictTreeWalk handles similar cases correctly if there are
other files after subtree... in commits - this is processed in the
AbstractTreeIterator.min function. Existing code has a special
optimization for the case when all trees point to the same
entry name - it skips additional checks. However, this optimization
incorrectly skips checks if one of the trees has reached the end.

The fix handles the situation where some trees have reached the end
while others still point to an entry.

bug: 535919
Change-Id: I62fde3dd89779fac282479c093400448b4ac5c86
2022-11-03 14:09:56 -04:00
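
The ordering quoted in the commit above follows from how Git sorts tree entries: names are compared byte by byte, and a directory name is compared as if it ended in "/". Since "-" (0x2D) sorts before "/" (0x2F), the file "subtree-0" lands between the file "subtree" and the directory "subtree". A minimal sketch of that comparison, with a hypothetical class name and not JGit's own comparator:

```java
import java.nio.charset.StandardCharsets;

/** Git-style tree entry ordering (hypothetical sketch). */
final class TreeOrderSketch {
    /** Negative if entry A sorts before entry B, positive if after, zero if equal. */
    static int compare(String nameA, boolean aIsDir, String nameB, boolean bIsDir) {
        byte[] a = nameA.getBytes(StandardCharsets.UTF_8);
        byte[] b = nameB.getBytes(StandardCharsets.UTF_8);
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        // Past the common prefix: a name that has ended counts as '/' for a
        // directory and as 0 for a file.
        int nextA = a.length > common ? (a[common] & 0xff) : (aIsDir ? '/' : 0);
        int nextB = b.length > common ? (b[common] & 0xff) : (bIsDir ? '/' : 0);
        return nextA - nextB;
    }
}
```

This is why two trees can disagree about where "subtree" belongs relative to "subtree-0", depending on whether it is a file or a directory in each of them.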
Josh Brown fe9aeb02e6 UploadPack: Receive and parse client session-id
Before this change JGit did not support the session-id capability
implemented by native Git in UploadPack. This change implements
advertising the capability from the server and parsing the session-id
received from the client during an UploadPack operation.

Enable the transfer.advertisesid config setting to advertise the
capability from the server. The client may send a session-id capability
in response. If received, the value from this is parsed and available
via the getClientSID method on the UploadPack object.

This change does not add the capability to send a session-id from the
JGit client.

https://git-scm.com/docs/gitprotocol-capabilities#_session_idsession_id

Change-Id: Ib1b6929ff1b3a4528e767925b5e5c44b5d18182f
Signed-off-by: Josh Brown <sjoshbrown@google.com>
2022-11-02 16:13:22 -04:00
Josh Brown 7b0a71a5e9 TransferConfig: Move reading advertisesid setting into TransferConfig
The config setting to enable advertising the session-id capability is
currently read in the ReceivePack class. This change moves it to a
common location in the TransferConfig class so that it can be reused
in other places like UploadPack. TransferConfig is also a more logical
place for the setting as it resides in the `transfer` config section.

Set the transfer.advertisesid setting to true to send the session-id
capability to the client.

Change-Id: If68ecb5e68b59f5c452a7992d02e3688b0a86747
Signed-off-by: Josh Brown <sjoshbrown@google.com>
2022-11-02 16:13:08 -04:00
Josh Brown e8068188f1 FirstWant: Parse client session-id if received.
In protocol V0 the client capabilities are appended to the first line.
Parsing session-id is currently only supported during a ReceivePack
operation. This change will parse the client session-id capability if
it has been sent by the client.

If the server sends the session-id capability to the client, the client
may respond with a session ID of its own. FirstWant.fromLine will now
parse the ID and make it available via the getClientSID method.

This change does not add support to send the session-id capability from
the server. The change is necessary to support session-id in UploadPack.

Change-Id: Id3fe44fdf9a72984ee3de9cf40cc4e71d434df4a
Signed-off-by: Josh Brown <sjoshbrown@google.com>
2022-11-02 20:12:03 +00:00
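
In protocol V0 the first want line has the form "want <object-id> <capability> <capability> ...", so the client session ID can be picked out of the trailing capability list. The following is a rough sketch with a hypothetical class and method; it is not the code of FirstWant.fromLine and does no validation of the object ID.

```java
/** Extracts a client session-id from a protocol V0 first want line (hypothetical sketch). */
final class FirstWantSketch {
    private static final String SESSION_ID_PREFIX = "session-id=";

    /** Returns the client session ID, or null if the line carries none. */
    static String clientSessionId(String firstWantLine) {
        String[] parts = firstWantLine.trim().split(" ");
        // parts[0] is "want", parts[1] is the object id, the rest are capabilities.
        for (int i = 2; i < parts.length; i++) {
            if (parts[i].startsWith(SESSION_ID_PREFIX)) {
                return parts[i].substring(SESSION_ID_PREFIX.length());
            }
        }
        return null;
    }
}
```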
Josh Brown 93097f0018 ReceivePack: Receive and parse client session-id.
Before this change JGit did not support the session-id capability
implemented by native Git. This change implements advertising the
capability from the server and parsing the session-id received from
the client during a ReceivePack operation.

Enable the transfer.advertisesid config setting to advertise the
capability from the server. The client may send a session-id capability
in response. If received, the value from this is parsed and available
via the getClientSID method on the ReceivePack object. All capabilities
in the form `capability=value` are now split into key value pairs at the
first `=` character. This change replaces specific handling for the 
agent capability.

This change does not add advertisement or parsing to UploadPack. This
change also does not add the ability to send a session ID from the JGit
client.

https://git-scm.com/docs/protocol-v2/2.33.0#_session_idsession_id

Change-Id: I56fb115e843b11b27e128c4ac427b05d5ec129d0
Signed-off-by: Josh Brown <sjoshbrown@google.com>
2022-10-27 16:17:50 -04:00
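
Splitting `capability=value` at the first `=` character, as described in the commit above, can be sketched as follows. The class CapabilitySketch is hypothetical and is not the ReceivePack implementation; representing a capability without a value by a null value is a choice made for this example.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

/** Splits one capability token into key and value (hypothetical sketch). */
final class CapabilitySketch {
    static Map.Entry<String, String> split(String capability) {
        int eq = capability.indexOf('=');
        if (eq < 0) {
            // Plain capability such as "report-status": no value part.
            return new SimpleImmutableEntry<>(capability, null);
        }
        // Only the first '=' separates key and value; later ones belong to the value.
        return new SimpleImmutableEntry<>(capability.substring(0, eq), capability.substring(eq + 1));
    }
}
```

Handling every `key=value` capability through one generic split means that agent and session-id need no separate parsing code.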
kylezhao ad9c217f49 PushCommand: allow users to disable use of bitmaps for push
Reachability bitmaps are designed to speed up the "counting objects"
phase of generating a pack during a clone or fetch. They are not
optimized for Git clients sending a small topic branch via "git push".
In some cases (see [1]), using reachability bitmaps during "git push"
can cause significant performance regressions.

Add PushCommand#setUseBitmaps(boolean) to allow users to tell "git push"
not to use bitmaps.

[1]: https://lore.kernel.org/git/87zhoz8b9o.fsf@evledraar.gmail.com/

Change-Id: I7fb7d26084ec63ddfa7249cf58abb85929b30e56
Signed-off-by: kylezhao <kylezhao@tencent.com>
2022-10-21 08:11:33 +02:00
Thomas Wolf 71af0d6a5c I/O redirection for the pre-push hook
Fix and complete the implementation of calling the pre-push hook.
Add the missing error stream redirect, and add the missing setters
in Transport and in PushCommand. In Transport, delay setting up a
PrePushHook such that it happens only on a push. Previously, the
hook was set up also for fetches.

Bug: 549246
Change-Id: I64a576dfc6b139426f05d9ea6654027ab805734e
Signed-off-by: Thomas Wolf <twolf@apache.org>
2022-10-20 23:34:56 +02:00
Ivan Frade 96236fdcb5 PackParser: populate full size of the PackedObjectInfos
We need the full size of the objects to populate the object-size index
of a pack. This size is not always the one encoded in the object header
in the pack (e.g. for deltas).

Populate the full size of PackedObjectInfos in the PackParser, which is
invoked when receiving a pack e.g. in a push.

Change-Id: I102c20901aefb5e85047e2e526c0d733f82ff74b
2022-10-18 11:19:21 -07:00
Thomas Wolf f71fcbf36b CloneCommand: set HEAD also when not checking out
CloneCommand, when setNoCheckout(true) was set, did not set HEAD.
With C git, "git clone --no-checkout" does.

Change-Id: Ief3df7e904ce90829a6345a6c3e9ee6a68486ab0
Signed-off-by: Thomas Wolf <twolf@apache.org>
2022-09-18 19:43:40 +02:00
Han-Wen NIenhuys 21a497843c Merge "Split out ApplyCommand logic to PatchApplier class" 2022-09-15 04:16:48 -04:00
Nitzan Gur-Furman acde6c8f5b Split out ApplyCommand logic to PatchApplier class
PatchApplier now routes updates through the index. This has two
results:

* we can now execute patches in-memory.

* the JGit apply command will now always update the
index to match the working tree.

Change-Id: Id60a88232f05d0367787d038d2518c670cdb543f
Co-authored-by: Han-Wen Nienhuys <hanwen@google.com>
Co-authored-by: Nitzan Gur-Furman <nitzan@google.com>
2022-09-15 09:15:55 +02:00
Matthias Sohn 85182df267 Prepare 6.4.0-SNAPSHOT builds
Change-Id: I47ca5f1d0263caa0bfc7c303042360c6c5ac4dec
2022-09-14 13:56:40 +02:00
Matthias Sohn c7df1addf6 Merge branch 'stable-6.3'
* stable-6.3:
  Prepare 6.3.1-SNAPSHOT builds
  JGit v6.3.0.202209071007-r
  JGit v6.3.0.2022009070944-r
  [merge] Fix merge conflicts with symlinks
  Update DEPENDENCIES for 6.3.0
  Update tycho to 2.7.5
  Revert "Adds FilteredRevCommit that can overwrites its parents in the DAG."
  Revert "Option to pass start RevCommit to be blamed on to the BlameGenerator."
  Prepare 6.3.0-SNAPSHOT builds
  JGit v6.3.0.202208161710-m3

Change-Id: Ia9430fb516dca795e25064a190704b70689af364
2022-09-12 10:51:37 +02:00
Matthias Sohn fb377b09eb Prepare 6.3.1-SNAPSHOT builds
Change-Id: I44e159eca4131880d74d3078060e7e20f9b5ce76
2022-09-12 10:09:10 +02:00
yunjieli e7bffdfc48 DfsBundleWriter: Add test case about GC_REST pack.
Add a test case to make sure that the bundle writer writes objects in
GC_REST packs as well.

Signed-off-by: Yunjie Li <yunjieli@google.com>
Change-Id: Iba4d88c573aa1cda4505afbe2b83581a09a343df
2022-09-07 10:18:16 -07:00
Matthias Sohn 68e8ecc91b JGit v6.3.0.202209071007-r
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: Iea3fae9f6f6c5fb0a79f7684334a3e0059738c4f
2022-09-07 16:07:11 +02:00
Matthias Sohn f8104e25f1 JGit v6.3.0.2022009070944-r
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I3cc78dbcf8c7970e80bf1499751611110ec2b30b
2022-09-07 15:39:48 +02:00
Thomas Wolf a8e683fef6 [merge] Fix merge conflicts with symlinks
Previous code would do a content merge on symlinks, and write the merge
result to the working tree as a file. C git doesn't do this; it leaves
a symlink in the working tree unchanged, or in a delete-modify conflict
it would check out "theirs".

Moreover, previous code would write the merge result to the link target,
not to the link. This would overwrite an existing link target, or fail
if the link pointed to a directory.

In link/file conflicts or file/link conflicts, C git always puts the
file into the working tree.

Change conflict handling accordingly. Add tests for all the conflict
cases.

Bug: 580347
Change-Id: I3cffcb4bcf8e336a85186031fff23f0c4b6ee19d
Signed-off-by: Thomas Wolf <twolf@apache.org>
2022-09-07 15:02:02 +02:00
Matthias Sohn 7c4a5421cc Revert "Adds FilteredRevCommit that can overwrites its parents in the
DAG."

This reverts commit 6297491e8a.

This is done as a quick fix for the failure of egit tests caused by the
introduction of FilteredRevCommit.

Bug: 580690
Change-Id: Ia6b651dd11b0a4b02d5e52247eb4bf13adf94e27
2022-09-06 10:40:26 +02:00
Matthias Sohn ee6334bccf Revert "Option to pass start RevCommit to be blamed on to the
BlameGenerator."

This reverts commit 5747bba48b.

This is done as a quick fix for the failure of egit tests caused by the
introduction of FilteredRevCommit.

Bug: 580690
Change-Id: Ia0178bc2de4fc825a81207bbd7979bf3a386c955
2022-09-06 10:40:26 +02:00