Commit Graph

8742 Commits

Author SHA1 Message Date
Matthias Sohn da21265a14 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Shortcut during git fetch for avoiding looping through all local refs
  FetchCommand: fix fetchSubmodules to work on a Ref to a blob
  Silence API warnings introduced by I466dcde6
  Allow the exclusions of refs prefixes from bitmap
  PackWriterBitmapPreparer: do not include annotated tags in bitmap
  BatchingProgressMonitor: avoid int overflow when computing percentage
  Speedup GC listing objects referenced from reflogs
  FileSnapshotTest: Add more MISSING_FILE coverage

Change-Id: I58ad4c210a5e7e5a1ba6b22315b04211c8909950
2023-02-01 00:33:20 +01:00
Luca Milanesio 21e902dd7f Shortcut during git fetch for avoiding looping through all local refs
The FetchProcess needs to verify that all the refs received point
to objects that are reachable from the local refs, which could be
very expensive but is needed to avoid missing objects exceptions
because of broken chains.

When the local repository has a lot of refs (e.g. millions) and the
client is fetching a non-commit object (e.g. refs/sequences/changes in
Gerrit) the reachability check on all local refs can be very expensive
compared to the time to fetch the remote ref.

Example for a 2M refs repository:
- fetching a single non-commit object: 50ms
- checking the reachability of local refs: 30s

A ref pointing to a non-commit object doesn't have any parent or
successor objects, hence would never need to have a reachability check
done. Skipping the askForIsComplete() altogether would save the 30s
time spent in an unnecessary phase.

Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Change-Id: I09ac66ded45cede199ba30f9e71cc1055f00941b
2023-02-01 00:07:45 +01:00
Matthias Sohn 7650832002 FetchCommand: fix fetchSubmodules to work on a Ref to a blob
FetchCommand#fetchSubmodules assumed that FETCH_HEAD can always be
parsed as a tree. This isn't true if it refers to a Ref referring to a
BLOB. This is e.g. used in Gerrit for Refs like refs/sequences/changes
which are used to implement sequences stored in git.

Change-Id: I414f5b7d9f2184b2d7d53af1dfcd68cccb725ca4
2023-01-31 23:52:20 +01:00
Matthias Sohn 8040936f8a Silence API warnings introduced by I466dcde6
Change-Id: I510510da34d33757c2f83af8cd1e26f6206a486a
2023-01-31 23:45:07 +01:00
Luca Milanesio ad977f1572 Allow the exclusions of refs prefixes from bitmap
When running a GC.repack() against a repository with over one
thousands of refs/heads and tens of millions of ObjectIds,
the calculation of all bitmaps associated with all the refs
would result in an unreasonable big file that would take up to
several hours to compute.

Test scenario: repo with 2500 heads / 10M obj Intel Xeon E5-2680 2.5GHz
Before this change: 20 mins
After this change and 2300 heads excluded: 10 mins (90s for bitmap)

Having such a large bitmap file is also slow in the runtime
processing and have negligible or even negative benefits, because
the time lost in reading and decompressing the bitmap in memory
would not be compensated by the time saved by using it.

It is key to preserve the bitmaps for those refs that are mostly
used in clone/fetch and give the ability to exlude some refs
prefixes that are known to be less frequently accessed, even
though they may actually be actively written.

Example: Gerrit sandbox branches may even be actively
used and selected automatically because its commits are very
recent, however, they may bloat the bitmap, making it ineffective.

A mono-repo with tens of thousands of developers may have
a relatively small number of active branches where the
CI/CD jobs are continuously fetching/cloning the code. However,
because Gerrit allows the use of sandbox branches, the
total number of refs/heads may be even tens to hundred
thousands.

Change-Id: I466dcde69fa008e7f7785735c977f6e150e3b644
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
2023-01-31 17:14:09 -05:00
Luca Milanesio e4529cd39c PackWriterBitmapPreparer: do not include annotated tags in bitmap
The annotated tags should be excluded from the bitmap associated
with the heads-only packfile. However, this was not happening
because of the check of exclusion of the peeled object instead
of the objectId to be excluded from the bitmap.

Sample use-case:

refs/heads/main
  ^
  |
 commit1 <-- commit2 <- annotated-tag1 <- tag1
  ^
  |
 commit0

When creating a bitmap for the above commit graph, before this
change all the commits are included (3 bitmaps), which is
incorrect, because all commits reachable from annotated tags
should not be included.

The heads-only bitmap should include only commit0 and commit1
but because PackWriterBitPreparer was checking for the peeled
pointer of tag1 to be excluded (commit2) which was not found in
the list of tags to exclude (annotated-tag1), the commit2 was
included, even if it wasn't reachable only from the head.

Add an additional check for exclusion of the original objectId
for allowing the exclusion of annotated tags and their pointed
commits. Add one specific test associated with an annotated tag
for making sure that this use-case is covered also.

Example repository benchmark for measuring the improvement:
# refs: 400k (2k heads, 88k tags, 310k changes)
# objects: 11M (88k of them are annotate tags)
# packfiles: 2.7G

Before this change:
GC time: 5h
clone --bare time: 7 mins

After this change:
GC time: 20 mins
clone --bare time: 3 mins

Bug: 581267
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Change-Id: Iff2bfc6587153001837220189a120ead9ac649dc
2023-01-31 14:15:56 +01:00
Matthias Sohn 611412a055 BatchingProgressMonitor: avoid int overflow when computing percentage
When cloning huge repositories I observed percentage of object counts
turning negative. This happened if lastWork * 100 exceeded
Integer.MAX_VALUE.

Change-Id: Ic5f5cf5a911a91338267aace4daba4b873ab3900
2023-01-31 14:15:53 +01:00
Matthias Sohn cd3fc7a299 Speedup GC listing objects referenced from reflogs
GC needs to get a ReflogReader for all existing refs to list all objects
referenced from reflogs. The existing Repository#getReflogReader method
accepts the ref name and then resolves the Ref to create a ReflogReader.
GC calling that for a huge number of Refs one by one is very slow. GC
first gets all Refs in bulk and then calls getReflogReader for each of
them.

Fix this by adding another getReflogReader method to Repository which
accepts a Ref directly.

This speeds up running JGit gc on a mirror clone of the Gerrit
repository from 15:36 min to 1:08 min. The repository used in this test
had 45k refs, 275k commits and 1.2m git objects.

Change-Id: I474897fdc6652923e35d461c065a29f54d9949f4
2023-01-23 17:19:14 +01:00
Nasser Grainawi 2011fe06d2 FileSnapshotTest: Add more MISSING_FILE coverage
Add a couple tests that confirm what the docs say about isModified() and
equals(MISSING_FILE) behavior.

Change-Id: I6093040ba3594934c3270331405a44b2634b97c5
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
2023-01-06 14:23:12 -07:00
Matthias Sohn 41b33a16b8 Silence API errors
Change-Id: Ie112b2099ea2125bc85863524e56f09ba4907373
2022-11-20 19:55:22 +01:00
Matthias Sohn 12f48276bd Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Silence API warnings

Change-Id: If5ab988a0e177c37b125e0b10625e506eeb2a74f
2022-11-20 19:54:44 +01:00
Matthias Sohn aa9f736c33 Silence API warnings
introduced by
- addition of configurable SHA1 implementation in 5.13.2
- 3-digit @since 5.9.1 annotations on GitServlet methods

Change-Id: If19853fcc5e3677e5b18e8e3fbbcd2773378dffc
2022-11-20 19:45:54 +01:00
Matthias Sohn f1909615d3 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  [benchmarks] Remove profiler configuration
  Add SHA1 benchmark
  [benchmarks] Set version of maven-compiler-plugin to 3.8.1
  Fix running JMH benchmarks
  Add option to allow using JDK's SHA1 implementation
  Ignore IllegalStateException if JVM is already shutting down

Change-Id: I40105336f0b9e593a8a2c242a9557f854c274fdc
2022-11-16 00:15:17 +01:00
Matthias Sohn 48316309b1 [benchmarks] Remove profiler configuration
Profiler configuration can be added when required. It was commented out
in most benchmarks.

Change-Id: I725f98757f7d4d2ba2589658e34e2fd6fbbbedee
2022-11-15 23:08:20 +01:00
Matthias Sohn c20e9676c4 Add SHA1 benchmark
Results on a Mac M1 max:

    size     SHA1Native SHA1Java    SHA1Java
                        without     with
                        collision   collision
                        detection   detection
    [kB]     [us/op]    [us/op]     [us/op]
---------------------------------------------
      1       3.662       4.200       4.707
      2       7.053       7.868       8.928
      4      13.883      15.149      17.608
      8      27.225      30.049      35.237
     16      54.014      59.655      70.867
     32     106.457     118.022     140.403
     64     212.712     237.702     281.522
   1024    3469.519    3868.883    4637.287
 131072  445011.724  501751.992  604061.308
1048576 3581702.104 4008087.854 4831023.563

The last 3 sizes (1, 128, 1024 MB) weren't committed
here to limit the total runtime.

Bug: 580310
Change-Id: I7d0382fd4aa4c4734806b12e96b671bee37d26e3
2022-11-15 23:08:19 +01:00
Matthias Sohn 4fac796588 [benchmarks] Set version of maven-compiler-plugin to 3.8.1
Change-Id: Ib14db133c76a55358ea79663ef38d9fb47a67f45
2022-11-15 23:08:19 +01:00
Matthias Sohn 8d53a1433d Fix running JMH benchmarks
Without this I sometimes hit the error:

$ java -jar target/benchmarks.jar
Exception in thread "main" java.lang.RuntimeException: ERROR: Unable to
find the resource: /META-INF/BenchmarkList
	at org.openjdk.jmh.runner.AbstractResourceReader.getReaders(AbstractResourceReader.java:98)
	at org.openjdk.jmh.runner.BenchmarkList.find(BenchmarkList.java:124)
	at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:253)
	at org.openjdk.jmh.runner.Runner.run(Runner.java:209)
	at org.openjdk.jmh.Main.main(Main.java:71)

Change-Id: Iea9431d5f332f799d55a8a2d178c79a2ef0da22b
2022-11-15 23:08:19 +01:00
Matthias Sohn 59029aec30 Add option to allow using JDK's SHA1 implementation
The change If6da9833 moved the computation of SHA1 from the JVM's
JCE to a pure Java implementation with collision detection.
The extra security for public sites comes with a cost of slower
SHA1 processing compared to the native implementation in the JDK.

When JGit is used internally and not exposed to any traffic from
external or untrusted users, the extra cost of the pure Java SHA1
implementation can be avoided, falling back to the previous
native MessageDigest implementation.

Bug: 580310
Change-Id: Ic24c0ba1cb0fb6282b8ca3025ffbffa84035565e
2022-11-15 23:08:13 +01:00
Matthias Sohn 924491d4df Ignore IllegalStateException if JVM is already shutting down
Trying to register/unregister a shutdown hook when the JVM is already in
shutdown throws an IllegalStateException. Ignore this exception since we
can't do anything about it.

Bug: 580953
Change-Id: I8fc6fdd5585837c81ad0ebd6944430856556d90e
2022-10-27 20:31:58 +02:00
Matthias Sohn 9f7d77b608 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  UploadPack: don't prematurely terminate timer in case of error
  Do not create reflog for remote tracking branches during clone
  UploadPack: do not check reachability of visible SHA1s
  Add missing package import javax.management to org.eclipse.jgit

Change-Id: I6db0a4d74399fde892eeec62efd2946f97547a5d
2022-07-06 16:59:30 +02:00
Matthias Sohn 035e0e23f2 UploadPack: don't prematurely terminate timer in case of error
In uploadWithExceptionPropagation don't prematurely terminate timer in
case of error to enable reporting it to the client. Expose a close
method so that callers can terminate it at the appropriate time.

If the timer is already terminated when trying to report it to the
client this failed with the error java.lang.IllegalStateException:
"Timer already terminated".

Bug: 579670
Change-Id: I95827442ccb0f9b1ede83630cf7c51cf619c399a
2022-06-30 14:45:31 +02:00
Matthias Sohn ca6b518432 Merge "Do not create reflog for remote tracking branches during clone" into stable-5.13 2022-06-26 15:36:03 -04:00
Luca Milanesio 4bb4693633 Do not create reflog for remote tracking branches during clone
When using JGit on a non-bare repository, the CloneCommand
it previously created local reflogs for all branches including remote
tracking ones, causing the generation of a potentially large
number of files on the local filesystem.

The creation of the remote-tracking branches (refs/remotes/*) during
clone is not an issue for the local filesystem because all of them are
stored in a single packed-refs file. However, the creation of a large
number of ref logs on a local filesystem IS an issue because it
may not be tuned or initialised in term of inodes to contain a very
large number of files.

When a user (or a CI system) performs the CloneCommand against
a potentially large repository (e.g., millions of branches), it is
interested in working or validating a single branch or tag and is
unlikely to work with all the remote-tracking branches.
The eager creation of a reflogs for all the remote-tracking branches is
not just a performance issue but may also compromise the ability to
use JGit for cloning a large repository.

The behaviour implemented in this change is also consistent with the
optimisation done in the C code-base [1].

We differentiate between clone and fetch commands using --branch
<initialBranch> option, that is only available in clone command,
and is set as HEAD per default.

[1] 58f233ce1e

Bug: 579805
Change-Id: I58d0d36a8a4ce42e0f59b8bf063747c4b81bd859
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
2022-06-25 12:09:01 +01:00
Luca Milanesio 66ace4b9af UploadPack: do not check reachability of visible SHA1s
When JGit needs to serve a Git client requesting SHA1s
during the want phase, it needs to make a full reachability
check from the advertised refs to the ones requested to
keep all objects in the correct scope of confidentiality
allowed by the avertised refs.

The check is also performed when the SHA1 corresponds to
one of the tips of the advertised refs which is a waste of
resources.

Example:

fetch> ref-prefix refs/heads/foo
fetch< 900505eb8ce8ced2a1757906da1b25c357b9654e refs/heads/foo
fetch< 0000
fetch> command=fetch
fetch> 0001
fetch> thin-pack
fetch> ofs-delta
fetch> want 900505eb8ce8ced2a1757906da1b25c357b9654e

The SHA1 in the want is the tip of refs/heads/foo and therefore
the full reachability check can be shortened and resolved more
quickly.

Change-Id: I49bd9e2464e0bd3bca2abf14c6e9df550d07383b
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
2022-06-25 06:45:05 -04:00
Matthias Sohn f4cbf31ae4 Add missing package import javax.management to org.eclipse.jgit
Class org.eclipse.jgit.util.Monitoring uses JMX hence we need this
import otherwise OSGi applications can face ClassNotFoundException.

Bug: 577018
Change-Id: Ifd75337b87c7faec95d333b771bb0a2f3e46a418
2022-06-17 13:49:59 +02:00
Matthias Sohn d961bb6502 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Prepare 5.13.2-SNAPSHOT builds
  JGit v5.13.1.202206130422-r
  AmazonS3: Add support for AWS API signature version 4

Change-Id: Ibd663a1d874d1aac274abc3dd44354fd99f64c39
2022-06-15 16:31:38 +02:00
Matthias Sohn db074a1352 Prepare 5.13.2-SNAPSHOT builds
Change-Id: I4862e5d80a7d95a1a119d06306e3f6927445d1d3
2022-06-14 00:41:18 +02:00
Matthias Sohn b34961a493 JGit v5.13.1.202206130422-r
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: Ife74d64e8171c68dbf08271492c0ac852a6dc51c
2022-06-13 10:22:43 +02:00
eric.steele e9a5430c25 AmazonS3: Add support for AWS API signature version 4
Updating the AmazonS3 class to support AWS Signature version 4 because
version 2 is no longer supported in all AWS regions. The version can be
selected with the new 'aws.api.signature.version' property (defaults to
2 for backwards compatibility). When set to '4', the user must also
specify the AWS region via the 'region' property. The 'region' property
must match the region that the 'domain' property resolves to.

Bug: 579907
Change-Id: If289dbc6d0f57323cfeaac2624c4eb5028f78d13
2022-06-13 09:44:23 +02:00
Matthias Sohn a96645a5f3 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Fix connection leak for smart http connections

Change-Id: Ic851f2c4660ed761f5527e405b116b54da42fb7c
2022-06-07 11:35:13 +02:00
Matthias Sohn 5efd32e91d Merge branch 'stable-5.12' into stable-5.13
* stable-5.12:
  Fix connection leak for smart http connections

Change-Id: Id34f29c1b27a80c2b56c911cad7e3f64ef63af48
2022-06-07 11:34:25 +02:00
Matthias Sohn 8bb17e518f Merge branch 'stable-5.11' into stable-5.12
* stable-5.11:
  Fix connection leak for smart http connections

Change-Id: I6caabf4774ccf34706cef846c1087710f67e2ecd
2022-06-07 11:33:38 +02:00
Matthias Sohn c7335f32e9 Merge branch 'stable-5.10' into stable-5.11
* stable-5.10:
  Fix connection leak for smart http connections

Change-Id: I3885c6114caed897f762f5ce523d3b27288205b2
2022-06-07 10:53:24 +02:00
Matthias Sohn 85011b8b07 Merge branch 'stable-5.9' into stable-5.10
* stable-5.9:
  Fix connection leak for smart http connections

Change-Id: I5e7144b2f5cd850978220c476947001ae2debb8e
2022-06-07 10:42:22 +02:00
Saša Živkov 011c26ff36 Fix connection leak for smart http connections
SmartHttpPushConnection: close InputStream and OutputStream after
processing. Wrap IOExceptions which aren't TransportExceptions already
as a TransportException.

Also-By: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I8e11d899672fc470c390a455dc86367e92ef9076
2022-06-06 08:14:18 +02:00
Matthias Sohn 9612aae885 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Remove stray files (probes or lock files) created by background threads

Change-Id: I7af1355a77f14995118145162f6bb8a4f1755f2b
2022-05-27 16:20:28 +02:00
James Z.M. Gao d67ac798f1 Remove stray files (probes or lock files) created by background threads
NOTE: port back from master branch.

On process exit, it was possible that the filesystem timestamp
resolution measurement left behind .probe files or even a lock file
for the jgit.config.

Ensure the SAVE_RUNNER is shut down when the process exits (via
System.exit() or otherwise). Move lf.lock() into the try-finally
block when saving the config file.

Delete .probe files on JVM shutdown -- they are created in daemon
threads that may terminate abruptly, not executing the "finally"
clause that normally removes these files.

Bug: 579445
Change-Id: Iaee2301eb14e6201406398a90228ad10cfea6098
2022-05-27 01:20:16 +02:00
Matthias Sohn cec6db62af Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Stop initCause throwing in readAdvertisedRefs

Change-Id: I94251601aa7fae9cc65164eaddcf16471874b11e
2022-02-09 00:46:49 +01:00
Darius Jokilehto 78c9b9260a Stop initCause throwing in readAdvertisedRefs
BasePackConnection::readAdvertisedRefsImpl was creating an exception by
calling `noRepository`, and then blindly calling `initCause` on it. As
`noRepository` can be overridden, it's not guaranteed to be missing a
cause.

BasePackPushConnection overrides `noRepository` and initiates a fetch,
which may throw a `NoRemoteRepositoryException` with a cause.

In this case calling `initCause` threw an `IllegalStateException`.

In order to throw the correct exception, we now return the
BasePackPushConnection exception and suppress the one thrown by
BasePackConnection

Bug: 578511
Change-Id: Ic1018b214be1e83d895979ee6c7cbce3f6765f6f
2022-02-08 09:52:03 +00:00
Antonio Barone 788f439c0e Fix warning: The value of the parameter otp is not used
Silence warning by removing unused argument to the beginCopyAsIs()
method.

Change-Id: I94e7ff1c61cf8b03752de2974baa24b9c061c163
2022-01-20 12:37:16 +01:00
Matthias Sohn 9b28f43cf1 Merge changes I6a22f37f,I092389e4,I20af1d8d,I83332efc into stable-6.0
* changes:
  [bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
  [errorprone] Fix InfiniteRecursion error in RecordingLogger
  [errorprone] Suppress Finally error in ObjectDownloadListener
  [errorprone] Fix implicit use of default charset in FileBasedConfigTest
2022-01-19 03:50:59 -05:00
Matthias Sohn b55c224ef3 Merge "[errorprone] Suppress FutureReturnValueIgnored in FileRepository#autoGc" into stable-6.0 2022-01-19 03:43:27 -05:00
Matthias Sohn de1abd3237 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  UploadPack v2 protocol: Stop negotiation for orphan refs

Change-Id: I6a9ed8338ffbf5363e48d640a2c4209e4e503549
2022-01-18 18:07:59 +01:00
Matthias Sohn 2cc0009737 Merge branch 'stable-5.12' into stable-5.13
* stable-5.12:
  UploadPack v2 protocol: Stop negotiation for orphan refs

Change-Id: Ib43068c32d9cb8effe4b873396391dc3c9197a6e
2022-01-18 17:51:14 +01:00
Matthias Sohn 1e59cabc08 Merge branch 'stable-5.11' into stable-5.12
* stable-5.11:
  UploadPack v2 protocol: Stop negotiation for orphan refs

Change-Id: I5db432bd416cfa8d3dd295bdce63e31d5f160a8a
2022-01-18 17:49:03 +01:00
Matthias Sohn b33133497f [bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Move this test to another class and skip it when running tests with
bazel since the bazel test runner does not allow to create files in the
home directory.

FS#userHome retrieves the home directory on the first call and caches it
for subsequent calls to avoid overhead in case path translation is
required (currently on cygwin). This prevents that the test can mock the
home directory using MockSystemReader like SshTestHarness does.

Change-Id: I6a22f37f4a19eb4b4935509eae508a23e56db7aa
2022-01-18 16:11:58 +01:00
Matthias Sohn a268faf729 [errorprone] Fix InfiniteRecursion error in RecordingLogger
Change-Id: I092389e428232a4fe7613d846c288d285ae9102c
2022-01-18 16:11:57 +01:00
Matthias Sohn 6fde4d3a6c [errorprone] Suppress Finally error in ObjectDownloadListener
Change-Id: I20af1d8d931608e93fbc52e127f1b7bafd2f917c
2022-01-18 16:11:57 +01:00
Matthias Sohn 8b00cb9324 [errorprone] Fix implicit use of default charset in FileBasedConfigTest
Change-Id: I83332efc498a5bce242915a1eec2346e6e1f58fd
2022-01-18 16:11:56 +01:00
Matthias Sohn 2fed62528a [errorprone] Suppress FutureReturnValueIgnored in FileRepository#autoGc
Ignore the FutureReturnValueIgnored warning for the unused return value
of #gc.

Change-Id: I4e7a2f85d404962c01726f9a1d079fe4a6430a1b
2022-01-18 16:11:50 +01:00