Commit Graph

27 Commits

Author SHA1 Message Date
Matthias Sohn 38344badf4 Document GIT_TRACE_PERFORMANCE to show timings
Change-Id: I5a39b072c50e64a2d940680ed85866edfe9d0d28
2023-11-08 02:49:23 +01:00
Matthias Sohn 8e9eab7990 config-options.md: fix sort order
Change-Id: Idf233e7b6ca41de9460b41b3d24c84f9e85472d6
2023-11-08 02:41:28 +01:00
Matthias Sohn e4779dab99 Merge branch 'stable-6.7'
* stable-6.7:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I7e0856a5d70d5d155cf6874383ea1f5622d5238a
2023-10-13 21:31:00 +02:00
Matthias Sohn 01dde5c767 Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I29241619e6c09933bb856e486f379be10dd609c2
2023-10-13 09:06:21 +02:00
Matthias Sohn b6098c549d Merge branch 'stable-6.5' into stable-6.6
* stable-6.5:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I7272a22451c0de6b4770767e7bb4e24c81518c20
2023-10-13 08:50:48 +02:00
Matthias Sohn 626264a12d Merge branch 'stable-6.4' into stable-6.5
* stable-6.4:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I2951d01f5f4581bee20079508cd8ee6ca8554f1f
2023-10-13 02:07:39 +02:00
Matthias Sohn 1175f14c1d Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I0c9c0b3c206cac03a93b30eda348177a4de35c36
2023-10-13 00:43:31 +02:00
Antonio Barone f103a1d5c6 Add support for git config repack.packKeptObjects
Change Ide3445e652 introduced the `--pack-kept-objects` option to GC for
including the objects contained in the locked packfiles during the
repack phase.

Whilst this allowed to explicitly pass a command line argument to the
jgit gc program, it did not allow the option to be read from
configuration.

Allow the pack kept objects option to be configured exactly as C-Git
documents [1], by introducing a new `repack.packKeptObjects`
configuration.

`repack.packKeptObjects` defaults to `true`, when the
`pack.buildBitmaps` is `true` (which is the default case), `false`
otherwise.

[1] https://git-scm.com/docs/git-config#Documentation/git-config.txt-repackpackKeptObjects

Bug: 582292
Change-Id: Ia931667277410d71bc079d27c097a57094299840
2023-10-12 22:51:14 +02:00
Ronald Bhuleskar e0bd4882fb Documentation: Move writeChangedPaths flag from commitGraph to gc section.
The flag commitGraph.writeChangedPaths is not correct. The option was introduced in commit 3b77e33 (Jul 19th) under gc section so updating the documentation to reflect it under correct section.

Change-Id: Ie8f82931594ed3fd1ee3e4532d2e7475878a33a1
2023-09-19 13:33:37 -07:00
Matthias Sohn fb51a2234b Document commit-graph options supported by JGit
Change-Id: I0ab1b826232bbfcf28518d7a01ae5c5d82a08e04
2023-09-01 10:43:21 +02:00
Martin Fick e7a09e316d Introduce core.packedIndexGitUseStrongRefs config key
Introduce a core.packedIndexGitUseStrongRefs configuration key, which
defaults to true so that the current behavior does not change. However,
setting it to false allows soft references to be used for Pack indices
instead of strong references so that they can be garbage collected when
there is memory pressure.

Pack objects can be large when associated with pack files with large
object counts, and this memory is not really accounted for or tracked by
the WindowCache and it can be very substantial at times, especially with
many large object count projects. A particularly problematic use case is
Gerrit's ls-projects command which loads very little data in the
WindowCache via ByteWindows, but ends up loading and holding many entire
indices in memory, sometimes even after the ByteWindows for their Pack
objects have already been garbage collected since they won't get cleared
until after a new ByteWindow is loaded. By using SoftReferences, single
use indices can get cleared when there is memory pressure and OOMs can
be easily avoided, drastically reducing the amount of memory required to
perform an ls-projects on large sites with many projects and large
object counts.

On one of our test sites, an ls-projects command with strong index
references requires more than 66GB of heap to complete successfully,
with soft index references it requires less than 23GB.

Change-Id: I3cb3df52f4ce1b8c554d378807218f199077d80b
Signed-off-by: Martin Fick <quic_mfick@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2023-08-26 16:16:43 +02:00
Ronald Bhuleskar 738dacb7fb BasePackFetchConnection: support negotiationTip feature
By default, Git will report, to the server, commits reachable from all local refs to find common commits in an attempt to reduce the size of the to-be-received packfile. If specified with negotiation tip, Git will only report commits reachable from the given tips. This is useful to speed up fetches when the user knows which local ref is likely to have commits in common with the upstream ref being fetched.

When negotation-tip is on, use the wanted refs instead of all refs as source of the "have" list to send.

This is controlled by the `fetch.usenegotationtip` flag, false by default. This works only for programmatic fetches and there is no support for it yet in the CLI.

Change-Id: I19f8fe48889bfe0ece7cdf78019b678ede5c6a32
2023-03-28 14:29:54 -07:00
Matthias Sohn fe64445c11 Merge branch 'stable-6.4'
* stable-6.4:
  Fix getPackedRefs to not throw NoSuchFileException
  Add pack options to preserve and prune old pack files
  Allow to perform PackedBatchRefUpdate without locking loose refs
  Document option "core.sha1Implementation" introduced in 59029aec

Change-Id: I36051c623fcd480aa80ed32b4e89f9bdd1b798e0
2023-02-20 21:29:30 +01:00
Ivan Frade 596c445af2 PackConfig: add entry for minimum size to index
The object size index can have up to #(blobs-in-repo) entries, taking
a relevant amount of memory. Let operators configure the threshold size
to include objects in the size index.

The index will include objects with size *at or above* this
value (with -1 for none). This is more effective for the
filter-by-size case.

Lowering the threshold adds more objects to the index. This improves
performance at the cost of memory/storage space. For the object-size
case, more calls will use the index instead of reading IO. For the
filter-by-size case, lower threshold means better granularity (if
ObjectReader#isSmallerThan is implemented based only on the index).

Change-Id: I6ccd9334adbbc2abf95fde51dbbfc85b8230ade0
2023-02-16 10:25:44 -08:00
Matthias Sohn 07a9eb06ff Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  Add pack options to preserve and prune old pack files
  Allow to perform PackedBatchRefUpdate without locking loose refs
  Document option "core.sha1Implementation" introduced in 59029aec

Change-Id: I876a38c2de8b7d5eaacd00e36b85599f88173221
2023-02-16 16:59:09 +01:00
Matthias Sohn 9424052f27 Add pack options to preserve and prune old pack files
Add the options
- pack.preserveOldPacks
- pack.prunePreserved

This allows to configure in git config if old packs should be preserved
during gc and pruned during the next gc.

The original implementation in 91132bb0 only allows to set these options
using the API.

Change-Id: I5b23ab4f317d12f5ccd234401419913e8263cc9a
2023-02-11 01:19:28 +01:00
Matthias Sohn e55bad514b Document option "core.sha1Implementation" introduced in 59029aec
Bug: 580310
Change-Id: I10f3d6f6b5af7ab96683994c9cbd85e6c18a5084
2023-02-02 21:18:43 +01:00
Matthias Sohn b5de5ccb9e Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  Shortcut during git fetch for avoiding looping through all local refs
  FetchCommand: fix fetchSubmodules to work on a Ref to a blob
  Silence API warnings introduced by I466dcde6
  Allow the exclusions of refs prefixes from bitmap
  PackWriterBitmapPreparer: do not include annotated tags in bitmap
  BatchingProgressMonitor: avoid int overflow when computing percentage
  Speedup GC listing objects referenced from reflogs
  FileSnapshotTest: Add more MISSING_FILE coverage

Change-Id: Ib5055f2f3b8a313c178d6f6c7c5630285ad5a726
2023-02-01 00:41:52 +01:00
Luca Milanesio ad977f1572 Allow the exclusions of refs prefixes from bitmap
When running a GC.repack() against a repository with over one
thousands of refs/heads and tens of millions of ObjectIds,
the calculation of all bitmaps associated with all the refs
would result in an unreasonable big file that would take up to
several hours to compute.

Test scenario: repo with 2500 heads / 10M obj Intel Xeon E5-2680 2.5GHz
Before this change: 20 mins
After this change and 2300 heads excluded: 10 mins (90s for bitmap)

Having such a large bitmap file is also slow in the runtime
processing and have negligible or even negative benefits, because
the time lost in reading and decompressing the bitmap in memory
would not be compensated by the time saved by using it.

It is key to preserve the bitmaps for those refs that are mostly
used in clone/fetch and give the ability to exlude some refs
prefixes that are known to be less frequently accessed, even
though they may actually be actively written.

Example: Gerrit sandbox branches may even be actively
used and selected automatically because its commits are very
recent, however, they may bloat the bitmap, making it ineffective.

A mono-repo with tens of thousands of developers may have
a relatively small number of active branches where the
CI/CD jobs are continuously fetching/cloning the code. However,
because Gerrit allows the use of sandbox branches, the
total number of refs/heads may be even tens to hundred
thousands.

Change-Id: I466dcde69fa008e7f7785735c977f6e150e3b644
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
2023-01-31 17:14:09 -05:00
Kaushik Lingarkar fed1a54935 Refresh 'objects' dir and retry if a loose object is not found
A new loose object may not be immediately visible on a NFS
client if it was created on another client. Refreshing the
'objects' dir and trying again can help work around the NFS
behavior.

Here's an E2E problem that this change can help fix. Consider
a Gerrit multi-primary setup with repositories based on NFS.
Add a new patch-set to an existing change and then immediately
fetch the new patch-set of that change. If the fetch is handled
by a Gerrit primary different that the one which created the
patch-set, then we sometimes run into a MissingObjectException
that causes the fetch to fail.

Bug: 581317
Change-Id: Iccc6676c68ef13a1e8b2ff52b3eeca790a89a13d
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
2023-01-13 18:44:35 +01:00
Kaushik Lingarkar 82b5aaf7e3 Introduce core.trustPackedRefsStat config
Currently, we always read packed-refs file when 'trustFolderStat'
is false. Introduce a new config 'trustPackedRefsStat' which takes
precedence over 'trustFolderStat' when reading packed refs. Possible
values for this new config are:

* always: Trust packed-refs file attributes
* after_open: Same as 'always', but refresh the file attributes of
              packed-refs before trusting it
* never: Always read the packed-refs file
* unset: Fallback to 'trustFolderStat' to determine if the file
  attributes of packed-refs can be trusted

Folks whose repositories are on NFS and have traditionally been
setting 'trustFolderStat=false' can now get some performance improvement
with 'trustPackedRefsStat=after_open' as it refreshes the file
attributes of packed-refs (at least on some NFS clients) before
considering it.

For example, consider a repository on NFS with ~500k packed-refs. Here
are some stats which illustrate the improvement with this new config
when reading packed refs on NFS:

trustFolderStat=true trustPackedRefsStat=unset: 0.2ms
trustFolderStat=false trustPackedRefsStat=unset: 155ms
trustFolderStat=false trustPackedRefsStat=after_open: 1.5ms

Change-Id: I00da88e4cceebbcf3475be0fc0011ff65767c111
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
2023-01-05 15:52:36 +01:00
Kaushik Lingarkar 52aa9c81fc Fix documentation for core.trustFolderStat
Update documentation for core.trustFolderStat to highlight that it is
also used when reading the packed-refs file.

Change-Id: I3eac377c3a7f48493abc8ae6d0889ee70a05d24d
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
2022-12-14 12:27:34 -08:00
Fabio Ponciroli 6976a30f44 searchForReuse might impact performance in large repositories
The search for reuse phase for *all* the objects scans *all*
the packfiles, looking for the best candidate to serve back to the
client.

This can lead to an expensive operation when the number of
packfiles and objects is high.

Add parameter "pack.searchForReuseTimeout" to limit the time spent
on this search.

Change-Id: I54f5cddb6796fdc93ad9585c2ab4b44854fa6c48
2021-06-25 17:57:59 +02:00
Thomas Wolf 33a055e63b Document http options supported by JGit
Change-Id: I0af4f9991fdb4f09de25f743d1e0dca67ceaa18b
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2021-03-13 17:05:47 +01:00
Matthias Sohn 824a3c6964 Fix formatting of config option values
Change-Id: If9a4bb44c4b348cbb94127207566471105267a53
2020-10-26 11:26:42 -04:00
Matthias Sohn bd5942a206 Document options in core section supported by JGit
Change-Id: I25af04112cf219405718b5c3e8e103156fb30fa5
2020-10-26 10:54:12 -04:00
Matthias Sohn 567bf85479 Document gc and pack relevant options
Change-Id: Iab7262b25942fa8c062b979d394674635b70a284
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-04-03 11:50:27 +09:00