Commit Graph

10 Commits

Author SHA1 Message Date
Matthias Sohn b5de5ccb9e Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  Shortcut during git fetch for avoiding looping through all local refs
  FetchCommand: fix fetchSubmodules to work on a Ref to a blob
  Silence API warnings introduced by I466dcde6
  Allow the exclusions of refs prefixes from bitmap
  PackWriterBitmapPreparer: do not include annotated tags in bitmap
  BatchingProgressMonitor: avoid int overflow when computing percentage
  Speedup GC listing objects referenced from reflogs
  FileSnapshotTest: Add more MISSING_FILE coverage

Change-Id: Ib5055f2f3b8a313c178d6f6c7c5630285ad5a726
2023-02-01 00:41:52 +01:00
Luca Milanesio ad977f1572 Allow the exclusions of refs prefixes from bitmap
When running a GC.repack() against a repository with over one
thousands of refs/heads and tens of millions of ObjectIds,
the calculation of all bitmaps associated with all the refs
would result in an unreasonable big file that would take up to
several hours to compute.

Test scenario: repo with 2500 heads / 10M obj Intel Xeon E5-2680 2.5GHz
Before this change: 20 mins
After this change and 2300 heads excluded: 10 mins (90s for bitmap)

Having such a large bitmap file is also slow in the runtime
processing and have negligible or even negative benefits, because
the time lost in reading and decompressing the bitmap in memory
would not be compensated by the time saved by using it.

It is key to preserve the bitmaps for those refs that are mostly
used in clone/fetch and give the ability to exlude some refs
prefixes that are known to be less frequently accessed, even
though they may actually be actively written.

Example: Gerrit sandbox branches may even be actively
used and selected automatically because its commits are very
recent, however, they may bloat the bitmap, making it ineffective.

A mono-repo with tens of thousands of developers may have
a relatively small number of active branches where the
CI/CD jobs are continuously fetching/cloning the code. However,
because Gerrit allows the use of sandbox branches, the
total number of refs/heads may be even tens to hundred
thousands.

Change-Id: I466dcde69fa008e7f7785735c977f6e150e3b644
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
2023-01-31 17:14:09 -05:00
Kaushik Lingarkar fed1a54935 Refresh 'objects' dir and retry if a loose object is not found
A new loose object may not be immediately visible on a NFS
client if it was created on another client. Refreshing the
'objects' dir and trying again can help work around the NFS
behavior.

Here's an E2E problem that this change can help fix. Consider
a Gerrit multi-primary setup with repositories based on NFS.
Add a new patch-set to an existing change and then immediately
fetch the new patch-set of that change. If the fetch is handled
by a Gerrit primary different that the one which created the
patch-set, then we sometimes run into a MissingObjectException
that causes the fetch to fail.

Bug: 581317
Change-Id: Iccc6676c68ef13a1e8b2ff52b3eeca790a89a13d
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
2023-01-13 18:44:35 +01:00
Kaushik Lingarkar 82b5aaf7e3 Introduce core.trustPackedRefsStat config
Currently, we always read packed-refs file when 'trustFolderStat'
is false. Introduce a new config 'trustPackedRefsStat' which takes
precedence over 'trustFolderStat' when reading packed refs. Possible
values for this new config are:

* always: Trust packed-refs file attributes
* after_open: Same as 'always', but refresh the file attributes of
              packed-refs before trusting it
* never: Always read the packed-refs file
* unset: Fallback to 'trustFolderStat' to determine if the file
  attributes of packed-refs can be trusted

Folks whose repositories are on NFS and have traditionally been
setting 'trustFolderStat=false' can now get some performance improvement
with 'trustPackedRefsStat=after_open' as it refreshes the file
attributes of packed-refs (at least on some NFS clients) before
considering it.

For example, consider a repository on NFS with ~500k packed-refs. Here
are some stats which illustrate the improvement with this new config
when reading packed refs on NFS:

trustFolderStat=true trustPackedRefsStat=unset: 0.2ms
trustFolderStat=false trustPackedRefsStat=unset: 155ms
trustFolderStat=false trustPackedRefsStat=after_open: 1.5ms

Change-Id: I00da88e4cceebbcf3475be0fc0011ff65767c111
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
2023-01-05 15:52:36 +01:00
Kaushik Lingarkar 52aa9c81fc Fix documentation for core.trustFolderStat
Update documentation for core.trustFolderStat to highlight that it is
also used when reading the packed-refs file.

Change-Id: I3eac377c3a7f48493abc8ae6d0889ee70a05d24d
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
2022-12-14 12:27:34 -08:00
Fabio Ponciroli 6976a30f44 searchForReuse might impact performance in large repositories
The search for reuse phase for *all* the objects scans *all*
the packfiles, looking for the best candidate to serve back to the
client.

This can lead to an expensive operation when the number of
packfiles and objects is high.

Add parameter "pack.searchForReuseTimeout" to limit the time spent
on this search.

Change-Id: I54f5cddb6796fdc93ad9585c2ab4b44854fa6c48
2021-06-25 17:57:59 +02:00
Thomas Wolf 33a055e63b Document http options supported by JGit
Change-Id: I0af4f9991fdb4f09de25f743d1e0dca67ceaa18b
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2021-03-13 17:05:47 +01:00
Matthias Sohn 824a3c6964 Fix formatting of config option values
Change-Id: If9a4bb44c4b348cbb94127207566471105267a53
2020-10-26 11:26:42 -04:00
Matthias Sohn bd5942a206 Document options in core section supported by JGit
Change-Id: I25af04112cf219405718b5c3e8e103156fb30fa5
2020-10-26 10:54:12 -04:00
Matthias Sohn 567bf85479 Document gc and pack relevant options
Change-Id: Iab7262b25942fa8c062b979d394674635b70a284
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-04-03 11:50:27 +09:00