Commit Graph

6498 Commits

Author SHA1 Message Date
Matthias Sohn 01dde5c767 Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I29241619e6c09933bb856e486f379be10dd609c2
2023-10-13 09:06:21 +02:00
Matthias Sohn b6098c549d Merge branch 'stable-6.5' into stable-6.6
* stable-6.5:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I7272a22451c0de6b4770767e7bb4e24c81518c20
2023-10-13 08:50:48 +02:00
Matthias Sohn 626264a12d Merge branch 'stable-6.4' into stable-6.5
* stable-6.4:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I2951d01f5f4581bee20079508cd8ee6ca8554f1f
2023-10-13 02:07:39 +02:00
Matthias Sohn da60ac9aa6 Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I4b94a2b79941c085fa2f62246e8e879aaa85cd3f
2023-10-13 01:33:56 +02:00
Matthias Sohn c59bf16291 Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I22b89bf00dcef26b2096d25397aa9a57a745a92b
2023-10-13 01:31:01 +02:00
Matthias Sohn 1618c3e498 Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: Ib4e4fe407dce334c7537bf278baa39db93aa2f09
2023-10-13 00:51:04 +02:00
Matthias Sohn 1175f14c1d Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: I0c9c0b3c206cac03a93b30eda348177a4de35c36
2023-10-13 00:43:31 +02:00
Matthias Sohn add5c14b4d Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  PackConfig: fix @since tags
  Remove unused API problem filters
  Add support for git config repack.packKeptObjects
  Do not exclude objects in locked packs from bitmap processing

Change-Id: Ifeaa4b4f0c5944d4ecd3042be429833ff72b43ed
2023-10-13 00:26:22 +02:00
Matthias Sohn 4d6671b4ce PackConfig: fix @since tags
Change-Id: Ia513f7cdbf3c197e8661720fc804984ff165fc5c
2023-10-13 00:19:12 +02:00
Matthias Sohn 244165fc56 Remove unused API problem filters
Change-Id: I9d5b96cf841478af8613667ef8574423630f8028
2023-10-13 00:18:59 +02:00
Antonio Barone f103a1d5c6 Add support for git config repack.packKeptObjects
Change Ide3445e652 introduced the `--pack-kept-objects` option to GC for
including the objects contained in the locked packfiles during the
repack phase.

Whilst this allowed to explicitly pass a command line argument to the
jgit gc program, it did not allow the option to be read from
configuration.

Allow the pack kept objects option to be configured exactly as C-Git
documents [1], by introducing a new `repack.packKeptObjects`
configuration.

`repack.packKeptObjects` defaults to `true`, when the
`pack.buildBitmaps` is `true` (which is the default case), `false`
otherwise.

[1] https://git-scm.com/docs/git-config#Documentation/git-config.txt-repackpackKeptObjects

Bug: 582292
Change-Id: Ia931667277410d71bc079d27c097a57094299840
2023-10-12 22:51:14 +02:00
Luca Milanesio f5f4bf0ad9 Do not exclude objects in locked packs from bitmap processing
Packfiles having an equivalent .keep file are associated with in-flight
pushes that haven't been completed, with potentially a set of git
objects not yet referenced by a ref.

If the Git client is not up-to-date, it may result in pushing a
packfile, generating a <packfile>.keep on the server, which
may also contain existing commits due to the lack of Git protocol
negotiation in the git-receive-pack.

The Git protocol negotiation is the phase where the client and the
server exchange the list of refs they have for trying to find a common
base and minimise the amount of objects to be transferred.

The repack phase in GC was previously skipping all objects that were
contained in all packfiles having a <packfile>.keep file associated
(aka "locked packfiles"), which did not take into consideration the
fact that excluding the existing commits would have resulted in the
generation of an invalid bitmap file.

The code for excluding the objects in the locked packfiles was written
well before the bitmap was introduced, hence could not consider a use
case that did not exist at that time.

However, when the bitmap was introduced, the exclusion of locked
packfiles was not changed, hence creating a potential problem.
The issue went unnoticed for many years because the bitmap generation
was disabled when JGit noticed any locked packfiles; however, the
bitmaps are enabled again since  Id722e68d9f , and the the issue is now
visible and is impacting the GC repack phase.

Introduce the '--pack-kept-objects' option in GC for including the
objects contained in the locked packfiles during the repack phase,
which is not an issue because of the following:

- If there are any existing commits duplicated in the packfiles
  they will be just considered once anyway because the repack doesn't
  generate duplicates in the output packfile.

- If there are any new commits that do not have any ref pointing to
  them, they will be automatically excluded from the output repacked
  packfile.

The same identical solution is adopted in the C implementation of git
in repack.c.

Because the locked packfile is not pruned, any new commits not pointed
by any refs will remain in the repository and there will not be any
accidental pruning or object loss as it is today before this change.

As a side-effect of this change, it is now potentially possible to still
have duplicate BLOBs after GC when the keep packfile contained existing
objects. However, it is way better to keep the duplication until the
next GC phase rather than omitting existing objects from repacking and,
therefore generating an invalid bitmap and incorrect packfile.

Bug: 582292
Bug: 582455
Change-Id: Ide3445e652fcf256a7912f881cb898897c99b8f8
2023-10-12 22:46:08 +02:00
Matthias Sohn bb12dd4cbd Prepare 6.7.1-SNAPSHOT builds
Change-Id: I96097ef8c6f198220f513bbc6d5f8881834a1491
2023-09-07 02:03:54 +02:00
Matthias Sohn ea02caf1e7 JGit v6.7.0.202309050840-r
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: Ibe952d97bc178adb909cdd40f48957f5b68af699
2023-09-05 14:41:09 +02:00
Matthias Sohn c024cb23d8 Remove unused API problem filters
Change-Id: If37ac92711cef94a835cfd303997a3d129d212ac
2023-09-05 14:10:30 +02:00
Matthias Sohn 7a6e852745 Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
  Prepare 6.6.2-SNAPSHOT builds
  JGit v6.6.1.202309021850-r
  Checkout: better directory handling

Change-Id: Ice82d68b2d343a5fac214807cdb369e486481aab
2023-09-03 02:16:04 +02:00
Matthias Sohn 43d6bc5ef1 Prepare 6.6.2-SNAPSHOT builds
Change-Id: Id4e2fbefc49115c7e3de26a34cfbe01ba6de25b2
2023-09-03 01:57:43 +02:00
Matthias Sohn ff08cbfe07 JGit v6.6.1.202309021850-r
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I4f173dc9d634e0c9f31305961400b6b35a0a332f
2023-09-03 00:50:37 +02:00
Thomas Wolf 9072103f3b Checkout: better directory handling
When checking out a file into the working tree ensure that all parent
directories of the file below the working tree root are actually
directories and do exist before we try to create the file.

When multiple files are to be checked out (or even a whole tree), this
may check the same directories over and over again. Asking the file
system every time for file attributes is a potentially expensive
operation. As a remedy, introduce an in-memory cache of directory
states for a particular check-out operation.

Apply the same fix also in the ResolveMerger, which may also check out
files, and also in the PatchApplier. In PatchApplier, also validate
paths.

Change-Id: Ie12864c54c9f901a2ccee7caddec73027f353111
Signed-off-by: Thomas Wolf <twolf@apache.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2023-09-03 00:16:26 +02:00
Matthias Sohn 96934c9a80 Merge branch 'master' into stable-6.7
* master:
  CommitGraphWriter: throw exception on unknown chunk

Change-Id: Iaa0c563917c4195fccd57f5e6839a37008c9b808
2023-09-02 09:40:19 +02:00
Ivan Frade 13c8dacae5 CommitGraphWriter: throw exception on unknown chunk
CommitGraphWriter first defines the chunks and then writes them. If at
write time a chunk is unknown, it is ignored. This is brittle: if
somebody adds a chunk to the header but not to the actual writing, the
commit-graph is broken and there is no error reported anywhere.

Throw exception if at write time a chunk is unknown. This can only
happen by a coding error in the writer.

Change-Id: Iade677bb6ce368b6941b75a21c622917afa3b751
2023-09-01 11:39:40 -07:00
Matthias Sohn a40abb0a19 Fix warning raised for local variable hiding DfsPackFile#index
Change-Id: I45cd3be942f798d51af1e024ceb3f4d26c7af324
2023-08-31 15:13:34 +02:00
Matthias Sohn 4f3db912ce Suppress boxing warnings in DfsPackFile
Change-Id: I4b5a0a7ffdeaf7d7839787aa8b98ea9c72f70850
2023-08-31 15:11:50 +02:00
Matthias Sohn c5d8936c80 Prepare 6.7.0-SNAPSHOT builds
Change-Id: I49751232464e70b7d1dc3292a9f36b7a7015e44f
2023-08-30 17:46:26 +02:00
Matthias Sohn c54acc5822 JGit v6.7.0.202308301100-rc1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I712a9f6830364ed404d03f3a145c055906273544
2023-08-30 16:57:25 +02:00
Thomas Wolf a2f326b762 Handle global git config $XDG_CONFIG_HOME/git/config
C git uses this alternate fallback location if the file exists and
~/.gitconfig does not. Implement this also for JGit.

If both files exist, reading behavior is as if the XDG config was
inserted between the HOME config and the system config. Writing
behaviour is different: all changes will be applied only in the HOME
config. Updates will occur in the XDG config only if the HOME config
does not exist.

This is consistent with the behavior of C git; compare [1], especially
the sections on FILES and SCOPES, and the description of the --global
option.

[1] https://git-scm.com/docs/git-config

Bug: 581875
Change-Id: I2460b9aa963fd2811ed8a5b77b05107d916f2b44
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-08-28 22:05:47 +02:00
Jörg Kubitz 0d5e017612 IO: use JDK convenience methods
The benefit is that certain InputStreams can override the default
implementation for performance reasons.

Change-Id: I4c924157ec0f0ec63b0eca7cdbdc9325af24cab6
2023-08-28 21:50:14 +02:00
Martin Fick e7a09e316d Introduce core.packedIndexGitUseStrongRefs config key
Introduce a core.packedIndexGitUseStrongRefs configuration key, which
defaults to true so that the current behavior does not change. However,
setting it to false allows soft references to be used for Pack indices
instead of strong references so that they can be garbage collected when
there is memory pressure.

Pack objects can be large when associated with pack files with large
object counts, and this memory is not really accounted for or tracked by
the WindowCache and it can be very substantial at times, especially with
many large object count projects. A particularly problematic use case is
Gerrit's ls-projects command which loads very little data in the
WindowCache via ByteWindows, but ends up loading and holding many entire
indices in memory, sometimes even after the ByteWindows for their Pack
objects have already been garbage collected since they won't get cleared
until after a new ByteWindow is loaded. By using SoftReferences, single
use indices can get cleared when there is memory pressure and OOMs can
be easily avoided, drastically reducing the amount of memory required to
perform an ls-projects on large sites with many projects and large
object counts.

On one of our test sites, an ls-projects command with strong index
references requires more than 66GB of heap to complete successfully,
with soft index references it requires less than 23GB.

Change-Id: I3cb3df52f4ce1b8c554d378807218f199077d80b
Signed-off-by: Martin Fick <quic_mfick@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2023-08-26 16:16:43 +02:00
Jonathan Tan f8f5979aa2 Merge "DfsGarbageCollector: provide commit graph stats" 2023-08-21 13:07:51 -04:00
Ivan Frade 9919a9faaf DfsReader: Make PackLoadListener interface visible to subclasses
A subclass cannot implement a listener with the default access.

Make the interface protected. Not public because so far only
subclasses are interested in this interface. We can widen the
visibility later if needed.

Change-Id: I54e5c0ef1312dfe2fa660bc8fb54e2be35c0f6df
2023-08-18 11:43:10 -07:00
Jonathan Tan 551ca93cc6 DfsGarbageCollector: provide commit graph stats
Provide commit graph stats in the same way that we provide reftable
stats.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Change-Id: Ib80c892a26f9b552bc90f3cbe7da83b02ffebdfd
2023-08-17 15:41:02 -07:00
Ivan Frade 6f73336939 DfsGarbageCollector: put only GC commits into the commit graph
GC puts all commits reachable from heads and tags into the GC pack,
and commits reachable only from other refs (e.g. refs/changes) into
GC_REST. The commit-graph contains all commits in GC and GC_REST. This
produces too big commit graphs in some repos, beating the purpose of
loading the index.

Limit the commit graph to commits reachable from heads and tags
(i.e. commits in the GC pack).

Change-Id: I4962faea5a726d2ea3e548af0aeae370a6cc8588
2023-08-16 13:31:55 -07:00
Ivan Frade b4b8f05eea DfsReader: Expose when indices are loaded
We want to measure the data used to serve a request. As a first step,
we want to know how many indices are accessed during the request and
their sizes.

Expose an interface in DfsReader to announce when an index is loaded
into the reader, i.e. when its reference is set.

The interface is more flexible to implementors (what/how to collect)
than the existing DfsReaderIOStats object.

Change-Id: I56f7658fde1758efaf869fa779d11b533a81a0a7
2023-08-03 23:53:13 +02:00
Matthias Sohn abe155ea94 Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
  Update to Tycho 4.0.1
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: I7294c21748897eb3f94eeffbda944b62e3206c0d
2023-08-03 10:17:22 +02:00
Matthias Sohn b4c3a5da0d Merge branch 'stable-6.5' into stable-6.6
* stable-6.5:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: Id2e49252a9dc268210c9439848e77604885371aa
2023-08-03 10:14:45 +02:00
Matthias Sohn 82e277c813 Merge branch 'stable-6.4' into stable-6.5
* stable-6.4:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: Idb6dd6160e023673e3650653a15f6b1c540de96e
2023-08-03 01:55:12 +02:00
Matthias Sohn 76dfbb2ccd Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: I0bccc36d9cc9a36f1be9b1562df35ce3a0e95eee
2023-08-03 01:51:36 +02:00
Matthias Sohn 05ded4ee62 Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: I589ed444b5cbfc5b073cac91323e2cc97ab98087
2023-08-03 01:37:43 +02:00
Matthias Sohn 6483c7d209 Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: I5b16c3b613a95b7f28c8f6ac0b20c4c593759cea
2023-08-03 01:28:07 +02:00
Matthias Sohn 55ff4ed9de Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: Ib08037f6055dac1776e38cfb4ff8c88a50ad3e60
2023-08-03 01:19:21 +02:00
Matthias Sohn c7849fbb19 Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
  Add verification in GcKeepFilesTest that bitmaps are generated
  Express the explicit intention of creating bitmaps in GC
  GC: prune all packfiles after the loosen phase
  Prepare 5.13.3-SNAPSHOT builds
  JGit v5.13.2.202306221912-r

Change-Id: I1f50995d9d9c592ec0e02a04e0e409440b49f9f3
2023-08-03 01:17:17 +02:00
Matthias Sohn de7b5b7b26 Prepare 6.7.0-SNAPSHOT builds
Change-Id: I936d2d9106a1e3b7a98ec89fec8ae8a92ec765f2
2023-08-03 00:05:50 +02:00
Matthias Sohn 1d26471c16 JGit v6.7.0.202308011830-m2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I255a979e9f48f60a251ef7b74ced3f720f012706
2023-08-02 00:30:01 +02:00
Matthias Sohn 24b6a35d30 Add missing @since tags
This was missed in c353645a09

Change-Id: I4ae5b13bd7bfd09c113d91ece727a26706660826
2023-08-02 00:19:02 +02:00
Han-Wen NIenhuys d96a91e77e Merge "Merge: Add diff3 style merge conflict formatter." 2023-08-01 13:08:34 -04:00
Nitzan Gur-Furman c353645a09 Move footer-line parsing methods from RevCommit to FooterLine
This allows extracting footers from a messages not associated with a
commit.

The public API of RevCommit is kept intact.

Change-Id: I5809c23df7b7d49641a4be3a26d6f987d3d57c9b
Bug: Google b/287891316
2023-08-01 10:37:24 +02:00
Haamed Gheibi 462c57ec8d Merge: Add diff3 style merge conflict formatter.
Add base section to the merge conflict hunks.

Bug: 442284
Change-Id: I977b43e7dd8119d6b72d11f09c4e8ec241750383
2023-07-31 11:57:28 -07:00
Jonathan Tan ec11129b1d Merge changes I8c60d970,I09bdd4b8,I87ff3933
* changes:
  Pack: open reverse index from file if present
  PackReverseIndex: open file if present otherwise compute
  PackReverseIndex: verify checksums
2023-07-26 16:39:13 -04:00
Ivan Frade c8f5a3f99d RevCommitCG: Read changed-path-filters directly from commit graph
RevCommit and RevCommitCG were designed like "pointers" to data that
load the content on demand, not on construction. This saves memory.

Make the loading of changed-path-filter follow the same pattern. The
ChangedPathFilters are only pointers to locations in the commit-graph
(not the actual data), so the memory saving is not that big, but this
is more consistent with the rest of the API.

As 6.7 is not released, we can still change the RevWalk API.

Change-Id: Id4186ea744b8a2418d0329facae69f785108d356
2023-07-26 12:22:01 +02:00
Matthias Sohn eecd93714b Update commons-codec to 1.16.0
Change-Id: I64617b17a168da1966b93c283c150d549477f3e1
2023-07-25 23:22:46 +02:00