The reverse index for a pack is still always computed if needed, which
is slower than parsing it from a file.
Supply the file path where the reverse index file might be so that it
parsed instead of computed if the file is present.
Change-Id: I8c60d970fd587341dfb2763fb87f1c586279f2a5
Signed-off-by: Anna Papitto <annapapitto@google.com>
The reverse index for a pack is used to quickly find an object's
position in the pack's forward index based on that object's pack offset.
It is currently computed from the forward index by sorting the index
entries by the corresponding pack offset. This computation uses
insertion sort, which has an average runtime of O(n^2).
Cgit persists a pack reverse index file
to avoid recomputing the reverse index ordering. Instead they write a
file with format
https://git-scm.com/docs/pack-format#_pack_rev_files_have_the_format
which can later be read and parsed into the in-memory reverse index
each time it is needed.
PackReverseIndexV1 parses a reverse index file with the official
version 1 format into an in-memory representation of the reverse index
which implements methods to find an object's forward index position
from its offset in logorithmic time.
Change-Id: I60a92463fbd6a8cc9c1c7451df1c14d0a21a0f64
Signed-off-by: Anna Papitto <annapapitto@google.com>
The existing #read and #computeFromIndex static builder methods require
the caller to choose whether to supply an input stream of a reverse
index file or a forward index to compute the reverse index from, which
is slower.
Allow a caller to provide a file path where the pack's reverse index
might be and the pack's forward index index and simply get some reverse
index instance back. Prefer opening and parsing the file if it is
present, to save computation time. Otherwise, fall back onto computing
the reverse index from the pack's forward index.
Change-Id: I09bdd4b813ad62c86add586417b2ab86e9331aec
Signed-off-by: Anna Papitto <annapapitto@google.com>
The new version 1 file-based reverse index has a footer with the
checksum of the corresponding pack file and a checksum of its own
contents. The initial implementation doesn't enforce that the pack
checksum matches the checksum found in the forward index nor that the
self checksum matches the contents of the file just read in.
Offer a method for reverse index users to verify the checksums in a way
appropriate to the version being used. For the pre-existing computed
version, always succeed since it is not based on a file so there is no
possibility of corruption.
Check for corruption of the file itself during parsing the checksum
footer, by comparing the self checksum with the digest of the file
contents read.
Change-Id: I87ff3933cf1afa76663350400b616695e4966cb6
Signed-off-by: Anna Papitto <annapapitto@google.com>
The ComputedPackReverseIndex uses a custom sorting algorithm, based on
bucket sort with insertion sort but with the data managed as a linked
list across two int arrays. This custom algorithm relies on the set of
values being sorted being exactly 0, ..., n-1; so that they can serve a
second purpose of being indexes into a second equally sized list.
This custom algorithm was introduced ~10 years ago in
6cc532a43c.
The original author is no longer an active contributor, so it is
valuable for the code to be readable, especially as there is currently
active work on reverse indexes.
Rename variables and add comments to clarify the algorithm and improve
readability. There are no functional changes to the algorithm.
Change-Id: Ic3b682203f20e06f9f865f81259e034230f9720a
Signed-off-by: Anna Papitto <annapapitto@google.com>
Currently, bloom filters are written and used without any way to turn
them off. Add a per-repo config variable to control whether bloom
filters are written. As for reading, add a JGit option to control this.
(A JGit option is used instead of a per-repo config variable as there is
usually no reason not to use the bloom filters if they are present, but
a global control to disable them is useful if there turns out to be an
issue with the implementation of bloom filters.)
The config that controls reading is the same as C Git, but the config
for writing is not: C Git has no config to control writing, but whether
bloom filters are written depends on whether bloom filters are already
present and what arguments are passed to "git commit-graph write". See
the manpage of "git commit-graph" for more information.
Change-Id: I1b7b25340387673506252b9260b22bfe147bde58
Teach CommitGraphWriter to reuse changed path filters that have been
read from the commit graph file whenever possible.
Change-Id: I1acbfa1613ca7198386a49209028886af360ddb6
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Teach RevWalk, TreeRevFilter, PathFilter, and FollowFilter to use
changed path filters, whenever available, to speed revision walks by
skipping commits that fail the changed path filter.
This work is based on earlier work by Kyle Zhao
(I441be984b609669cff77617ecfc838b080ce0816).
Change-Id: I7396f70241e571c63aabe337f6de1b8b9800f7ed
Signed-off-by: kylezhao <kylezhao@tencent.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
As described in the parent commit, add support for reading the BIDX and
BDAT chunks of the commit graph file, as described in man gitformat-
commit-graph(5).
This work is based on earlier work by Kyle Zhao
(I160f6b022afaa842c331fb9a086974e49dced7b2).
Change-Id: I82e02e6a3a3b758e6bf9d7bbd2198f0ffe3a331b
Signed-off-by: kylezhao <kylezhao@tencent.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Add support for writing the BIDX and BDAT chunks of the commit graph
file, as described in man gitformat-commit-graph(5). The ability to read
such chunks will be added in a subsequent commit.
This work is based on earlier work by Kyle Zhao
(Ib863782af209f26381e3ca0a2c119b99e84b679c).
Change-Id: Ic18e6f0eeec7da1e1ff31751aabda5e6952dbe6e
Signed-off-by: kylezhao <kylezhao@tencent.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Support PKCS#11 HSMs (like YubiKey PIV) for SSH authentication.
Use the SunPKCS11 provider as described at [1]. This provider
dynamically loads the library from the PKCS11Provider SSH configuration
and creates a Java KeyStore with that provider. A Java CallbackHandler
is needed to feed PIN prompts from the KeyStore into the JGit
CredentialsProvider framework. Because the JGit CredentialsProvider may
be specific to a SSH session but the PKCS11Provider may be used by
several sessions, the CallbackHandler needs to be configurable per
session.
PIN prompts respect the NumberOfPasswordPrompts SSH configuration. As
long as the library asks only for a PIN, we use the KeyPasswordProvider
to prompt for it. This gives automatic integration in Eclipse with the
Eclipse secure storage, so a user has even the option to store the PIN
there. (Eclipse will then ask for the secure storage master password on
first access, so the usefulness of this is debatable.)
By default the provider uses the first PKCS#11 token (slot list index
zero). This can be overridden by a non-standard PKCS11SlotListIndex
ssh configuration entry. (For OpenSSH interoperability, also set
"IgnoreUnknown PKCS11SlotListIndex" in the SSH config file then.)
Once loaded, the provider and its shared library and the keys
contained remain available until the application exits.
Manually tested using SoftHSM. See file manual_tests.txt. Kudos to
Christopher Lamb for additional manual testing with a real YubiKey,
also on Windows.[2]
[1] https://docs.oracle.com/en/java/javase/11/security/pkcs11-reference-guide1.html
[2] https://www.eclipse.org/forums/index.php/t/1113295/
Change-Id: I544c97e1e24d05e28a9f0e803fd4b9151a76ed11
Signed-off-by: Thomas Wolf <twolf@apache.org>
RefTree was packed in its own packfile, see
Icbb735be8fa91ccbf0708ca3a219b364e11a6b83.
RefTree was deleted in Ia3da7f2b82d9e365cec2ccf9397cbc47439cd150, since
it was experimental and never used productively. This change missed to
remove the extra pack handling for RefTree.
Change-Id: I8c0d0a66440c331c3d03d0e07d5629682af2a7a9
The DfsInserter writes the pack and its indices in the flush() method,
but when the writing happens via DfsPackParser, it is the parser which
writes the pack and indices. When combined with a parser, flushing the
inserter is a noop.
Add the writing of the object size index to the packparser#parse
method, mirroring how the primary index is written.
Change-Id: I52c5db153fea7e4a8ecd8b3d5de7ad21f7f81a60
DfsInserter receives objects and on flush() writes a pack and its
primary index.
Teach the DfsInserter to write also the object size index if the
config says so.
Change-Id: I89308312f8fd898d4c714a9b68ff948d3663800b
We need the full size of the object to populate the object size index
later.
Save the size the PackedObjectInfo while adding objects to the
pack. Then we don't need to re-read it from the pack at indexing time.
Change-Id: I5bd7ad402df60b4637038def8ef7be2ab45faf87
PackWriter knows how to add an object size index to the pack, but the
garbage collector is not using it yet.
Teach DfsGarbageCollector to write the object size index on
writePack(). Disable by default in the unreachable-garbage pack.
Callers control the content/presence of the index through the
PackConfig option (minBytesForObjSizeIndex) for all other packs, so
there is no need of a specific flag in DfsGarbageCollector.
Change-Id: I86f5f17310e6913381125bec4caab32dc45b7c9d
isNotLargerThan() can avoid reading the size of a blob from disk using
the object size idx if available.
Load the object size index in the DfsPackfile following the same
pattern than the other indices. Override isNotLargerThan in DfsReader
to use the index when available.
Following CL introduces the writing of the object size index and the
tests cover this code.
Change-Id: I15c95b84c1424707c487a7d29c5c46b1a9d0ceba
Add an explicit flag to PackWriter for allowing the
GC.repack() phase to explicitly generate bitmaps only for the
heads packfile and not for the others.
Previously the bitmap generation was conditioned to the
presence of object ids exclusion from the PackWriter.
The introduction of the bitmap generation in the PackWriter
done in Icdb0cdd66 has accidentally made the .keep files not
completely transparent, because their presence have disabled
the generation of the bitmap index, even if the generation
of bitmaps is enabled.
This bug has been an accidental consequence of the intention
of the bitmap generator to avoid generating bitmaps for the
non-heads packfile, however the implementation done by Colby
decided to use the excludeInPacks variable (see [1]) which
is unfortunately also used for excluding the packfiles having
an associated .keep file (see [2]).
[1] https://git.eclipse.org/r/c/jgit/jgit/+/7940/18/org.eclipse.jgit/src/org/eclipse/jgit/storage/pack/PackWriter.java#1617
[2] dafcb8f6db/org.eclipse.jgit/src/org/eclipse/jgit/storage/file/GC.java (506)
Bug: 582039
Change-Id: Id722e68d9ff4ac24e73bf765ab11017586b6766e
When loosening the objects inside the packfiles to be pruned, make sure
that the packfile list is stable and prune all the files after the
loosening is done.
This prevents a series of exceptions previously thrown when loosening
the packfiles, due to the too early pruning of the packfiles that were
still in the pack list.
Bug: 581532
Change-Id: I776776e2e083f1fa749d53f965bf50f919823b4f
The DfsPackFile#getReverseIdx method, which wraps creating a
PackReverseIndex in caching, was package-private. This caused
implementations on top of DfsPackFile to directly instantiate a
PackReverseIndex in cases where it would benefit from caching.
Instead, make #getReverseIdx public so that the caching logic can be
reused by implementations where appropriate.
Change-Id: I4553e514a4ac320bfe2455c00023343ad97f9d15
Signed-off-by: Anna Papitto <annapapitto@google.com>
PackReverseIndex is a concrete class whose implementation is computed
from a pack's forward index. Callers which have a reverse index file may
want to use an implementation that is file-based instead.
Generalize PackReverseIndex into an interface without
implementation-specific logic and separate out the logic for the
computed implementation into a new concrete class.
Change-Id: I98d9835363c5e1c8c3c11a81b0761af3cdeaa41a
Signed-off-by: Anna Papitto <annapapitto@google.com>
C git has a default for git config core.excludesfile: "Its default
value is $XDG_CONFIG_HOME/git/ignore. If $XDG_CONFIG_HOME is either
not set or empty, $HOME/.config/git/ignore is used instead." [1]
Implement this in the WorkingTreeIterator$RootIgnoreNode.
To make this testable, mock the "user.home" directory for all JGit
tests, otherwise tests might pick up a real user's git ignore file.
Also ensure that JGit code always reads "user.home" via the
SystemReader.
Add tests for both locations.
[1] https://git-scm.com/docs/gitignore#_description
Bug: 436127
Change-Id: Ie510259320286c3c13a6464a37da1bd9ca1e373a
Signed-off-by: Thomas Wolf <twolf@apache.org>
This fixes all the javadoc warnings, stops ignoring doclint 'missing'
category and fails the build on javadoc warnings for public and
protected classes and class members.
Since javadoc doesn't allow access specifiers when specifying doclint
configuration we cannot set `-Xdoclint:all,-missing/private`
hence there is no simple way to skip private elements from doclint.
Therefore we check javadoc using the Eclipse Java compiler
(which is used by default) and javadoc configuration in
`.settings/org.eclipse.jdt.core.prefs` files.
This allows more fine grained configuration.
We can reconsider this when javadoc starts supporting access specifiers
in the doclint configuration.
Below are detailled explanations for most modifications.
@inheritDoc
===========
doclint complains about explicits `{@inheritDoc}` when the parent does
not have any documentation. As far as I can tell, javadoc defaults to
inherit comments and should only be used when one wants to append extra
documentation from the parent. Given the parent has no documentation,
remove those usages which doclint complains about.
In some case I have moved up the documentation from the concrete class
up to the abstract class.
Remove `{@inheritDoc}` on overriden methods which don't add additional
documentation since javadoc defaults to inherit javadoc of overridden
methods.
@value to @link
===============
In PackConfig, DEFAULT_SEARCH_FOR_REUSE_TIMEOUT and similar are forged
from Integer.MAX_VALUE and are thus not considered constants (I guess
cause the value would depends on the platform). Replace it with a link
to `Integer.MAX_VALUE`.
In `StringUtils.toBoolean`, @value was used to refer to the
`stringValue` parameter. I have replaced it with `{@code stringValue}`.
{@link <url>} to <a>
====================
@link does not support being given an external URL. Replaces them with
HTML `<a>`.
@since: being invalid
=====================
org.eclipse.jgit/src/org/eclipse/jgit/util/Equality.java has an invalid
tag `@since: ` due to the extra `:`. Javadoc does not complain about it
with version 11.0.18+10 but does with 11.0.19.7. It is invalid
regardless.
invalid HTML syntax
===================
- javadoc doesn't allow <br/>, <p/> and </p> anymore, use <br> and <p>
instead
- replace <tt>code</tt> by {@code code}
- <table> tags don't allow summary attribute, specify caption as
<caption>caption</caption> to fix this
doclint visibility issue
========================
In the private abstract classes `BaseDirCacheEditor` and
`BasePackConnection` links to other methods in the abstract class are
inherited in the public subclasses but doclint gets confused and
considers them unreachable. The HTML documentation for the sub classes
shows the relative links in the sub classes, so it is all correct. It
must be a bug somewhere in javadoc.
Mute those warnings with: @SuppressWarnings("doclint:missing")
Misc
====
Replace `<` and `>` with HTML encoded entities (`< and `>`).
In `SshConstants` I went enclosing a serie of -> arrows in @literal.
Additional tags
===============
Configure maven-javad0c-plugin to allow the following additional tags
defined in https://openjdk.org/jeps/8068562:
- apiNote
- implSpec
- implNote
Missing javadoc
===============
Add missing @params and descriptions
Change-Id: I840056389aa59135cfb360da0d5e40463ce35bd0
Also-By: Matthias Sohn <matthias.sohn@sap.com>
In org.eclipse.jgit.lib.Constants the constants are all marked final
with the exception of:
- COMMIT_GENERATION_UNKOWN
- COMMIT_GENERATION_NOT_COMPUTED
They were introduced by cf70e7cbe4 without the `final` keyword while
other constants have it which certainly has been forgotten.
The javadoc `{@value}` tag causes raises a warning about the fields not
being constants which is how I have discovered the ommission.
Change-Id: I0ad87f42355440c7d50158e773a280a0526e9671
* stable-6.4:
Revert "RefDirectory: Throw exception if CAS of packed ref list fails"
Change-Id: I7d922a92b7674723cbf6a93fb7c9bc5c0cdb8206
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
* stable-6.3:
Revert "RefDirectory: Throw exception if CAS of packed ref list fails"
Change-Id: I33049e70595f097a66e8f4a63b3d8d1c147e878e
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
* stable-6.2:
Revert "RefDirectory: Throw exception if CAS of packed ref list fails"
Change-Id: I70db1bc8529eb6a66610946946da5447a578bffa
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
* stable-6.1:
Revert "RefDirectory: Throw exception if CAS of packed ref list fails"
Change-Id: I1a98e293ef10917b2d8ad64e88be9e82c7bcf693
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
* stable-6.0:
Revert "RefDirectory: Throw exception if CAS of packed ref list fails"
Change-Id: Idc0d1f8ab4524868b7e9754799f70acc1d24f2cb
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
* stable-5.13:
Revert "RefDirectory: Throw exception if CAS of packed ref list fails"
Change-Id: I883b21b00317cc6d9951a8a5f9505078ddd2a3a7
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
This reverts commit 9c33f7364d.
Reason for revert: This change was based on the false claim that the
packedrefs file lock is held while the CAS is being done, but it is
actually released before the CAS (the in memory lock is still held,
however that does not prevent external actors from updating the
packedrefs files and then another thread from subsequently re-reading it
and updating the in memory packedRefList). Although reverting this
change can cause the CAS to fail, it should not actually matter since
the failure would indicate that another thread has already updated the
in memory packedRefList to either the same version this thread was
trying to update it too, or to a more recent version. Either way,
failing the CAS is then appropriate and should not be problematic.
Although this change reverts the code in the RefDirectory class, it
keeps the "improvements" to the test so that it continues to pass
reliably. The reason for the quotes around the word "improvements" is
because I believe the test alteration actually dramatically changes the
intent of the test, and that the original intent of the test is
untestable with the GC and RefDirectory classes as is.
Bug: 582044
Change-Id: I3acee7527bb542996dcdfaddfb2bdb45ec444db5
Signed-off-by: Martin Fick <quic_mfick@quicinc.com>
(cherry picked from commit c5617711a1)
PackReverseIndex instances are created using the constructor directly,
which limits control over the construction logic and refactoring
opportunities for the class itself. These will be needed for a
file-based implementation of the reverse index.
Use a static builder method to create a PackReverseIndex instance using
a pack's forward index.
Change-Id: I4421d907cd61d9ac932df5377e5e28a81679b63f
Signed-off-by: Anna Papitto <annapapitto@google.com>
The reverse index is currently created in-memory when needed. A writer
for reverse index files was already implemented.
Make garbage collection write the reverse index file when the PackConfig
enables it. Write it during #writePack, which mirrors how the primary
index is written.
Change-Id: I50131af6622c41a7b24534aaaf2a423ab4178981
Signed-off-by: Anna Papitto <annapapitto@google.com>
In findGraphPosition, when there is no object whose OID starts with
the first byte of the sought OID, low equals high. This violates an
invariant of the loop, and when the sought OID is lexicographically
greater than every other OID in the repository, causes an
ArrayIndexOutOfBoundsException (because we're trying to read outside the
list of OIDs).
Therefore, check the "low < high" condition at the start of the loop,
not only after the first iteration.
Change-Id: Ic8ac198c151bd161c4996b9e7cb6e6660f151733
Helped-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
If a method called in a finally block throws an exception we should add
exceptions caught earlier to the exception we throw in the finally block
not regarding if it's a checked or unchecked exception.
Change-Id: I4c6be9a3a08482b07659ca31d6987ce719d81ca5
Errorprone raises the following warning:
"[ReferenceEquality] Comparison using reference equality
instead of value equality".
Change-Id: Iacb207ef0625bb987a08406d4e7461e48fade97f
New static final constant is a (very minor) API break that needs to be
suppressed explicitly despite @since 6.6.
Remove a number of no longer needed API filters, and fix a broken
$NON-NLS-1$.
Change-Id: Ie4b0c45e8bd1f3067b6ff81c07d4b21b50bb8685
Signed-off-by: Thomas Wolf <twolf@apache.org>
During garbage collection, extensions for temporary files for indices
are formatted manually.
Add a method to PackExt to generate the temporary file extensions for
each type of index file programmatically.
Change-Id: I210bc2702e750bf0aea643b1a9a8536adebef179
Signed-off-by: Anna Papitto <annapapitto@google.com>
ServiceV2 is not collecting wants/have in PackStatistics. This records
the stats for fetch and push-negotiation.
Change-Id: Iefd79f36b3d7837195e8bd9fc7007de352089e66
PackWriter offers the ability to write out the pack file and its various
index files, except for the newly introduced file-based reverse index.
Now that PackReverseIndexWriter can write reverse index files,
PackWriter#writeReverseIndex will write one for a pack if the
corresponding config flag PackConfig#writeReverseIndex is on.
Change-Id: Ib75dd2bbfb9ee9366d5aacb46700d8cf8af4823a
Signed-off-by: Anna Papitto <annapapitto@google.com>
The in process lock is intended to manage contention on locking the
packed-refs file within a single process without acquiring the file
system lock. Not sharing it across RefDirectory instances of the same
repository undermines that intent and results in more contention at the
file system level.
Change-Id: I68f11856aa0b4b1524f43554d7391a322a0a6897
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
With completely independent branches, there is no merge base. In this
case, the list of commits must include the root commit of the branch to
be rebased.
Bug: 581832
Change-Id: I0f5bdf179d5b07ff09f1a274d61c7a0b1c0011c6
Signed-off-by: Thomas Wolf <twolf@apache.org>
Handle the case of the commit to be picked not having any parents.
Since JGit implements cherry-pick as a 3-way-merge between the commit
to be picked and the target commit, using the parent of the picked
commit as merge base, this is super simple: just don't set a base tree.
The merger will not find any merge base and will supply an empty tree
iterator for the base.
Bug: 581832
Change-Id: I88985f1b1723db5b35ce58bf228bc48d23d6fca3
Signed-off-by: Thomas Wolf <twolf@apache.org>
JGit's AddCommand always renormalizes tracked files. C git does so only
on git add --renormalize. Especially for git add . and the JGit
equivalent git.add().addFilepattern(".").call() this can make a big
difference if there are many files, or large files.
Add a "renormalize" option to AddCommand. To maintain compatibility with
existing uses, this option is "true" by default, and the behavior of
AddCommand is as it has always been in JGit.
If set to "false", use an IndexDiffFilter (in addition to a path filter,
if any). This skips any unchanged files (that are not racily clean) from
content checks. Note that changes in CRLF settings or in filters will be
ignored for such files if renormalize == false.
Add the "--renormalize" option to the Add command in the JGit command
line program. For the command-line program, the default is as in C git:
renormalize is off by default and enabled only if the option is given.
Note that --renormalize implies --update in the command line program, as
in C git. In AddCommand, the two settings are independent.
Additionally, avoid opening input streams unnecessarily in
WorkingTreeIterator.getEntryContentLength() and fix some bogus
indentation.
Add a simple test that adds 1000 files of 10kB in 10 directories twice
and that fails if the second invocation (without any changes) with
renormalize=false is not significantly faster.
Locally, I observe for that second invocation
* git.add().addFilepattern(".").call() ~660ms
* git.add().addFilepattern(".").setRenormalize(false).call() ~16ms
Bug: 494323
Change-Id: I30f9d518563fa55d7058a48c27c425f3b60aeb4c
Signed-off-by: Thomas Wolf <twolf@apache.org>
* stable-6.5:
[bazel] Move ToolTestCase to src folder (6.2)
GcConcurrentTest: @Ignore flaky testInterruptGc
Fix CommitTemplateConfigTest
Fix after_open config and Snapshotting RefDir tests to work with bazel
[bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Demote severity of some error prone bug patterns to warnings
Parse pull.rebase=preserve as alias for pull.rebase=merges
UploadPack: Fix NPE when traversing a tag chain
Change-Id: I16e8553d187a8ef541f578291f47fc39c3da4ac0
The reverse index for a pack is used to quickly find an object's
position in the pack's forward index based on that object's pack offset.
It is currently computed from the forward index by sorting the index
entries by the corresponding pack offset. This computation uses
bucket sort with insertion sort, which has an average runtime of
O(n log n) and worst case runtime of O(n^2); and memory usage of
3*size(int)*n because it maintains 3 int arrays, even after sorting is
completed. The computation must be performed every time that the reverse
index object is created in memory.
In contrast, Cgit persists a pack reverse index file to avoid
recomputing the reverse index ordering every time that it is needed.
Instead they write a file with format
https://git-scm.com/docs/pack-format#_pack_rev_files_have_the_format
which can later be read and parsed into an in-memory reverse index each
time it is needed.
Introduce these reverse index files to JGit. PackReverseIndexWriter
writes out a reverse index file to be read later when needed. Subclass
PackReverseIndexWriterV1 writes a file with the official version 1
format.
To avoid temporarily allocating an Integer collection while sorting and
writing out the contents, using memory 4*size(Integer)*n, use an
IntList and its #sort method, which uses quicksort.
Change-Id: I6437745777a16f723e2f1cfcce4e0d94e599dcee
Signed-off-by: Anna Papitto <annapapitto@google.com>
IntList is a class for working with lists of primitive ints without
boxing them into Integers. For writing the reverse index file format,
sorting ints will be needed but IntList doesn't provide a sorting
method yet.
Add the #sort method to sort an IntList by an IntComparator, using
quicksort, which has a average runtime of O(n log n) and sorts in-place
by using O(log n) stack frames for recursive calls.
Change-Id: Id69a687c8a16d46b13b28783b194a880f3f4c437
Signed-off-by: Anna Papitto <annapapitto@google.com>
* stable-6.4:
[bazel] Move ToolTestCase to src folder (6.2)
GcConcurrentTest: @Ignore flaky testInterruptGc
Fix CommitTemplateConfigTest
Fix after_open config and Snapshotting RefDir tests to work with bazel
[bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Demote severity of some error prone bug patterns to warnings
UploadPack: Fix NPE when traversing a tag chain
Change-Id: I6d20fea3a417e4361b61e81756253343668eb5de
* stable-6.3:
[bazel] Move ToolTestCase to src folder (6.2)
GcConcurrentTest: @Ignore flaky testInterruptGc
Fix CommitTemplateConfigTest
Fix after_open config and Snapshotting RefDir tests to work with bazel
[bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Demote severity of some error prone bug patterns to warnings
UploadPack: Fix NPE when traversing a tag chain
Change-Id: I463f8528e623316add204848d551c44d44d04858
* stable-6.2:
[bazel] Move ToolTestCase to src folder (6.2)
GcConcurrentTest: @Ignore flaky testInterruptGc
Fix CommitTemplateConfigTest
Fix after_open config and Snapshotting RefDir tests to work with bazel
[bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Demote severity of some error prone bug patterns to warnings
UploadPack: Fix NPE when traversing a tag chain
Change-Id: I736c7d0ed9c6e9718fa98976c3dc5a25ab8cda85
* stable-6.1:
GcConcurrentTest: @Ignore flaky testInterruptGc
Fix CommitTemplateConfigTest
Fix after_open config and Snapshotting RefDir tests to work with bazel
[bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Demote severity of some error prone bug patterns to warnings
UploadPack: Fix NPE when traversing a tag chain
Change-Id: I9863cbce95d845efc891724898954b0b2f8dbf7b
* stable-6.0:
[bazel] Skip ConfigTest#testCommitTemplatePathInHomeDirecory
Demote severity of some error prone bug patterns to warnings
UploadPack: Fix NPE when traversing a tag chain
Change-Id: I5e13d5b5414aef97e518898166bfa166c692e60f
This ensures backwards compatibility to the old config value which was
removed in git 2.34 which JGit followed in Ic07ff954e2.
Change-Id: I2b4e27fd71898b6e0e227e406c40682bd9786cd4
Always parse RevTags including their body before getting their object
to ensure that non-cached objects are handled correctly when traversing
a tag chain. An NPE in UploadPack#addTagChain will occur on a depth=1
fetch of a branch containing a tag chain and the ref to one of the
middle tags in the chain is deleted.
Change-Id: Ifd8fe868869070b365df926fec5dcd8e64d4f521
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
* stable-6.5:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on PackMismatchException
Remove blank in maven.config
Change-Id: I0b03ca566053a158c6c8e75ccec8360a2f368ed9
* stable-6.4:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on
PackMismatchException
Remove blank in maven.config
Change-Id: I89af76946014fb44bd64c20e2b01a53397768d90
* stable-6.3:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on PackMismatchException
Remove blank in maven.config
Change-Id: I18b46be0f09535c61efabe24ab1579faa3d06ba8
* stable-6.2:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on PackMismatchException
Remove blank in maven.config
Change-Id: I8006068f16ae442a2246e043a680053f2af34e9f
* stable-6.1:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on PackMismatchException
Remove blank in maven.config
Change-Id: I4c5b000b09287cc32f0e4d6a24a766ef4e17ddbe
* stable-6.0:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on PackMismatchException
Remove blank in maven.config
Change-Id: Ia01c5ac5259b8820afb823d97bee247b5a5fb14a
* stable-5.13:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on
PackMismatchException
Remove blank in maven.config
Change-Id: Id37bee59ca3c7947604c54b6d4e7c02628a657fe
* stable-5.12:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on
PackMismatchException
Remove blank in maven.config
Change-Id: Ibe6652374ab5971105e62b05279f218c8c130fee
* stable-5.11:
Add missing since tag for SshBasicTestBase
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on PackMismatchException
Remove blank in maven.config
Change-Id: I25bb99687b969f9915a7cbda8d1332bec778096a
* stable-5.10:
Add missing since tag for SshTestHarness#publicKey2
Silence API errors
Prevent infinite loop rescanning the pack list on
PackMismatchException
Remove blank in maven.config
Migrated "Prevent infinite loop rescanning the pack list on
PackMismatchException" to refactoring done in
https://git.eclipse.org/r/q/topic:restore-preserved-packs
Change-Id: I0fb77bb9b498d48d5da88a93486b99bf8121e3bd
* stable-5.9:
Prevent infinite loop rescanning the pack list on
PackMismatchException
Remove blank in maven.config
Change-Id: I15ff2d7716ecaceb0eb87b8823d85670f5db709d
We found, when analysing an incident where Gerrit's gc runner thread got
stuck, that we can end up in an infinite loop in
ObjectDirectory#openPackedObject which tries to rescan the pack
list and starts over trying to open a packed object in an unconfined
loop if it catches a PackMismatchException.
Here the relevant part of a thread dump we created while the gc runner
was stuck:
"WorkQueue-2[java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@350812a3[Not
completed,
task = java.util.concurrent.Executors$RunnableAdapter@5425d7ee]]" #72
tid=0x00007f73cee1c800 nid=0x584
runnable [0x00007f7392d57000]
java.lang.Thread.State: RUNNABLE
at org.eclipse.jgit.internal.storage.file.WindowCache.removeAll(WindowCache.java:716)
at org.eclipse.jgit.internal.storage.file.WindowCache.purge(WindowCache.java:399)
at org.eclipse.jgit.internal.storage.file.PackFile.close(PackFile.java:296)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.reuseMap(ObjectDirectory.java:973)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.scanPacksImpl(ObjectDirectory.java:904)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.scanPacks(ObjectDirectory.java:895)
- locked <0x000000050a498f60> (a
java.util.concurrent.atomic.AtomicReference)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.searchPacksAgain(ObjectDirectory.java:794)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedObject(ObjectDirectory.java:465)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedFromSelfOrAlternate(ObjectDirectory.java:417)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject(ObjectDirectory.java:408)
at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:132)
at org.eclipse.jgit.lib.ObjectReader$1.open(ObjectReader.java:279)
at org.eclipse.jgit.revwalk.RevWalk$2.next(RevWalk.java:1031)
at org.eclipse.jgit.internal.storage.pack.PackWriter.findObjectsToPack(PackWriter.java:1911)
at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:960)
at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:876)
at org.eclipse.jgit.internal.storage.file.GC.writePack(GC.java:1168)
at org.eclipse.jgit.internal.storage.file.GC.repack(GC.java:852)
at org.eclipse.jgit.internal.storage.file.GC.doGc(GC.java:269)
at org.eclipse.jgit.internal.storage.file.GC.gc(GC.java:220)
at org.eclipse.jgit.api.GarbageCollectCommand.call(GarbageCollectCommand.java:179)
at com.google.gerrit.server.git.GarbageCollection.run(GarbageCollection.java:112)
at com.google.gerrit.server.git.GarbageCollection.run(GarbageCollection.java:75)
at com.google.gerrit.server.git.GarbageCollection.run(GarbageCollection.java:71)
at com.google.gerrit.server.git.GarbageCollectionRunner.run(GarbageCollectionRunner.java:76)
at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:103)
at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.18/Executors.java:515)
at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.18/FutureTask.java:305)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.18/ScheduledThreadPoolExecutor.java:305)
at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:612)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.18/ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.18/ThreadPoolExecutor.java:628)
at java.lang.Thread.run(java.base@11.0.18/Thread.java:829)
The code in ObjectDirectory#openPackedObject [1] apparently assumes that
this is caused by a transient problem which it can resume from by
retrying. We use `core.trustFolderStat = false` on this server since it
uses NFS. The incident we had showed that we can enter into an infinite
loop here if there is a permanent mismatch between a pack file and its
corresponding pack index. I am not yet sure how this can happen.
Break the infinite loop by limiting the number of attempts rescanning
the pack list to 5 retries. When we exceed this threshold set the type
of the PackMismatchException to permanent and rethrow it which breaks
the infinite loop.
Also apply the same limit in #getPackedObjectSize
and #selectObjectRepresentation where we use similar retry loops.
[1] 011c26ff36/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/ObjectDirectory.java (465)
Change-Id: I20fb63bcc1fdc3a03d39b963f06a90e6f0ba73dc
If packing loose refs fails due to a lock failure, reject update with
a LOCK_FAILURE.
Change-Id: I100e81efd528d963231a1b87bacd9d68f9245a1b
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
Before this change, attempting to use the jgit Amazon S3 transport with an S3-compatible service that requires https (for example, Backblaze B2) results in an error:
$ jgit push b2
fatal: amazon-s3://metadaddy-jgit/repos/test/objects: error in packed-refs
This change adds a "protocol" property to the Amazon S3 transport configuration, defaulting to http, and uses that value when constructing the URL for the S3 service.
Example configuration for Backblaze B2:
accesskey: <Your B2 Application Key>
secretkey: <Your B2 Application Key Id>
acl: private
protocol: https
domain: s3.us-west-004.backblazeb2.com
region: us-west-004
aws.api.signature.version: 4
Behavior after this change:
$ jgit push b2
Counting objects: 3
Finding sources: 100% (3/3)
Getting sizes: 100% (2/2)
Compressing objects: 100% (37/37)
Writing objects: 100% (3/3)
Put pack-673f9bb.idx: 100% (1/1)
To amazon-s3://.jgit_b2@metadaddy-jgit/repos/test
* [new branch] main -> main
Change-Id: I03bdbb3510fb81a416c225a720178f42eec41b21
A commit A can reach a commit B only if the generation number of A is
strictly larger than the generation number of B. This condition allows
significantly short-circuiting commit-graph walks.
On a copy of the Linux repository where HEAD is contained in v6.3-rc4
but no earlier tag, the command 'git tag --contains HEAD' of
ListTagCommand#call() had the following peformance improvement:
(excluded the startup time of the repo)
Before: 2649ms (core.commitgraph=true)
11909ms (core.commitgraph=false)
After: 91ms (core.commitgraph=true)
11934ms (core.commitgraph=false)
Bug: 574368
Change-Id: Ia2efaa4e9ae598266f72e70eb7e3b27655cbf85b
Signed-off-by: kylezhao <kylezhao@tencent.com>