Add streams that can encode or decode git binary patch data on the fly.
Git writes binary patches base-85 encoded, at most 52 un-encoded bytes,
with the unencoded data length prefixed in a one-character encoding, and
suffixed with a newline character.
Add a test for both the new input and the output stream. The test
roundtrips binary data of different lengths in different ways.
Bug: 371725
Change-Id: Ic3faebaa4637520f5448b3d1acd78d5aaab3907a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Add an implementation for base-85 encoding and decoding [1]. Git binary
patches use this format.
Base-85 encoding assembles bytes as 32-bit MSB values, then converts
these values to base-85 numbers (always 5 bytes) encoded as printable
ASCII characters. Decoding base-85 is the reverse operation. Note
that decoding may overflow on invalid input as 85^5 > 2^32. Encodings
always have a length that is a multiple of 5. If input length is not
divisible by 4, padding bytes are (logically) added, which are ignored
when decoding. The encoding for n bytes has thus always exactly length
(n + 3) / 4 * 5 in integer arithmetic (truncating division).
Includes tests.
[1] https://datatracker.ietf.org/doc/html/rfc1924
Bug: 371725
Change-Id: Ib5b9a503cd62cf70e080a4fb38c8cd1eeeaebcfe
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Rather than getting all ref names and prefixes and saving them
in memory to perform the check for conflicting names, rely on
RefDirectory.isNameConflicting as it is no longer an expensive
call after it was optimized in Ie994fc.
The old optimization to save ref names and prefixes in memory
was targeted towards making clones faster. With this change,
the clone performance is unaffected when tests were done with
repos containing many(~500k) refs.
Here are few recorded elapsed times for creating 10 branches
using BatchRefUpdate on NFS based repositories with varying
loose refs count. As seen here, this change helps improve the
BatchRefUpdate performance from O(n^2) to O(1).
loose_refs_count with_change without_change
50 241 ms 310 ms
300 263 ms 1502 ms
1k 181 ms 4241 ms
2k 204 ms 6440 ms
9k 158 ms 25930 ms
20k 154 ms 60443 ms
50k 171 ms 135199 ms
110k 157 ms 329450 ms
160k 209 ms 396328 ms
This update improves the Gerrit notedb migration performance
as it uses BatchRefUpdate to write change meta refs similar to
the test performed above.
Change-Id: I853ac6c7feb4b39c3156c01876b38cbd182accfe
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
Avoid having to scan over ALL loose refs to determine if the
name is nested within or is a container of an existing reference.
This can get really expensive if there are too many loose refs.
Instead use exactRef and getRefsByPrefix which scan based on a
prefix.
With a simple shell script(like below) using jgit client to create
1k refs in a new repository on NFS, this change brings down the time
from 12mins to 7mins.
for ref in $(seq 1 1000); do
jgit branch "$ref"
done
Here are few recorded elapsed times to create a new branch on NFS
based repositories with varying loose refs count. As we see here,
this change improves the name conflicting check from O(n^2) to O(1).
loose_refs_count with_change without_change
50 44 ms 164 ms
300 45 ms 1193 ms
1k 38 ms 2610 ms
2k 44 ms 6003 ms
9k 46 ms 27860 ms
20k 45 ms 48591 ms
50k 51 ms 135471 ms
110k 43 ms 294252 ms
160k 52 ms 430976 ms
Change-Id: Ie994fc184b8f82811bfb37b111eb9733dbe3e6e0
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
Applying a patch on Windows failed if the patch had the (normal)
single-LF line endings, but the file on disk had the usual Windows
CR-LF line endings.
Git (and JGit) compute diffs on the git-internal blob, i.e., after
CR-LF transformation and clean filtering. Applying patches to files
directly is thus incorrect and may fail if CR-LF settings don't
match, or if clean/smudge filtering is involved.
Change ApplyCommand to run the file content through the check-in
filters before applying the patch, and run the result through the
check-out filters. This makes patch application succeed even if the
patch has single-LFs, but the file has CR-LF and core.autocrlf is
true.
Add tests for various combinations of line endings in the file and in
the patch, and a test to verify the clean/smudge handling.
See also [1].
Running the file though clean/smudge may give strange results with
LFS-managed files. JGit's DiffFormatter has some extra code and
applies the smudge filter again after having run the file through
the check-in filters (CR-LF and clean). So JGit can actually produce
a diff on LFS-managed files using the normal diff machinery. (If it
doesn't run out of memory, that is. After all, LFS is intended for
_large_ files.) How such a diff would be applied with either C git
or JGit is entirely unclear; neither has any code for this special
case. Compare also [2].
Note that C git just doesn't know about LFS and always diffs after
the check-in filter chain, so for LFS files, it'll produce a diff
of the LFS pointers.
[1] https://github.com/git/git/commit/c24f3abac
[2] https://github.com/git-lfs/git-lfs/issues/440
Bug: 571585
Change-Id: I8f71ff26313b5773ff1da612b0938ad2f18751f5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Use Character.isWhitespace() instead of Character.isSpaceChar() to
treat TABs as whitespace, too.
Change-Id: Iffc59c13357d981ede6a1e0feb6ea6ff03fb3064
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Negated patterns were handled wrongly. According to the OpenBSD
ssh_config man page,[1] a negated pattern never matches. Negated
patterns make only sense if there are positive patterns; the
negated pattern then can define exceptions for the positive
patterns.
OpenSshConfigFile did this wrongly. It handled "!foo" as "matching
everything but foo", but actually the semantics is "if the input is
"foo", this entry doesn't apply. If the input is anything else,
other patterns determine whether the entry may apply.".
[1] https://man.openbsd.org/ssh_config
Change-Id: I50f6e46581b7ece4c949eddf62f4a265573ec29e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* stable-5.12:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: I1c936183e1fa17ea95ada7849a75bc76af275fa3
* stable-5.11:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: I6184772bdeca1b9ccecf6e400ae15604ab4f5a69
* stable-5.10:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: I0f1511be5375716d41565e72b271cb956c3e847b
* stable-5.9:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: Ifa135077d8d07d2317df3b479822e30d87eca950
* stable-5.8:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: I9ca7a0237f87d1d4bcaba81e709eaa67902f27e5
* stable-5.7:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: I7ed3f47cb46e6c1bf483702c8925a24e88658e47
* stable-5.6:
Remove texts which were added by mistake in 00386272
Fix formatting which was broken in 00386272
Change-Id: I45d444b360485564744bf3dfad2c2f5a5e7fcdf6
* stable-5.9:
LockFile: create OutputStream only when needed
Remove ReftableNumbersNotIncreasingException
Fix stamping to produce stable file timestamps
Change-Id: I056382d1d93f3e0a95838bdd1f0be89711c8a722
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Add a constant in ConfigConstants, and a ConflictStyle enum in
MergeCommand.
Change-Id: Idf8e036b6b6953bec06d6923a39e5ff30c2da562
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Git has different conflict resolution strategies:
* There is a tree merge strategy "ours" which just ignores any changes
from theirs ("-s ours"). JGit also has the mirror strategy "theirs"
ignoring any changes from "ours". (This doesn't exist in C git.)
Adapt StashApplyCommand and CherrypickCommand to be able to use those
tree merge strategies.
* For the resolve/recursive tree merge strategies, there are content
conflict resolution strategies "ours" and "theirs", which resolve
any conflict hunks by taking the "ours" or "theirs" hunk. In C git
those correspond to "-Xours" or -Xtheirs". Implement that in
MergeAlgorithm, and add API to set and pass through such a strategy
for resolving content conflicts.
* The "ours/theirs" content conflict resolution strategies also apply
for binary files. Handle these cases in ResolveMerger.
Note that the content conflict resolution strategies ("-X ours/theirs")
do _not_ apply to modify/delete or delete/modify conflicts. Such
conflicts are always reported as conflicts by C git. They do apply,
however, if one side completely clears a file's content.
Bug: 501111
Change-Id: I2c9c170c61c440a2ab9c387991e7a0c3ab960e07
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Similar to https://git.eclipse.org/r/c/jgit/jgit/+/175166, ignore
path that have conflicts on attributes, so that the virtual base could
be used by RecursiveMerger.
Change-Id: I99c95445a305558d55bbb9c9e97446caaf61c154
Signed-off-by: Marija Savtchouk <mariasavtchouk@google.com>
o.e.j.ssh.apache produces passphrase prompts containing
InformationalMessage items to show the fingerprint of the key
the passphrase is being asked for. Allow this so that the credentials
provider can be used with o.e.j.ssh.apache.
Change-Id: Ibc2ffd3a987d3118952726091b9b80442972dfd8
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
OpenSSH 8.4 has introduced simple environment variable substitution
for some keys. Implement that feature in our ssh config file parser,
too.
Bug: 572103
Change-Id: I360f2c5510eea4ec3329aeedf3d29dfefc9163f0
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* stable-5.11:
Refactor CommitCommand to improve readability
CommitCommand: fix formatting
CommitCommand: remove unncessary comment
Ensure post-commit hook is called after index lock was released
sshd: try all configured signature algorithms for a key
sshd: modernize ssh config file parsing
sshd: implement ssh config PubkeyAcceptedAlgorithms
Change-Id: Ic3235ffd84c9d7537a1fe5ff4f216578e6e26724
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
OpenSSH has changed some things in ssh config files. Update our parser
to implement some of these changes:
* ignore trailing comments on a line
* rename PubkeyAcceptedKeyTypes to PubkeyAcceptedAlgorithms
Note that for the rename, openSSH still accepts both names. We do the
same, translating names whenever we get or set values.
Change-Id: Icccca060e6a4350a7acf05ff9e260f2c8c60ee1a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Apache MINA sshd 2.6.0 appears to use only the first appropriate
public key signature algorithm for a particular key. See [1]. For
RSA keys, that is rsa-sha2-512. This breaks authentication at servers
that only know the older (and deprecated) ssh-rsa algorithm.
With PubkeyAcceptedAlgorithms, users can re-order algorithms in
the ssh config file per host, if needed. Setting
PubkeyAcceptedAlgorithms ^ssh-rsa
will put "ssh-rsa" at the front of the list of algorithms, and then
authentication at such servers with RSA keys works again.
[1] https://issues.apache.org/jira/browse/SSHD-1105
Bug: 572056
Change-Id: I86c3b93f05960c68936e80642965815926bb2532
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
In [1], improved RevWalk.getMergedInto() is introduced to avoid repeated
work while performing RevWalk.isMergedInto() on many refs. Modify
findBranchesReachableFrom() to use it.
[1] I65de9873dce67af9c415d1d236bf52d31b67e8fe
Change-Id: I81d615241638d4093df64b449637af601843a5ed
Signed-off-by: Adithya Chakilam <quic_achakila@quicinc.com>
In cases where we need to determine if a given commit is merged
into many refs, using isMergedInto(base, tip) for each ref would
cause multiple unwanted walks.
getMergedInto() marks the unreachable commits as uninteresting
which would then avoid walking that same path again.
Using the same api, also introduce isMergedIntoAny() and
isMergedIntoAll()
Change-Id: I65de9873dce67af9c415d1d236bf52d31b67e8fe
Signed-off-by: Adithya Chakilam <quic_achakila@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
There are two code paths for detecting renames: one on tree diffs
(using DiffFormatter#scan) and the other on single file diffs (using
DiffFormatter#format). The latter skips binary and large files
for rename detection - check [1], but the former doesn't.
This change skips content rename detection for the tree diffs case for
large files. This is essential to avoid expensive computations while
reading the file, especially for callers who don't want to pay that
cost. Content renames are those which involve files with slightly
modified content. Exact renames will still be identified.
The default threshold for file sizes is reused from
PackConfig.DEFAULT_BIG_FILE_THRESHOLD: 50 MB.
[1] 232876421d/org.eclipse.jgit/src/org/eclipse/jgit/diff/RawText.java (386)
Change-Id: Idbc2c29bd381c6e387185204638f76fda47df41e
Signed-off-by: Youssef Elghareeb <ghareeb@google.com>
Git config http.cookieFile must have ~ expansion, compare [1].
It also should be an absolute path. While a relative path is allowed,
C git just passes the value on to libcurl, so it'll be relative to the
current working directory and thus not work in all directories.
Log a warning if the path is relative.
(Alternatives would be to throw an exception, or to resolve the path
relative to the .git directory, or relative to the working tree root,
or relative to the config file it occurs in. But C git does not seem
to do either.)
[1] https://github.com/git/git/commit/e5a39ad8e
Bug: 571798
Change-Id: I5cdab6061d0613ac7d8cb7977e5b97f5b88f562d
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Add new constructors to PackFile to improve a common use case where
callers know the directory, id, and extension, but previously needed to
construct a valid file name (with prefix, '.', etc) to create a
PackFile. Most callers can use the variant that has id as an ObjectId,
but provide an id as String variant too.
Change-Id: I39e4466abe8c9509f5916d5bfe675066570b8585
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
It's easier to follow the logic here when we can use our own objects
instead of Strings.
Change-Id: I6a166edcc67903fc1ca3544f458634c4cef8fde7
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
This class already looked very much like an Enum, but wasn't one.
As an Enum, we can use PackExt in EnumMaps and EnumSets. Convert the
Map key usage in PackDirectory to an EnumMap.
Change-Id: Ice097fd468a05805f914e6862fbd1d96ec8c45d1
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
Provide a recovery path for objects being referenced during the pack
pruning race. Due to the pack pruning race, it is possible for objects
to become referenced after a pack has been deemed safe to prune, but
before it actually gets pruned. If this happened previously, the newly
referenced objects would be missing and potentially result in a
corrupted ref.
Add the ability to recover from this situation when an object is missing
but happens to still be available in a pack in the "preserved"
directory. This is likely only useful when used in conjunction with the
--preserve-old-packs GC option, which prunes packs by hard-linking to
the preserved directory. If an object is missing and found in a pack in
the preserved directory, immediately recover that pack and its
associated files (idx, bitmaps...) by moving them back to the original
pack directory, and then retry the operation that would have failed due
to the missing object. This retry can now succeed and the repository
may avoid corruption. This approach should drastically reduce the
chance of a corrupt repository during pack pruning at very little extra
cost. This extra cost should only be incurred when objects are missing
and a failure would normally occur.
Change-Id: I2a704e3276b88cc892159d9bfe2455c6eec64252
Signed-off-by: Martin Fick <quic_mfick@quicinc.com>
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
The only extension that was ever consulted from the bitmap was the
bitmap index. We can simplify the Pack code as well as the code of
all the callers if we focus on just that usage.
Change-Id: I799ddfdee93142af67ce5081d14a430d36aa4c15
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
Update scanPacksImpl and listPackDirectory (renamed to
getPackFilesByExtById) to use the new PackFile functionality to
validate file names and complete pack file sets (.pack, .idx, etc).
Most importantly, this allows a later change to rely on scanPacks() to
complete a packList that contains packs with the 'old-' prefix in their
extension.
This also eliminates duplication of logic for how to identify and
construct pack files.
Change-Id: I7175e5fefb187a29e0a7cf53c392aee922314f31
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
GC has several places where it tries to build files names for packs that
we can use the PackFile class for instead.
Change-Id: I99e5ceff9050f8583368fca35279251955e4644d
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
The PackFile class is intended to be a central place to do all
common pack filename manipulation and parsing to help reduce repeated
code and bugs. Use the PackFile class in the Pack class and in many
tests to ensure it works well in a variety of situations. Later changes
will expand use of PackFiles to even more areas.
Change-Id: I921b30f865759162bae46ddd2c6d669de06add4a
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
A cookie file stores the expiration in seconds since the Linux Epoch,
not in milliseconds. Correct reading and writing cookie files; with
a backwards-compatibility hack to read files that contain a millisecond
timestamp.
Add a test, and fix tests not to rely on the actual current time so
that they will also run successfully after 2030-01-01 noon.
Bug: 571574
Change-Id: If3ba68391e574520701cdee119544eedc42a1ff2
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
In a distributed setting, one can have multiple datacenters use
reftables for serving, while the ground truth for the Ref database is
administered centrally. In this setting, replication delays combined
with compaction can cause update-index ranges to overlap.
Such a setting is used at Google, and the JGit code already handles
this correctly (modulo a bugfix that applied in change I8f8215b99a).
Remove the restriction that was applied at FileReftableDatabase.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: I6f9ed0fbd7fbc5220083ab808b22a909215f13a9
Include the full file path of the .gitignore file and the line number
of the invalid pattern. Also include the pattern itself.
.gitignore files inside the repository are reported with their
repository-relative path; files outside (from git config
core.excludesFile or .git/info/exclude) are reported with their
full absolute path.
Bug: 571143
Change-Id: Ibe5969679bc22cff923c62e3ab9801d90d6d06d1
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
When a .gitignore pattern cannot be parsed include the pattern in the
log message. Just reporting "not closed bracket" isn't helpful if the
user doesn't know in which pattern the problem occurred.
Even better would be to include the full path of the .gitignore file
that contained the offending pattern. This is not implemented in this
change; it may need new API and needs more thought.
Bug: 571143
Change-Id: Id5b16d9cf550544ba3ad409a02041946fa8516ab
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
We introduced the option --initial-branch=<branch-name> to allow
initializing a new repository with a different initial branch.
To allow users to override the initial branch name more permanently
(i.e. without having to specify the name manually for each 'git init'),
introduce the 'init.defaultBranch' option.
This option was added to git in 2.28.0.
See https://git-scm.com/docs/git-config#Documentation/git-config.txt-initdefaultBranch
Bug: 564794
Change-Id: I679b14057a54cd3d19e44460c4a5bd3a368ec848
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add option --initial-branch/-b to InitCommand and the CLI init command.
This is the first step to implement support for the new option
init.defaultBranch. Both were added to git in release 2.28.
See https://git-scm.com/docs/git-init#Documentation/git-init.txt--bltbranch-namegt
Bug: 564794
Change-Id: Ia383b3f90b5549db80f99b2310450a7faf6bce4c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
jgit clone --branch foo <url>
did not fail if the remote branch "foo" didn't exist in the remote
repository being cloned.
Bug: 546580
Change-Id: I55648ad3a39da4a5711dfa8e6d6682bb8190a6d6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
`copy` is documented as possibly returning a smaller number of bytes
than requested. In practice, this can occur if a block is cached and the
reader never pulls in the file to check its size.
Bug: 565874
Change-Id: I1e53b3d2f4ab09334178934dc0ef74ea99045cd3
Signed-off-by: wh <wh9692@protonmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a GpgSignatureVerifier interface, plus a factory to create
instances thereof that is provided via the ServiceLoader mechanism.
Implement the new interface for BouncyCastle. A verifier maintains
an internal LRU cache of previously found public keys to speed up
verifying multiple objects (tag or commits). Mergetags are not handled.
Provide a new VerifySignatureCommand in org.eclipse.jgit.api together
with a factory method Git.verifySignature(). The command can verify
signatures on tags or commits, and can be limited to accept only tags
or commits. Provide a new public WrongObjectTypeException thrown when
the command is limited to either tags or commits and a name resolves
to some other object kind.
In jgit.pgm, implement "git tag -v", "git log --show-signature", and
"git show --show-signature". The output is similar to command-line
gpg invoked via git, but not identical. In particular, lines are not
prefixed by "gpg:" but by "bc:".
Trust levels for public keys are read from the keys' trust packets,
not from GPG's internal trust database. A trust packet may or may
not be set. Command-line GPG produces more warning lines depending
on the trust level, warning about keys with a trust level below
"full".
There are no unit tests because JGit still doesn't have any setup to
do signing unit tests; this would require at least a faked .gpg
directory with pre-created key rings and keys, and a way to make the
BouncyCastle classes use that directory instead of the default. See
bug 547538 and also bug 544847.
Tested manually with a small test repository containing signed and
unsigned commits and tags, with signatures made with different keys
and made by command-line git using GPG 2.2.25 and by JGit using
BouncyCastle 1.65.
Bug: 547751
Change-Id: If7e34aeed6ca6636a92bf774d893d98f6d459181
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
As the post commit hook is run after a commit is finished, it can not
abort the commit and the exit code of this hook should not have any
effect.
This can be achieved by not throwing a AbortedByHookException exception.
The stderr output is not lost thanks to contributions for bug 553471.
Bug: 553428
Change-Id: I451a76e04103e632ff44e045561c5a41f7b7d558
Signed-off-by: Tim Neumann <Tim.Neumann@advantest.com>
Signed-off-by: Fabian Pfaff <fabian.pfaff@vogella.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
EGit wants to add gitflow specific hooks in org.eclipse.egit.gitflow.
Make GitHook public to allow sub-classing outside of the
org.eclipse.jgit.hooks package.
Change-Id: I439575ec901e3610b5cf9d66f7641c8324faa865
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
External scripts most probably expect the default charset.
Change-Id: I318a5e1d9f536a95e70c06ffb5b6f408cd40f73a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Pack better represents the purpose of the object and paves the way to
add a PackFile object that extends File.
Change-Id: I39b4f697902d395e9b6df5e8ce53078ce72fcea3
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
If RecursiveMerger finds multiple base commits, it tries to compute
the virtual ancestor to use as a base for the three way merge.
Currently, the content conflicts between ancestors are ignored (file
staged with the conflict markers). If the path is a file in one ancestor
and a dir in the other, it results in NoMergeBaseException
(CONFLICTS_DURING_MERGE_BASE_CALCULATION).
Allow these conflicts by ignoring this unmerged path in the virtual
base. The merger will compute diff in the children instead and it
can be further fixed manually if needed.
Change-Id: Id59648ae1d6bdf300b26fff513c3204317b755ab
Signed-off-by: Marija Savtchouk <mariasavtchouk@google.com>
Subclasses can use the corresponding getter methods.
Change-Id: Iaa9ab01f5a9731a264b28608d2418a9405b601d7
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add it to the GpgConfig. Change GpgConfig to load the values once only.
Add a parameter to the GpgObjectSigner interface's operations to pass
in a GpgConfig. Update CommitCommand and TagCommand to pass the value
to the signer. Let the signer decide whether it can actually produce
the wanted signature type (openpgp or x509).
No behavior change. But this makes it possible to implement different
signers that might support x509 signatures, or use gpg.program and
shell out to an external GPG executable for signing.
Change-Id: I427f83eb1ece81c310e1cddd85315f6f88cc99ea
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
DateRevQueue is expected to give out the commits that have higher
commit time. But in case of tie(same commit time), it should give
the commit that is inserted first. This is inferred from the
testInsertTie test case written for DateRevQueue. Also that test
case, right now uses just two commits which caused it not to fail
with the current implementation, so added another commit to make
the test more robust.
By fixing the DateRevQueue, we would also match the behaviour of
LogCommand.addRange(c1,c2) with git log c1..c2. A test case for
the same is added to show that current behaviour is not the
expected one.
By fixing addRange(), the order in which commits are applied during
a rebase is altered. Rebase logic should have never depended upon
LogCommand.addRange() since the intended order of addRange() is not
the order a rebase should use. So, modify the RebaseCommand to use
RevWalk directly with TopoNonIntermixSortGenerator.
Add a new LogCommandTest.addRangeWithMerge() test case which creates
commits in the following order:
A - B - C - M
\ /
-D-
Using git 2.30.0, git log B..M outputs: M C D
LogCommand.addRange(B, M) without this fix outputs: M D C
LogCommand.addRange(B, M) with this fix outputs: M C D
Change-Id: I30cc3ba6c97f0960f64e9e021df96ff276f63db7
Signed-off-by: Adithya Chakilam <achakila@codeaurora.org>
Keeping the field updateDate is unecessary, as it is set and used only
in the doRename method.
Change-Id: I1cdd1adf759b75c103480db7a74cec8c2d78b794
Signed-off-by: Lars Vogel <Lars.Vogel@vogella.com>
Deleting non-existing files when converting to reftable without backup
caused convertToReftable to fail. Observed this on a mirrored repository
which had no reflogs. Fix this by skipping missing files during
deletion.
Change-Id: I3bb913d5bfddccc6813677b873006efb849a6ebc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
before connecting.
A socket gets bound on connect in the next line.
Signed-off-by: Alina Djamankulova <adjama@google.com>
Change-Id: I69a423c592e2fdd582b3c40099137b4ef3d05b39
This would run into an endless loop if the offset given was not zero.
Fix the logic to exit the read loop when the buffer is full.
Luckily all existing uses of this method call it only with offset zero.
Change-Id: I0ec2a4fb43efe4a605d06ac2e88cf155d50e2f1e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Reachability checkers are retrieved from RevWalk and ObjectWalk objects:
* RevWalk.createReachabilityChecker()
* ObjectWalk.createObjectReachabilityChecker()
Since RevWalks and ObjectWalks are themselves directly instantiated
in hundreds of places (e.g. UploadPack...) overriding them in a
consistent way requires overloading 100s of methods, which isn't
feasible. Moving reachability checker generation to a more central
place solves that problem.
The ObjectReader object seems a good place from which to get
reachability checkers, because reachability checkers return
information about relationships between objects. ObjectDatabases
delegate many operations to ObjectReaders, and reachability bitmaps
are attached to ObjectReaders.
The Bitmapped and Pedestrian reachability checker objects were
package private in the org.eclipse.jgit.revwalk package. This change
makes them public and moves them to the
org.eclipse.jgit.internal.revwalk package. Corresponding tests are
also moved.
Motivation:
1) Reachability checking algorithms need to scale. One of the
internal Android repositories has ~2.4 million refs/changes/*
references, causing bad long tail performance in reachability
checks.
2) Reachability check performance is impacted by repository
topography: number of refs, number of objects, amounts of
related vs. unrelated history.
3) Reachability check performance is also affected by per-branch
access (Gerrit branch permissions) since different users can
see different branches.
4) Reachability check performance isn't affected by any state in a
RevWalk or ObjectWalk.
I don't yet know if a single algorithm will work for all cases in #2
and #3. We may need to evolve the ReachabilityChecker interfaces
over time to solve the Gerrit branch permissions case, or use
Gerrit-specific identity information to solve that in an efficient
way.
This change takes the existing public API and moves it to the
ObjectReader/whole repository level, which is where we can do
consistent customizations for #2 and #3. We intend to upstream the
best of whatever works, but anticipate the need for multiple rounds
of experimentation.
Change-Id: I9185feff43551fb387957c436112d5250486833d
Signed-off-by: Terry Parker <tparker@google.com>
* changes:
Compare getting all refs except specific refs with seek and with filter
Add getsRefsByPrefixWithSkips (excluding prefixes) to ReftableDatabase
Add seekPastPrefix method to RefCursor
We sometimes want to get all the refs except specific prefixes,
similarly to getRefsByPrefix that gets all the refs of a specific
prefix.
We now create a new method that gets all refs matching a prefix except a
set of specific prefixes.
One use-case is for Gerrit to be able to get all the refs except
refs/changes; in Gerrit we often have lots of refs/changes, but very
little other refs. Currently, to get all the refs except refs/changes we
need to get all the refs and then filter the refs/changes, which is very
inefficient. With this method, we can simply skip the unneeded prefix so
that we don't have to go over all the elements.
RefDirectory still uses the inefficient implementation, since there
isn't a simple way to use Refcursor to achieve the efficient
implementation (as done in ReftableDatabase).
Signed-off-by: Gal Paikin <paiking@google.com>
Change-Id: I8c5db581acdeb6698e3d3a2abde8da32f70c854c
Adds a new FileUtils.hasFiles(Path) helper method to correctly handle
the Files.list returned Stream.
These errors were found by compiling the code using JDK11's
javac compiler.
Change-Id: Ie8017fa54eb56afc2e939a2988d8b2c5032cd00f
Signed-off-by: Terry Parker <tparker@google.com>
This method will be used by the follow-up change. This useful if we want
to go over all the changes after a specific ref.
For example, the new method allows us to create a follow-up that would
go over all the refs until we reach a specific ref (e.g refs/changes/),
and then we use seekPastPrefix(refs/changes/) to read the rest of the refs,
thus basically we return all refs except a specific prefix.
When seeking past a prefix, the previous condition that created the
RefCursor still applies. E.g, if the cursor was created by
seekRefsWithPrefix, we can skip some refs but we will not return refs
that are not starting with this prefix.
Signed-off-by: Gal Paikin <paiking@google.com>
Change-Id: I2c02e89c877fe90da8619cb8a4a9a0c865f238ef
In some circumstances (eg. compacting a stack that has deletions), the
result may have a {min, max} range that already exists. In these
cases, we would rename onto an already existing file, which does not
work on Windows. By adding a random suffix, we disambiguate the files,
and avoid this failure scenario.
Change-Id: I0273f99bb845cfbdbd8cdd582b55d3c310505d29
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Heap always copied whole blocks, which leads to AIOOBEs. LocalFile
didn't overwrite the method and thus caused NPEs.
Change-Id: Ia37d4a875df9f25d4825e6bc95fed7f0dff42afb
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
If the caller knows already HTTP Basic authentication will be needed
and if it also already has the username and password, preemptive
authentication is a little bit more efficient since it avoids the
initial 401 response.
Add a setPreemptiveBasicAuthentication(username, password) method
to TransportHttp. Client code could call this for instance in a
TransportConfigCallback. The method throws an IllegalStateException
if it is called after an HTTP request has already been made.
Additionally, a URI can include userinfo. Although it is not
recommended to put passwords in URIs, JGit's URIish and also the
Java URL and URI classes still allow it. The underlying HTTP
connection may omit these fields though. If present, take these
fields as additional source for preemptive Basic authentication if
setPreemptiveBasicAuthentication() has not been called.
No preemptive authentication will be done if the connection is
redirected to a different host.
Add tests.
Bug: 541327
Change-Id: Id00b975e56a15b532de96f7bbce48106d992a22b
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
TransportHttp makes several HTTP requests. The SSLContext and socket
factory must be shared over these requests, otherwise authentication
information may not be propagated correctly from one request to the
next. This is important for authentication mechanisms that rely on
client-side state, like NEGOTIATE (either NTLM, if the underlying HTTP
library supports it, or Kerberos). In particular, SPNEGO cannot
authenticate on a POST request; the authentication must come from the
initial GET request, which implies that the POST request must use the
same SSLContext and socket factory that was used for the GET.
Change the way HTTPS connections are configured. Introduce the concept
of a GitSession, which is a client-side HTTP session over several HTTPS
requests. TransportHttp creates such a session and uses it to configure
all HTTP requests during that session (fetch or push). This gives a way
to abstract away the differences between JDK and Apache HTTP connections
and to configure SSL setup outside.
A GitSession can maintain state and thus give all HTTP requests in a
session the same socket factory.
Introduce an extension interface HttpConnectionFactory2 that adds a
method to obtain a new GitSession. Implement this for both existing
HTTP connection factories. Change TransportHttp to use the new
GitSession to configure HTTP connections.
The old methods for disabling SSL verification still exist to support
possibly external connection and connection factory implementations
that do not make use of the new GitSession yet.
Bug: 535850
Change-Id: Iedf67464e4e353c1883447c13c86b5a838e678f1
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Previously, TransportHttp always used the globally set connection
factory. This is problematic if that global factory is changed in
the middle of a fetch or push operation. Initialize the factory to
use in the constructor, then use that factory for all HTTP requests
made through this transport. Provide a setter and a getter for it
so that client code can customize the factory, if needed, in a
TransportConfigCallback.
Once a factory has been used on a TransportHttp instance it cannot
be changed anymore.
Make the global static factory reference volatile.
Change-Id: I7c6ee16680407d3724e901c426db174a3125ba1c
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
UploadPack may log ACKs in protocol V2 that it doesn't send (if it
got a "done" from the client), or may log ACKs twice. That makes
packet log analysis difficult.
Add a new constructor to PacketLineOut to omit all logging from an
instance, and use it in UploadPack.
Change-Id: Ic29ef5f9a05cbcf5f4858a4e1b206ef0e6421c65
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>