Commit Graph

5501 Commits

Author SHA1 Message Date
Thomas Wolf eec9b55dcf FS: don't cache fallback if running in background
If the background job is a little late, the true result might
arrive and be cached later. So make sure we don't cache the large
fallback resolution in the per-directory cache. Otherwise we'd work
with the large fallback until the next restart.

Bug: 566170
Change-Id: I7354a6cfddfc0c05144bb0aa41c23029bd4f6af0
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-20 22:17:40 +02:00
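
The caching rule from the commit above can be illustrated with a minimal sketch. All names here (ResolutionCacheSketch, FALLBACK, the String directory key) are hypothetical and not JGit's actual FS code; the sketch only shows that a measured value is cached while a fallback returned after a time-out is not.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ResolutionCacheSketch {
    // Hypothetical coarse fallback, used when the measurement is not ready in time.
    static final Duration FALLBACK = Duration.ofSeconds(2);

    private final Map<String, Duration> perDirectoryCache = new ConcurrentHashMap<>();

    Duration resolution(String dir, Future<Duration> measurement, long timeoutMillis) {
        Duration cached = perDirectoryCache.get(dir);
        if (cached != null) {
            return cached;
        }
        try {
            Duration measured = measurement.get(timeoutMillis, TimeUnit.MILLISECONDS);
            perDirectoryCache.put(dir, measured); // only a true result is cached
            return measured;
        } catch (TimeoutException | ExecutionException e) {
            return FALLBACK; // deliberately not cached: the true result may still arrive
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return FALLBACK;
        }
    }

    boolean isCached(String dir) {
        return perDirectoryCache.containsKey(dir);
    }
}
```

With this shape, a later call for the same directory picks up the real measurement once the background job has finished, instead of staying on the fallback until the next restart.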
Thomas Wolf efd1cc05af Keep line endings for text files committed with CR/LF on text=auto
Git never converts line endings if the version in the repository is a
text file with CR/LF and text=auto. See [1]: "When the file has been
committed with CRLF, no conversion is done."

Because the sentence just before is about converting line endings on
check-in, I had understood that in commit 60cf85a [2] to mean that no
conversion on check-in was to be done. However, as bug 565048 and an
inspection of the C git code showed, it really means that no conversion
is done on check-in *or check-out*.

If the text attribute is not set but core.autocrlf = true, this is
the same as text=auto eol=crlf. C git does not convert on check-out
even on text=auto eol=lf if the index version is a text file with
CR/LF.

For check-in, one has to look at the intended target, which is done
in WorkingTreeIterator since commit 60cf85a. For check-out, it can
be done by looking at the source and can thus be done in the
AutoLFOutputStream.

Additionally, provide a constructor for AutoLFInputStream to do
the same; for cases where the equivalent of a check-out is done via
an input stream obtained from a blob. (EGit does that in its
GitBlobStorage for the Eclipse compare framework; it's more efficient
than using a TemporaryBuffer and DirCacheCheckout.getContent(), and
it avoids the need for a temporary file.)

Adapt existing tests, and add new checkout and merge tests to verify
the resulting files have the correct line endings.

EGit's GitBlobStorage will need to call the new version of
EolStreamTypeUtil.wrapInputStream().

[1] https://git-scm.com/docs/gitattributes#Documentation/gitattributes.txt-Settostringvalueauto
[2] https://git.eclipse.org/r/c/jgit/jgit/+/127324

Bug: 565048
Change-Id: If1282ef43e2abd00263541bd10a01fe1f5c619fc
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-17 08:52:55 +02:00
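
The check-out rule described in the commit above can be sketched as follows. This is a simplified illustration, not the AutoLFOutputStream implementation: the class and method names are hypothetical, the blob is assumed to have been classified as text already, and only the text=auto eol=crlf case is shown.

```java
import java.io.ByteArrayOutputStream;

final class AutoEolCheckoutSketch {
    /** Returns true if the content contains at least one CR/LF pair. */
    static boolean hasCrLf(byte[] blob) {
        for (int i = 0; i + 1 < blob.length; i++) {
            if (blob[i] == '\r' && blob[i + 1] == '\n') {
                return true;
            }
        }
        return false;
    }

    /** Check-out under text=auto eol=crlf; assumes the blob is known to be text. */
    static byte[] checkOut(byte[] blob) {
        if (hasCrLf(blob)) {
            return blob; // committed with CR/LF: no conversion at all
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(blob.length + 16);
        for (byte b : blob) {
            if (b == '\n') {
                out.write('\r'); // LF-only content gets CR/LF line endings
            }
            out.write(b);
        }
        return out.toByteArray();
    }
}
```

The decision only needs the source bytes, which is why it can live in an output (or input) stream wrapper on check-out.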
Thomas Wolf 71aeedb6ec Delay WindowCache statistics JMX MBean registration
The WindowCache is configured statically with a default
WindowCacheConfig. The default config says (for backwards
compatibility reasons) to publish the MBean. As a result,
the MBean always gets published.

By delaying the MBean registration until the first call to
getInstance() or get(PackFile, long), we avoid the forced
registration and register the bean only if the cache has not been
re-configured in the meantime to not publish it. (EGit does this
to avoid a very early, costly access to the user and system config
during plug-in activation.)

Bug: 563740
Change-Id: I8a941342c0833acee2107515e64299aada7e0520
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-16 14:41:21 +02:00
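
The delayed registration described in the commit above follows a common pattern, sketched here under loud assumptions: LazyMBeanSketch is a hypothetical class, and a Runnable stands in for the real JMX registration. It is not the WindowCache code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class LazyMBeanSketch {
    // Default is to publish, mirroring the backwards-compatible default config.
    private volatile boolean publishBean = true;
    private final AtomicBoolean attempted = new AtomicBoolean();
    private final Runnable registration; // stand-in for the real MBean registration

    LazyMBeanSketch(Runnable registration) {
        this.registration = registration;
    }

    /** May be called before first use to opt out of publishing. */
    void reconfigure(boolean publish) {
        publishBean = publish;
    }

    /** First access decides, once, whether to register. */
    LazyMBeanSketch get() {
        if (attempted.compareAndSet(false, true) && publishBean) {
            registration.run();
        }
        return this;
    }
}
```

A client that reconfigures before the first get() never triggers the registration; everyone else gets it exactly once.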
Thomas Wolf e9cb0a8e47 DirCache: support index V4
Index format version 4 was introduced in C git in 2012. It's about
time that JGit can deal with it.

Version 4 added prefix path compression. Instead of writing the full
path for each index entry to disk, only the difference to the previous
entry's path is written: a variable-encoded int telling how many bytes
to remove from the previous entry's path to get the common prefix,
followed by the new suffix.

Also, cache entries in a version 4 index are not padded anymore.

Internally, version 3 and version 4 index entries are identical; it's
only the stored format that changes.

Implement this path compression, and make sure we write an index file
that we read previously in the same format. (Only changing from version
2 to version 3 if there are extended flags.)

Add support for the "feature.manyFiles" and the "index.version" git
configs, and honor them when writing a new index file.

Add tests, including a compatibility test that verifies that JGit can
read a version 4 index generated by C git and write an identical
version 4 index.

Bug: 565774
Change-Id: Id83241cf009e50f950eb42f8d56b834fb47da1ed
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-15 12:47:45 +02:00
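
The prefix path compression described in the index V4 commit above can be sketched as follows. This is an illustration of the idea only: the class and field names are hypothetical, and the on-disk details (the variable-length integer encoding and entry layout) are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PrefixPathCompressionSketch {
    static final class Entry {
        final int strip;     // bytes to remove from the end of the previous path
        final byte[] suffix; // bytes to append after stripping

        Entry(int strip, byte[] suffix) {
            this.strip = strip;
            this.suffix = suffix;
        }
    }

    static List<Entry> compress(List<String> sortedPaths) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (String path : sortedPaths) {
            byte[] cur = path.getBytes(StandardCharsets.UTF_8);
            int common = 0;
            int max = Math.min(prev.length, cur.length);
            while (common < max && prev[common] == cur[common]) {
                common++;
            }
            out.add(new Entry(prev.length - common,
                    Arrays.copyOfRange(cur, common, cur.length)));
            prev = cur;
        }
        return out;
    }

    static List<String> expand(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            int keep = prev.length - e.strip; // length of the common prefix
            byte[] cur = Arrays.copyOf(prev, keep + e.suffix.length);
            System.arraycopy(e.suffix, 0, cur, keep, e.suffix.length);
            out.add(new String(cur, StandardCharsets.UTF_8));
            prev = cur;
        }
        return out;
    }
}
```

For the paths src/Main.java and src/Util.java, the second entry stores "remove 9 bytes" plus the suffix Util.java, so the shared src/ prefix is never written twice.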
Thomas Wolf 72b111ecd7 Update javadoc for RemoteSession and SshSessionFactory
The timeout on RemoteSession.exec() cannot be a timeout for the
whole command. It can only be a timeout for setting up the process;
after that it's the application's responsibility to implement some
timeout for the execution of the command, for instance by calling
Process.waitFor(long, TimeUnit) or through other means.

Sessions returned by an SshSessionFactory are already connected and
authenticated -- they must be, because RemoteSession offers no
operations for connecting or authenticating a session.

Change the implementation of SshdExecProcess.waitFor() to wait
indefinitely. The original implementation used the timeout from
RemoteSession.exec() because of that erroneous javadoc.

Change-Id: I3c7ede24ab66d4c81f72d178ce5012d383cd826e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-10 22:51:34 +02:00
Thomas Wolf 24fdc1d039 Fix JSchProcess.waitFor() with time-out
SshSupport.runSshCommand() had a comment that wait with time-out
could not be used because JSchProcess.exitValue() threw the wrong
unchecked exception when the process was still running.

Fix this and make JSchProcess.exitValue() throw the right exception,
then wait with a time-out in SshSupport.

The Apache sshd client's SshdExecProcess has always used the correct
IllegalThreadStateException.

Add tests for SshSupport.runCommand().

Change-Id: Id30893174ae8be3b9a16119674049337b0cf4381
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-10 22:51:34 +02:00
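
Why the exception type matters in the JSchProcess fix above: the JDK documents that the default Process.waitFor(long, TimeUnit) polls exitValue(), which signals "still running" with IllegalThreadStateException. The sketch below uses a hypothetical FakeRemoteProcess (not JSchProcess or SshdExecProcess) to show a Process subclass that honors this contract.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.TimeUnit;

final class FakeRemoteProcess extends Process {
    private volatile Integer exitCode; // null while the "process" is running

    void finish(int code) {
        exitCode = code;
    }

    @Override
    public int exitValue() {
        Integer code = exitCode;
        if (code == null) {
            // The exception type the Process contract asks for while running.
            throw new IllegalThreadStateException("process has not exited");
        }
        return code;
    }

    @Override
    public int waitFor() throws InterruptedException {
        while (exitCode == null) {
            Thread.sleep(5);
        }
        return exitCode;
    }

    @Override
    public void destroy() {
        if (exitCode == null) {
            exitCode = -1;
        }
    }

    @Override
    public OutputStream getOutputStream() {
        return new ByteArrayOutputStream();
    }

    @Override
    public InputStream getInputStream() {
        return new ByteArrayInputStream(new byte[0]);
    }

    @Override
    public InputStream getErrorStream() {
        return new ByteArrayInputStream(new byte[0]);
    }

    /** Waits with a time-out, relying on the inherited waitFor(long, TimeUnit). */
    static boolean finishedWithin(Process p, long millis) {
        try {
            return p.waitFor(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If exitValue() threw some other unchecked exception, the inherited timed wait would propagate it instead of returning false after the time-out.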
Jonathan Nieder 86aa6deff4 FilterSpec: Use BigInteger.ZERO instead of valueOf(0)
This just simplifies a bit by avoiding an unneeded method call.

Change-Id: I6d8d2fc512d8f8a82da73c355017d0abf833a13b
2020-07-31 19:01:35 -07:00
Jonathan Nieder 3c807e0158 Do not send empty blob in response to blob:none filter
If I create a repository containing an empty file and clone it
with

	git clone --no-checkout --filter=blob:none \
		https://url/of/repository

then I would expect no blobs to be transferred over the wire.  Alas,
JGit rewrites filter=blob:none to filter=blob:limit=0, so if the
repository contains an empty file then the empty blob gets
transferred.

Fix it by teaching JGit about filters based on object type to
complement the existing filters based on object size.  This prepares
us for other future filters such as object:none.

In particular, this means we do not need to look up the size of the
filtered blobs, which should speed up clones.  Noticed by Anna
Pologova and Terry Parker.

Change-Id: Id4b234921a190c108d8be2c87f54dcbfa811602a
Signed-off-by: Jonathan Nieder <jrn@google.com>
2020-07-29 21:04:20 -07:00
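
The difference between the two filter kinds in the commit above can be sketched like this. The class, the Obj type, and the "size <= limit" comparison are assumptions inferred from the behavior the commit describes (an empty blob passing blob:limit=0); this is not JGit's FilterSpec code.

```java
import java.util.Set;

final class BlobFilterSketch {
    enum Type { COMMIT, TREE, BLOB }

    static final class Obj {
        final Type type;
        final long size;

        Obj(Type type, long size) {
            this.type = type;
            this.size = size;
        }
    }

    /** Size-based filter: needs the blob size; assumed to let blobs up to the limit through. */
    static boolean allowedBySizeLimit(Obj o, long limit) {
        return o.type != Type.BLOB || o.size <= limit;
    }

    /** Type-based filter: decides on the object type alone, no size lookup. */
    static boolean allowedByTypeFilter(Obj o, Set<Type> excludedTypes) {
        return !excludedTypes.contains(o.type);
    }
}
```

An empty blob has size 0, so it slips through a limit of 0 under this comparison, while the type-based filter rejects it without ever looking at its size.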
Jonathan Nieder dceedbcd6e Add support for tree filters when fetching
Teach the FilterSpec serialization code about tree filters so they can
be communicated over the wire and understood by the server.

While we're here, harden the FilterSpec serialization code to throw
IllegalStateException if we encounter a FilterSpec that cannot be
expressed as a "filter" line.  The only public API for creating a
FilterSpec is to pass in a "filter" line to be parsed, so these should
not appear in practice.

Change-Id: I9664844059ffbc9c36eb829e2d860f198b9403a0
Signed-off-by: Jonathan Nieder <jrn@google.com>
2020-07-29 20:52:12 -07:00
Thomas Wolf 9fe5406119 FS_POSIX: avoid prompt to install the XCode tools on OS X
OS X ships with a default /usr/bin/git that is just a wrapper that
at run-time delegates to the selected XCode toolchain, and that
prompts the user to install the XCode command line tools if not
already installed.

This is annoying for people who don't want to do so, since they'll
be prompted on each Eclipse start. Also, since on OS X the $PATH for
applications started via the GUI is not the same as the $PATH as set
via the shell profile, just using /usr/bin/git (which will normally
be found when JGit runs inside Eclipse) may give slightly surprising
results if the user has installed a non-Apple git and changed his
$PATH in the shell such that the non-Apple git is used in the shell.
(For instance by placing /usr/local/bin earlier on the path.) Eclipse
and the shell will use different git executables, and thus different
git system configs.

Therefore, try to find git via bash --login -c 'which git' not only
if we couldn't find it on $PATH but also if we found the default git
/usr/bin/git. If that finds some other git, use that. If the bash
approach also finds /usr/bin/git, double check via xcode-select -p
that an XCode git is present. If not, assume there is no git installed,
and work without any system config.

Bug: 564372
Change-Id: Ie9d010ebd9437a491ba5d92b4ffd1860c203f8ca
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-07-26 15:38:48 -04:00
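
The decision logic from the FS_POSIX commit above can be sketched as a pure function. Everything here is hypothetical: the lookups (searching $PATH, running bash --login -c 'which git', running xcode-select -p) are passed in as already-computed values to keep the sketch self-contained, and the case where the login shell finds nothing is an assumption the commit message does not spell out.

```java
final class GitDiscoverySketch {
    static final String APPLE_WRAPPER = "/usr/bin/git";

    /**
     * @param onPath              git found on $PATH, or null
     * @param fromLoginShell      git reported by the login shell, or null
     * @param xcodeToolsInstalled whether an XCode toolchain is present
     * @return the git executable to use, or null to work without a system config
     */
    static String chooseGit(String onPath, String fromLoginShell,
            boolean xcodeToolsInstalled) {
        if (onPath != null && !APPLE_WRAPPER.equals(onPath)) {
            return onPath; // a non-default git on $PATH is used directly
        }
        // Not found, or only the default wrapper found: consult the login shell.
        if (fromLoginShell != null && !APPLE_WRAPPER.equals(fromLoginShell)) {
            return fromLoginShell;
        }
        // Assumption: if the shell finds nothing, fall back to the $PATH result.
        String candidate = fromLoginShell != null ? fromLoginShell : onPath;
        if (APPLE_WRAPPER.equals(candidate)) {
            // The wrapper is only usable if an XCode git is actually installed.
            return xcodeToolsInstalled ? candidate : null;
        }
        return null;
    }
}
```

A real implementation would run the shell commands lazily, since starting a login shell is comparatively slow.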
Matthias Sohn 097f01bfb6 Use LinkedBlockingQueue for executor determining filesystem attributes
Using a fixed thread pool with unbounded LinkedBlockingQueue fixes the
RejectedExecutionException thrown if too many threads try to
concurrently determine filesystem attributes.

Comparing that to an alternative implementation using an unbounded
thread pool showed similar performance with the reproducer (100 to
1000 threads) on my Mac:

threads   fixed pool (ms)   cached pool (ms)
100       1103              1108
200       1602              1591
300       2369              2299
500       4002              4577
1000      11071             11196

fixed pool:  fixed thread pool of up to 5 threads with a
             LinkedBlockingQueue of unlimited queue size
cached pool: unbounded cached thread pool

Bug: 564202
Change-Id: I773da7414a1dca8e548349442dca9b56643be946
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-07-24 00:31:03 +02:00
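
A fixed pool with an unbounded queue, as described in the commit above, can be set up with the standard java.util.concurrent classes. The pool size of 5 comes from the commit message; the keep-alive time, thread name, and the trivial probe task are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class AttributeProbeExecutorSketch {
    static ExecutorService newExecutor() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(5, 5,
                30L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), // unbounded: excess tasks wait, none are rejected
                r -> {
                    Thread t = new Thread(r, "fs-attr-probe");
                    t.setDaemon(true);
                    return t;
                });
        pool.allowCoreThreadTimeOut(true); // idle threads go away again
        return pool;
    }

    /** Submits n trivial tasks at once and returns how many completed. */
    static int runProbes(int n) {
        ExecutorService executor = newExecutor();
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int id = i;
                Callable<Integer> probe = () -> id;
                futures.add(executor.submit(probe));
            }
            int done = 0;
            for (Future<Integer> f : futures) {
                f.get();
                done++;
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the queue has no capacity limit, submit() never throws RejectedExecutionException while the executor is running; the trade-off is that a burst of tasks is held in memory until a worker is free.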
David Ostrovsky d35f0ffb7c Bazel: Add workspace status command to stamp final artifact
Include the implementation version in the JGit library. This version is
used by other products that depend on JGit and are built with Bazel
rather than consuming the officially released artifacts from Maven
Central or Eclipse's own Maven repository.

Most notably, the JGit agent in Gerrit Code Review, which was
previously reported as "unknown", is now reported as:

  JGit/v5.8.0.202006091008-r-16-g14c43828d

using this change [1].

[1] https://gerrit-review.googlesource.com/c/gerrit/+/272505

Change-Id: Ia50de9ac35b8dbe9e92d8ad7d0d14cd00f057863
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
2020-07-17 01:10:15 +02:00
Thomas Wolf 5332723729 DiffFormatter: correctly deal with tracked files in ignored folders
In JGit 5.0, the FileTreeIterator was changed to skip ignored folders
by default. To catch tracked files inside ignored folders, the tree
walk needs to have a DirCacheIterator, and the FileTreeIterator has
to know about that DirCacheIterator via setDirCacheIterator(). (Or
the optimization has to be switched off explicitly via
setWalkIgnoredDirectories(true).)

Skipping ignored directories is an important optimization in some
cases, for instance in node.js/npm projects, where we'd otherwise
traverse the whole huge and deep hierarchy of the typically ignored
node_modules folder.

While all uses of WorkingTreeIterator in JGit had been adapted,
DiffFormatter was forgotten. To make it work correctly (again) also
for such cases, make it set up a WorkingTreeIterator automatically,
and make sure the WorkingTreeSource can find such files, too. Also
pass the repository to the TreeWalks used inside the DiffFormatter
to pick up the correct attributes, filters, and line-ending settings.

Bug: 565081
Change-Id: Ie88ac81166dc396ba28b83313964c1712b6ca199
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-07-17 00:50:24 +02:00
Thomas Wolf 9b033a1b6d Fix writing GPG signatures with trailing newline
Make sure we don't produce a spurious empty line at the end.

Bug: 564428
Change-Id: Ib991d93fbd052baca65d32a7842f07f9ddeb8130
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-07-08 09:28:29 +02:00
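
One way a spurious empty line can arise, and be avoided, when writing a multi-line signature into a commit header is sketched below. Git stores such header values with each continuation line prefixed by a single space. The method name is hypothetical and this is not the JGit code touched by the commit above, only an illustration of the pitfall.

```java
final class GpgsigHeaderSketch {
    /**
     * Formats a signature as a "gpgsig" header. A naive
     * "gpgsig " + signature.replace("\n", "\n ") + "\n" would turn a trailing
     * newline into an extra line consisting of a single space.
     */
    static String gpgsigHeader(String signature) {
        String body = signature.endsWith("\n")
                ? signature.substring(0, signature.length() - 1)
                : signature;
        return "gpgsig " + body.replace("\n", "\n ") + "\n";
    }
}
```

With this handling, a signature with and without a trailing newline produce the same header.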
David Pursehouse 8774f54190 Improve error message when receive.maxCommandBytes is exceeded
The message "Too many commands" implies there is a hard limit on the
number of commands, which isn't the case. The limit is on the total
size of the received data, as explained in change I84317d396 which
introduced the configuration setting receive.maxCommandBytes:

  shorter reference names allow for more commands, longer reference
  names permit fewer commands per batch.

Change the message to:

  Commands size exceeds limit defined in receive.maxCommandBytes

Change-Id: I678b78f919b2fec8f8058f3403f2541c26a5d00e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-29 08:57:42 +09:00
Minh Thai 9719ca411e MergedReftable: Include the last reftable in determining minUpdateIndex
MergedReftable ignores the last reftable in the stack while calculating the
minUpdateIndex.

Update the loop indices to include all reftables in the minUpdateIndex
calculation, while skipping position 0 as it is read outside the loop.

Change-Id: I12d3e714581e93d178be79c02408a67ab2bd838e
Signed-off-by: Minh Thai <mthai@google.com>
2020-06-22 17:14:35 -07:00
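
The loop shape described in the MergedReftable commit above, reading position 0 outside the loop and then visiting every remaining table including the last one, can be sketched as follows. The array of per-table values and the result for an empty stack are assumptions for illustration.

```java
final class MinUpdateIndexSketch {
    static long minUpdateIndex(long[] tableMinUpdateIndexes) {
        if (tableMinUpdateIndexes.length == 0) {
            return 0; // assumption: an empty stack reports 0
        }
        long min = tableMinUpdateIndexes[0]; // position 0 is read outside the loop
        // Start at 1 and run up to and including the last table.
        for (int i = 1; i < tableMinUpdateIndexes.length; i++) {
            min = Math.min(min, tableMinUpdateIndexes[i]);
        }
        return min;
    }
}
```

A loop bound of length - 1 would silently skip the last table, which only shows up when that table holds the minimum.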
Yunjie Li b94758441d PackBitmapIndex: Not buffer inflated bitmap during bitmap creation.
Currently we're buffering the inflated bitmap entry in
BasePackBitmapIndex to optimize running time. However, this will use
lots of memory during the creation of the pack bitmap index file.

And change 161456, which rewrote the entire getBitmap method, increased
the fetch latency significantly.

This commit introduces a getBitmapWithoutCaching method, which is used
only during pack bitmap index file creation. It aims to save memory
during garbage collection without increasing fetch latency.

Change-Id: I7b982c9d4e38f5f6193eaa03894e894ba992b33b
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-06-18 12:36:42 -07:00
Matthias Sohn 13802cd592 Remove trailing whitespace
Change-Id: I1635b79c8051699a3a5e78a4cef8d2014be4e92c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-06-10 10:48:55 +02:00
Matthias Sohn 855842af19 Prepare 5.9.0-SNAPSHOT builds
Change-Id: Ia998e2772df1285a4c674b07201f15d53156eb78
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-06-09 12:51:12 +02:00
Matthias Sohn 001d747419 Merge "Merge branch 'stable-5.7'"
2020-06-05 16:56:32 -04:00
Jack Wickham 259d2540a3 Add getter for unpackErrorHandler in ReceivePack
The current mechanism for updating the unpack error handler requires
that the error handler is replaced entirely, including communicating
the error to the user. Adding a getter means that delegating
implementations can be constructed so that the error can be processed
before sending to the user, for example for logging.

Change-Id: I4b6f78a041d0f6f5b4076a9a5781565ca3857817
Signed-off-by: Jack Wickham <jwickham@palantir.com>
2020-06-05 15:25:22 -04:00
David Pursehouse 1b4b05d4a3 Merge branch 'stable-5.7'
* stable-5.7:
  ObjectDirectoryInserter: Open FileOutputStream in try-with-resource
  ObjectDirectoryInserter: Remove redundant 'throws' declarations
  ObjectDirectory: Further clean up insertUnpackedObject
  ObjectDirectory: Explicitly handle NoSuchFileException
  ObjectDirectory: Fail immediately when atomic move is not supported

Change-Id: I05186baa517388680fcc6825c940c4c772f26d32
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-05 17:09:38 +09:00
David Pursehouse c0c7f445f4 ObjectDirectoryInserter: Open FileOutputStream in try-with-resource
Change-Id: Icc569aeefdc79baee5dfb71fb34d881c561dcf52
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-05 14:59:54 +09:00
David Pursehouse f4f5d448b6 ObjectDirectoryInserter: Remove redundant 'throws' declarations
ObjectWritingException and FileNotFoundException are subclasses
of IOException, which is already declared. Error does not need
to be explicitly declared.

Change-Id: I879820a33e10ec3a7ef676adc9c9148d2b3c4b27
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-05 14:48:15 +09:00
David Pursehouse dac6801b47 ObjectDirectory: Further clean up insertUnpackedObject
- The code to move the file is repeated. Split it out into a
  utility method.

- Remove the catch block for AtomicMoveNotSupportedException which
  is redundant because it's handled in exactly the same way as the
  IOException further down. The only exception we need to explicitly
  handle differently in this block is NoSuchFileException.

- Improve the comments.

Change-Id: Ifc5490953ffb25ecd1c48a06289eccb3f19910c6
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-05 11:40:08 +09:00
Matthias Sohn aea95b819a Add Git#shutdown for releasing resources held by JGit process
The shutdown method releases
- ThreadLocal held by NLS
- GlobalBundleCache used by NLS
- Executor held by WorkQueue

Bug: 437855
Bug: 550529
Change-Id: Icfdccd63668ca90c730ee47a52a17dbd58695ada
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-06-04 23:51:59 +02:00
Thomas Wolf ed481f96b8 ApplyCommand: use context lines to determine hunk location
If a hunk does not apply at the position stated in the hunk header
try to determine its position using the old lines (context and
deleted lines).

This is still a far cry from a full git apply: it doesn't do binary
patches, it doesn't handle git's whitespace options, and it's perhaps
not the fastest on big patches. C git hashes the lines and uses these
hashes to speed up matching hunks (and to do its whitespace magic).

Bug: 562348
Change-Id: Id0796bba059d84e648769d5896f497fde0b787dd
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-06-04 22:16:12 +02:00
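
Locating a hunk by its old lines, as described in the ApplyCommand commit above, can be sketched like this. The commit message does not say in which order candidate positions are tried; picking the match nearest to the stated position is an assumption of this sketch, and all names are hypothetical.

```java
import java.util.List;

final class HunkLocatorSketch {
    /** Returns true if all old lines (context and deleted) match the file at pos. */
    static boolean matchesAt(List<String> file, List<String> oldLines, int pos) {
        if (pos < 0 || pos + oldLines.size() > file.size()) {
            return false;
        }
        for (int i = 0; i < oldLines.size(); i++) {
            if (!file.get(pos + i).equals(oldLines.get(i))) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the 0-based line index where the hunk applies, preferring the
     * position stated in the hunk header and otherwise the nearest match,
     * or -1 if the old lines are not found anywhere.
     */
    static int locate(List<String> file, List<String> oldLines, int statedPos) {
        int best = -1;
        for (int pos = 0; pos + oldLines.size() <= file.size(); pos++) {
            if (matchesAt(file, oldLines, pos)
                    && (best < 0 || Math.abs(pos - statedPos) < Math.abs(best - statedPos))) {
                best = pos;
            }
        }
        return best;
    }
}
```

This linear scan compares lines directly; hashing the lines first, as the commit says C git does, would make the matching cheaper on big patches.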
David Ostrovsky 0e87f70e0e Fix ProtectedMembersInFinalClass warning flagged by error prone
A recent Error Prone version complains about this code:

CharacterHead.java:22: error: [ProtectedMembersInFinalClass] Make
members of final classes package-private: <init>
	protected CharacterHead(char expectedCharacter) {
	          ^
    (see https://errorprone.info/bugpattern/ProtectedMembersInFinalClass)
  Did you mean 'CharacterHead(char expectedCharacter) {'

Bug: 562756
Change-Id: Ic46a0b07e46235592f6e63db631f583303420b73
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
2020-06-04 16:24:16 +02:00
David Pursehouse 949ee670c6 ObjectDirectory: Explicitly handle NoSuchFileException
On the first attempt to move the temp file, NoSuchFileException can
be raised if the destination folder does not exist. Instead of handling
this implicitly in the catch of IOException and then continuing to
create the destination folder and try again, explicitly catch it and
create the destination folder. If any other IOException occurs, treat
it as an unexpected error and return FAILURE.

Subsequently, on the second attempt to move the temp file, if ANY kind
of IOException occurs, also consider this an unexpected error and
return FAILURE.

In both catch blocks for IOException, add logging at ERROR level.

Change-Id: I9de9ee3d2b368be36e02ee1c0daf8e844f7e46c8
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-04 14:07:34 +09:00
David Pursehouse c2ab332e81 ObjectDirectory: Fail immediately when atomic move is not supported
If atomic move is not supported, AtomicMoveNotSupportedException will
be thrown on the first attempt to move the temp file. There is no
point attempting the move operation a second time because it will only
fail for the same reason.

Add an immediate return of FAILURE on the first occasion. Remove the
unnecessary handling of the exception in the second block.

Change-Id: I4658a8b37cfec2d7ef0217c8346e512968d0964c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2020-06-04 14:06:21 +09:00
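
The exception handling described in the two ObjectDirectory commits above can be sketched with the standard java.nio.file API. The Result enum, method names, and the demo are hypothetical simplifications; the real insertUnpackedObject handles more outcomes and logs errors, which is omitted here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicInsertSketch {
    enum Result { INSERTED, FAILURE }

    static Result insert(Path tmp, Path dst) {
        try {
            Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
            return Result.INSERTED;
        } catch (AtomicMoveNotSupportedException e) {
            return Result.FAILURE; // a second attempt would fail for the same reason
        } catch (NoSuchFileException e) {
            // Expected when the destination folder is missing: create it and retry.
        } catch (IOException e) {
            return Result.FAILURE; // anything else is unexpected
        }
        try {
            Files.createDirectories(dst.getParent());
            Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
            return Result.INSERTED;
        } catch (IOException e) {
            return Result.FAILURE; // on the second attempt, any IOException is fatal
        }
    }

    /** Moves a temp file into a not-yet-existing sub-folder and cleans up afterwards. */
    static Result demo() {
        try {
            Path root = Files.createTempDirectory("objects");
            Path tmp = Files.createTempFile(root, "insert_", ".tmp");
            Path dst = root.resolve("ab").resolve("cdef");
            Result result = insert(tmp, dst);
            boolean moved = Files.exists(dst);
            Files.deleteIfExists(dst);
            Files.deleteIfExists(dst.getParent());
            Files.deleteIfExists(tmp);
            Files.deleteIfExists(root);
            return moved ? result : Result.FAILURE;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both specific exceptions are subclasses of IOException, so their catch blocks have to come before the general one.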
David Ostrovsky 7861f82029 Fix InvalidInlineTag error flagged by error prone
A recent Error Prone version complains about this code:

RefDatabase.java:444: error: [InvalidInlineTag] Tag name `linkObjectId`
is unknown.
	 * Includes peeled {@linkObjectId}s. This is the inverse lookup of
	                   ^
    (see https://errorprone.info/bugpattern/InvalidInlineTag)

Bug: 562756
Change-Id: If91da51d5138fb753c0550eeeb9e3883a394123d
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
2020-06-01 22:51:37 -04:00
Matthias Sohn 8d2d683655 Decouple JSch from JGit Core
Motivation: JSch serves as the 'default' implementation of the SSH
transport. If a client application does not use it then there is no need
to pull in this dependency.

Move the classes depending on JSch to an OSGi fragment extending the
org.eclipse.jgit bundle and keep them in the same package as before
since moving them to another package would break API. Defer moving them
to a separate package to the next major release.

Add a new feature org.eclipse.jgit.ssh.jsch to enable
installation. With that users can now decide which of the ssh client
integrations (JCraft JSch or Apache Mina SSHD) they want to install.
We will remove the JCraft JSch integration in a later step due to the
reasons discussed in bug 520927.

Bug: 553625
Change-Id: I5979c8a9dbbe878a2e8ac0fbfde7230059d74dc2
Also-by: Michael Dardis <git@md-5.net>
Signed-off-by: Michael Dardis <git@md-5.net>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
2020-06-01 01:46:59 +02:00
Matthias Sohn 77848d635b Decouple BouncyCastle from JGit Core
Motivation: BouncyCastle serves as 'default' implementation of
the GPG Signer. If a client application does not use it there is no need
to pull in this dependency, especially since BouncyCastle is a large
library.

Move the classes depending on BouncyCastle to an OSGi fragment extending
the org.eclipse.jgit bundle. They are moved to a distinct internal
package in order to avoid split packages. This doesn't break public API
since these classes were already in an internal package before this
change.

Add a new feature org.eclipse.jgit.gpg.bc to enable installation. With
that users can now decide if they want to install it.

Attempts to sign a commit if org.eclipse.jgit.gpg.bc isn't available
will result in ServiceUnavailableException being thrown.

Bug: 559106
Change-Id: I42fd6c00002e17aa9a7be96ae434b538ea86ccf8
Also-by: Michael Dardis <git@md-5.net>
Signed-off-by: Michael Dardis <git@md-5.net>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
2020-06-01 01:26:22 +02:00
Thomas Wolf 0b2d41b858 Verify that the user home directory is valid
If the determination of the user home directory produces a Java File
object with an invalid path, spurious exceptions may occur at the
most inopportune moments anytime later. In the case in the linked bug
report, start-up of EGit failed, leading to numerous user-visible
problems in Eclipse.

So validate the return value of FS.userHomeImpl(). If converting that
File to a Path throws an exception, log the problem and fall back to
Java system property user.home. If that also is not valid, use null.

(A null user home directory is allowed by FS, and calling in Java
new File(null, "some_string") is fine and produces a File relative
to the current working directory.)

Bug: 563739
Change-Id: If9eec0f9a31a45bd815231706285c71b09f8cf56
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-31 12:47:21 -04:00
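
The validation and fallback chain from the commit above can be sketched with plain java.io and java.nio.file calls. The class and method names are hypothetical, the fallback is passed in as a string instead of being read from the user.home system property, and logging is omitted.

```java
import java.io.File;
import java.nio.file.InvalidPathException;

final class UserHomeSketch {
    /** A File is considered valid if it can be converted to a Path. */
    static boolean isValid(File f) {
        try {
            f.toPath();
            return true;
        } catch (InvalidPathException e) {
            return false;
        }
    }

    /**
     * @param detected         result of the platform-specific home detection
     * @param userHomeProperty value of the user.home system property, or null
     * @return a usable home directory, or null if neither candidate is valid
     */
    static File chooseHome(File detected, String userHomeProperty) {
        if (detected != null && isValid(detected)) {
            return detected;
        }
        if (userHomeProperty != null) {
            File fallback = new File(userHomeProperty);
            if (isValid(fallback)) {
                return fallback;
            }
        }
        return null; // callers then resolve paths relative to the working directory
    }
}
```

Checking once at determination time moves the failure to a single, well-defined place instead of an arbitrary later toPath() call.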
Thomas Wolf 089eacb273 WindowCache: conditional JMX setup
Make it possible to programmatically suppress the JMX bean
registration. In EGit it is not needed but can be rather costly
because it occurs during plug-in activation and accesses the
git user config.

Bug: 563740
Change-Id: I07ef7ae2f0208d177d2a03862846a8efe0191956
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-29 23:05:46 +02:00
Christian Halstrick c6213ad33a Merge "RawTextComparator.WS_IGNORE_CHANGE must not compare whitespace"
2020-05-28 08:07:02 -04:00
Thomas Wolf 6f17f9ed3f RawTextComparator.WS_IGNORE_CHANGE must not compare whitespace
Only the presence or absence of whitespace is significant; but not the
actual whitespace characters. Don't compare whitespace bytes.

Compare the C git implementation at [1].

[1] https://github.com/git/git/blob/0d0e1e8/xdiff/xutils.c#L173

Bug: 563570
Change-Id: I2d0522b637ba6b5c8b911b3376a9df5daa9d4c27
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-28 12:06:57 +02:00
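
The rule stated in the commit above, that only the presence or absence of whitespace matters and not the whitespace characters themselves, can be sketched as a two-pointer comparison. This works on Strings rather than raw bytes, ignores trailing whitespace, and uses hypothetical names; it is not the RawTextComparator code.

```java
final class WhitespaceChangeSketch {
    private static boolean isWs(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    /** Index just past the last non-whitespace character. */
    private static int trimmedEnd(String s) {
        int end = s.length();
        while (end > 0 && isWs(s.charAt(end - 1))) {
            end--;
        }
        return end;
    }

    static boolean equalIgnoringWhitespaceChange(String a, String b) {
        int ae = trimmedEnd(a);
        int be = trimmedEnd(b);
        int i = 0;
        int j = 0;
        while (i < ae && j < be) {
            boolean wa = isWs(a.charAt(i));
            boolean wb = isWs(b.charAt(j));
            if (wa && wb) {
                // Both sides have whitespace here: skip the runs without comparing them.
                while (i < ae && isWs(a.charAt(i))) {
                    i++;
                }
                while (j < be && isWs(b.charAt(j))) {
                    j++;
                }
            } else if (wa != wb || a.charAt(i) != b.charAt(j)) {
                return false; // whitespace on one side only, or differing text
            } else {
                i++;
                j++;
            }
        }
        return i == ae && j == be;
    }
}
```

Under this rule "a  b" and "a<TAB>b" compare equal, while "ab" and "a b" do not.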
Yunjie Li 06a90fdf2e Revert "PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex"
This reverts commit 3aee92478c, which
increased fetch latency significantly.

Change-Id: Id31a94dff83bf7ab2121718ead819bd08306a0b6
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-05-27 10:31:54 -07:00
Thomas Wolf 3a499606b1 Builder API to configure SshdSessionFactories
A builder API provides a more convenient way to define a customized
SshdSessionFactory by hiding the subclassing.

Also provide a new interface SshConfigStore to abstract away the
specifics of reading a ssh config file, and provide a way to customize
the concrete ssh config implementation to be used. This facilitates
using an alternate ssh config implementation that may or may not be
based on files.

Change-Id: Ib9038e8ff2a4eb3a9ce7b3554d1450befec8e1e1
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-23 16:46:22 +02:00
Thomas Wolf bdb7357228 TransportHttp: abort on time-out or on SocketException
Avoid trying other authentication methods on SocketException or on
InterruptedIOException. SocketException is rather fatal, such as
nothing listening on the peer's port, connection reset, or it could
be a connection time-out.

Time-outs enforced by Timeout{Input,Output}Stream may result in
InterruptedIOException being thrown.

In both cases, it makes no sense to try other authentication methods,
and doing so may wrongly report "authentication not supported" or
"cannot open git-upload-pack" or some such instead of reporting a
time-out.

Bug: 563138
Change-Id: I0191b1e784c2471035e550205abd06ec9934fd00
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-23 11:06:10 +02:00
Thomas Wolf 3dbd1f2fe7 Ignore core.eol if core.autocrlf=input
Config core.eol is to be ignored if core.autocrlf is true or input.[1]
JGit didn't do so when core.autocrlf=input was set.

[1] https://git-scm.com/docs/git-config#Documentation/git-config.txt-coreeol

Bug: 561877
Change-Id: I5e62e0510d160b5113c1090319af09c2bc1bcb59
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-22 17:09:23 -04:00
Thomas Wolf 3c34e0acbf Attributes: fix handling of text=auto in combination with eol
In Git 2.10.0 the interpretation of gitattributes changed or was fixed
such that "* text=auto eol=crlf" would indeed still do auto-detection
of text vs. binary content.[1] Previously this was identical to
"* text eol=crlf", i.e., treating all files as text.

JGit still did the latter, which caused surprises because it changed
binary files.

[1] https://github.com/git/git/blob/master/Documentation/RelNotes/2.10.0.txt#L248

Bug: 561341
Change-Id: I5b6fb97b5e86fd950a98537b6b8574f768ae30e5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-22 17:08:52 -04:00
Thomas Wolf 4d7a16257f Include full IssuerFingerprint in GPG signature
Update dependency to Bouncy Castle to 1.65.

Add the IssuerFingerprint as a hashed sub-packet in the signature. If
added unhashed, GPG ignores it.

Bug: 553206
Change-Id: I6807e8e2385e6ec5790f388e4753a44aa9474ebb
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-05-18 23:25:58 +02:00
Matthias Sohn d0f010dd26 Suppress API error for new method BitmapIndex.Bitmap#retrieveCompressed
OSGi semantic versioning allows breaking implementers in a minor
release.

Change-Id: Ib55dc43dd3b50b0ef39a7094190f230210aee4b6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-05-17 23:11:31 +02:00
Matthias Sohn 91188a7d82 Fix wrong @since tags added in dcb0265
This change was introduced in 5.8.

Change-Id: Ic74ebff5a0547bb55e0401b38f73ebc6e67cace9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-05-17 23:08:27 +02:00
Terry Parker 55b0203c31 Merge changes I39783eee,I874503ec,Ic942a8e4,I6ec2c3e8,I62cb5030, ...
* changes:
  PackBitmapIndex: Set distance threshold
  PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex
  PackBitmapIndex: Remove convertedBitmaps in the Remapper
  PackBitmapIndex: Reduce memory usage in GC
  PackBitmapIndex: Add AddToBitmapWithCacheFilter class
  PackBitmapIndex: Add util methods and builder to BitmapCommit
  PackBitmapIndex: Move BitmapCommit to a top-level class
  Refactor: Make retriveCompressed an method of the Bitmap class
2020-05-13 16:34:23 -04:00
Yunjie Li 913234e2ec PackBitmapIndex: Set distance threshold
Set the distance threshold to 2000 in PackWriterBitmapPreparer to
reduce memory usage in garbage collection. With a threshold of 0, GC
for the msm repository uses about 37 GB of memory. With a threshold
of 2000, GC finishes in 75 min using about 10 GB of memory.

Change-Id: I39783eeecbae58261c883735499e61ee1cac75fe
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-05-12 17:32:15 -07:00
Yunjie Li 3aee92478c PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex
Currently we're buffering the inflated bitmap entry in BasePackBitmapIndex
to optimize running time. However, this will use lots of memory during
the construction of the pack bitmap index file which may cause failure of
garbage collection.

The running time did not increase significantly, if at all, after
removing the buffering here. A report on time and memory usage will
come in the next commit.

Change-Id: I874503ecc85714acab7ca62a6a7968c2dc0b56b3
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-05-12 17:32:15 -07:00
Yunjie Li e250482c7a PackBitmapIndex: Remove convertedBitmaps in the Remapper
The convertedBitmaps field is meant as a time optimization, but it
saves little time while using lots of memory. So remove the field to
save memory.

Currently the remapper class is only used in the construction of the
bitmap index file. During the preparation of the file, we only get
bitmaps from the remapper when finding objects accessible from a
commit, so the bitmap associated with each commit is fetched only once.
The convertedBitmaps would therefore hardly ever be read, which means
it is not saving time.

Change-Id: Ic942a8e485135fb177ec21d09282d08ca6646fdb
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-05-12 17:32:15 -07:00
Yunjie Li dcb0265436 PackBitmapIndex: Reduce memory usage in GC
Currently, garbage collection consistently fails for some large
repositories in the bitmap building phase, e.g. the Linux-MSM project:
https://source.codeaurora.org/quic/la/kernel/msm-3.18

Historically, bitmap index creation happened in 3 phases:
1. Select the commits to which bitmaps should be attached.
2. Create all bitmaps for these commits, stored in uncompressed format
in the PackBitmapIndexBuilder.
3. Deltify the bitmaps and write them to disk.

We investigated the process. For phase 2 it's most efficient to create
bitmaps starting with oldest commit and moving to the newest commit,
because the newer commits are able to reuse the work for the old ones.
But for bitmap deltification in phase 3, it's better when a newer
commit's bitmap is the base, and the current disk format writes bitmaps
out for the newest commits first.

This change introduces a new collection to hold the deltified and
compressed representations of the bitmaps, keeping a smaller subset of
commits in the PackBitmapIndexBuilder to help make the bitmap index
creation more memory efficient.

In this commit, DISTANCE_THRESHOLD is set to 0 in
PackWriterBitmapPreparer, which means garbage collection will not
change much in behavior and will still use as much memory as before.

Change-Id: I6ec2c3e8dde11805af47874d67d33cf1ef83660e
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-05-12 17:32:15 -07:00