Commit Graph

4898 Commits

Author SHA1 Message Date
Alexa Panfil 4f3161d3cc Fix bug in PerformanceLogContext
PerformanceLogContext threw NullPointerException when multiple threads
tried to add an event to the PerformanceLogContext. The cause for this
is that the ThreadLocal was initialized only in the class constructor
for the first thread; for subsequent threads it was null.

To fix this initialize eventRecords for each thread.

Change-Id: I18ef67dff8f0488e3ad28c9bbc18ce73d5168cf9
Signed-off-by: Alexa Panfil <alexapizza@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-11-06 19:20:08 -04:00
Nail Samatov d76088bca6 Fix IOException occurring during gc
Fix IOException occurring when calling
GC on a repository with absent objects/pack folder.

Change-Id: I5be1333a0726f4d7491afd25ddac85451686c30a
Signed-off-by: Nail Samatov <sanail@yandex.ru>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-11-05 01:31:14 +01:00
Thomas Wolf d69fb4d4ac Revert "Client-side protocol V2 support for fetching"
This reverts commit f802f06e7f.

I had misunderstood how protocol V2 works. This implementation only
works if the negotiation during fetch is done in one round.

Fixing this is substantial work in BasePackFetchConnection. Basically
I think I'd have to change back negotiate to the V0 version, and have
a doFetch() that does

  if protocol V2
    doFetchV2()
  else
    doFetchV0()

with doFetchV0 the old code, and doFetchV2 completely new.

Plus there would need to be a HTTP test case requiring several
negotiation rounds.

This is a couple of days work at least, and I don't know when I will
have the time to revisit this. So although the rest of the code is
fine I prefer to back this out completely and not leave a only half
working implementation in the code for an indeterminate time.

Bug: 553083
Change-Id: Icbbbb09882b3b83f9897deac4a06d5f8dc99d84e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-11-03 23:50:21 +01:00
Thomas Wolf f802f06e7f Client-side protocol V2 support for fetching
Make all transports request protocol V2 when fetching. Depending on
the transport, set the GIT_PROTOCOL environment variable (file and
ssh), pass the Git-Protocol header (http), or set the hidden
"\0version=2\0" (git anon). We'll fall back to V0 if the server
doesn't reply with a version 2 answer.

A user can control which protocol the client requests via the git
config protocol.version; if not set, JGit requests protocol V2 for
fetching. Pushing always uses protocol V0 still.

In the API, there is only a new Transport.openFetch() version that
takes a collection of RefSpecs plus additional patterns to construct
the Ref prefixes for the "ls-refs" command in protocol V2. If none
are given, the server will still advertise all refs, even in protocol
V2.

BasePackConnection.readAdvertisedRefs() handles falling back to
protocol V0. It newly returns true if V0 was used and the advertised
refs were read, and false if V2 is used and an explicit "ls-refs" is
needed. (This can't be done transparently inside readAdvertisedRefs()
because a "stateless RPC" transport like TransportHttp may need to
open a new connection for writing.)

BasePackFetchConnection implements the changes needed for the protocol
V2 "fetch" command (simplified ACK handling, delimiters, section
headers).

In TransportHttp, change readSmartHeaders() to also recognize the
"version 2" packet line as a valid smart server indication.

Adapt tests, and run all the HTTP tests not only with both HTTP
connection factories (JDK and Apache HttpClient) but also with both
protocol V0 and V2. Do the same for the SSH transport tests.

Bug: 553083
Change-Id: Ice9866aa78020f5ca8f397cde84dc224bf5d41b4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-10-29 00:36:21 +01:00
John Dallaway f4b4dae2be Ensure .gitmodules is loaded when accessing submodule name
This problem occurred when calling SubmoduleWalk#getModuleName if the
first submodule processed has a name and a path which do not match
SubmoduleWalk#getModuleName returned the module path instead of the
module name. In order to fix this SubmoduleWalk#getModuleName needs to
ensure that the modules config is loaded.

Bug: 565776
Change-Id: I36ce1fbc64c4849f9d8e39864b825c6e28d344f8
Signed-off-by: John Dallaway <john@dallaway.org.uk>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-10-25 22:24:17 +01:00
Matthias Sohn e23bdf1c7c Ensure GC.deleteOrphans() can delete read-only orphaned files on Windows
Reported in https://www.eclipse.org/lists/jgit-dev/msg04005.html

Change-Id: I140a49a7f65a76aa2b67ec8d286a3d2506ae499a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-10-22 09:25:55 +02:00
Terry Parker dd593205a0 Merge "Add new performance logging" 2020-10-21 11:53:35 -04:00
Alexa Panfil a09e205176 Add new performance logging
Add new performance logging to register events of type duration.
The proposed logging is similar to the performance logging
in OS Gerrit https://gerrit-review.googlesource.com/c/gerrit/+/225628:
a global Singleton (LoggingContext in Gerrit) is
collecting the performance logs in a thread-safe events list,
and at the end of the monitored command the list of events is
retrieved and written to a log, after which it is cleared.

What this patch does:
The main component is the Singleton (PerformanceLogContext), which
is used to collect the records (PerformanceLogRecord) in one place
(ThreadLocal eventsList) from anywhere using
PerformanceLogContext.getInstance().addEvent().

Reason why this change is needed:
The current monitoring in JGit has several issues:
1. git fetch and git push events are handled separately
(PackStatistics and ReceivedPackStatistics), with no unified way
of writing or reading the statistics.
2. PostUploadHook is only invoked on the event of sending the
pack, which means that the PackStatistics is not available for
the fetch requests that did not end with sending the pack
(negotiation requests).
3. The way the logs are created is different from the performance
log approach, so the long-running operations need to be collected
from both performance log (for JGit DFS overridden operations and
Gerrit operations) and gitlog (for JGit ones).

The proposed performance logging is going to solve the above
mentioned issues: it collects all of the performance logs in one
place, thus accounting for the commands that do not result in
sending a pack. The logs are compatible with the ones on Gerrit.
Moreover, the Singleton is accessible anywhere in the call stack,
which proved to be successful in other projects like Dapper
(https://research.google/pubs/pub36356/).

Signed-off-by: Alexa Panfil <alexapizza@google.com>
Change-Id: Iabfe667a3412d8a9db94aabb0f39b57f43469c41
2020-10-21 12:54:30 +00:00
Jonathan Tan de914b7caf Merge "Compute time differences with Duration" 2020-10-19 12:25:57 -04:00
Christian Halstrick 88e924e86b Merge "Implement git describe --all" 2020-10-13 08:14:14 -04:00
Jason Yeo 276fcb2a11 Implement git describe --all
This enables jgit to use any refs in the refs/ namespace when describing
commits.

Signed-off-by: Jason Yeo <jasonyeo88@gmail.com>
Change-Id: I1fa22d1c39c0e2f5e4c2938c9751d8556494ac26
2020-10-13 18:06:39 +08:00
Alexa Panfil a0d3680f49 Compute time differences with Duration
Reason why this change is needed:
Currently the durations of fetch events are computed by
registering time instants with System.currentTimeMillis()
and calculating the differences with simple minus operation,
but multiple sources suggest that the best practice is to use
the Java 8 Duration and Instant objects.

What this patch does:
Get time measurements with Instant.now() instead of
System.currentTimeMillis() and calculate the duration of fetch
events (Reachability checks and Negotiation) using
Duration.between().toMillis().

Signed-off-by: Alexa Panfil <alexapizza@google.com>
Change-Id: I573a0a0562070083cf5a5a196d9710f69a7fa587
2020-10-09 18:55:20 +00:00
Thomas Wolf f37aa182e1 Override config http.userAgent from environment GIT_HTTP_USER_AGENT
According to [1], environment variable GIT_HTTP_USER_AGENT can
override a git config http.userAgent.

[1] https://git-scm.com/docs/git-config#Documentation/git-config.txt-httpuserAgent

Change-Id: I996789dc49faf96339cd7b4e0a682b9bcafb6f70
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-10-06 19:03:36 -04:00
David Ostrovsky f3bb71206c Fix OperatorPrecedence warning flagged by error prone
Building with Bazel is failing with this error message:

HttpSupport.java:480: error: [OperatorPrecedence] Use grouping
parenthesis to make the operator precedence explicit
    if (c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z') {
                             ^
    (see https://errorprone.info/bugpattern/OperatorPrecedence)
Did you mean 'if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {'?

Change-Id: I96089d3158d06ba981cfdf2b03865261b328de23
2020-10-02 18:52:10 +02:00
Matthias Sohn 7fbc0e02a0 ObjectDirectory#selectObjectRepresentation: fix formatting
Change-Id: I3872f3001bb11e29a526ed90184ccb0c991b8567
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-10-02 00:55:04 +02:00
James Wynn 2171f868d9 Support "http.userAgent" and "http.extraHeader" from the git config
Validate the extra headers and log but otherwise ignore invalid
headers. An empty http.extraHeader starts the list afresh.

The http.userAgent is restricted to printable 7-bit ASCII, other
characters are replaced by '.'.

Moves a support method from the ssh.apache bundle to HttpSupport in
the main JGit bundle.

Bug:541500
Change-Id: Id2d8df12914e2cdbd936ff00dc824d8f871bd580
Signed-off-by: James Wynn <james@jameswynn.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-09-26 23:42:38 +02:00
Thomas Wolf 6f41458d8a Add TypedConfigGetter.getPath()
This enables EGit to override with a lenient variant that logs the
problem and continues with the default value. EGit needs this because
otherwise a corrupt core.excludesFile entry can render the whole UI
unusable (until the git config is fixed) because any use of the
WorkingTreeIterator will throw an InvalidPathException.

This is not a problem on OS X, where all characters are allowed in
file names. But on Windows some characters are forbidden... see bug
567296. The message of the InvalidPathException is not helpful since
it doesn't point to the origin of the problem. EGit can log a much
better message indicating the offending config file and entry in the
Eclipse error log, where the user can see it.

Bug: 567309
Change-Id: I4e57afa715ff3aaa52cd04b5733f69e53af5b1e0
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-09-25 18:40:10 -04:00
Alexa Panfil 597a6bc7ec Make Javadoc consistent for PackStatistics fields
Change-Id: I53d4072174e9fec92bff7c0225aa7470d8d39d05
2020-09-24 19:41:55 +00:00
Alexa Panfil 3c5e159eae Measure time taken for reachability checks
Reason why this change is needed:
Getting this metric will help estimate how much time will be saved once
the reachability checks get optimized

What this patch does:
Measure time spent by requestValidator.checkWants() in parseWants() and save
it in an instance of PackStatistics.Accumulator.

Signed-off-by: Alexa Panfil <alexapizza@google.com>
Change-Id: Id7fe4016f96549d9511a2c24052dad93cfbb31a4
2020-09-24 12:32:02 +00:00
Terry Parker be403859c1 Merge "Measure time taken for negotiation in protocol V2" 2020-09-22 11:34:24 -04:00
Alexa Panfil c3458a6e03 Measure time taken for negotiation in protocol V2
Reason why this change is needed:
Getting this metric will help estimate how much time is spent
on negotiation in fetch V2.

What this patch does:
Measure time spent on negotiation rounds in UploadPack.fetchV2()
and save it in an instance of PackStatistics.Accumulator.
This is the same way the statistics are already gathered for
protocol V0/V1.

Change-Id: I14e55dd6ff743cb0b21b4953b533269ef069abb1
2020-09-22 10:05:03 +00:00
Thomas Wolf cb553e3583 IndexDiffFilter: handle path prefixes correctly
When comparing git directory paths to check whether one is a prefix
of another, one must add a slash to avoid false prefix matches when
one directory name is a prefix of another. The path "audio" is not
a prefix of the path "audio-new", but would be a prefix of a path
"audio/new".

Bug: 566799
Change-Id: I6f671ca043c7c2c6044eb05a71dc8cca8d0ee040
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-09-21 16:52:43 -04:00
Thomas Wolf 566e49d7d3 sshd: support the ProxyJump ssh config
This is useful to access git repositories behind a bastion server
(jump host).

Add a constant for the config; rewrite the whole connection initiation
to parse the value and (recursively) set up the chain of hops. Add
tests for a single hop and two different ways to configure a two-hop
chain.

The connection timeout applies to each hop in the chain individually.

Change-Id: Idd25af95aa2ec5367404587e4e530b0663c03665
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-09-19 15:17:00 -04:00
Terry Parker 292919b12a Merge "ReceivePackStats: Add size and count of unnecessary pushed objects" 2020-09-14 09:53:43 -04:00
Yunjie Li 58e991b5de ReceivePackStats: Add size and count of unnecessary pushed objects
Since there is no negotiation for a push, the client is probably sending
redundant objects and bytes which already exist in the server.

Add more metrics in the stats to quantify it. Duplicated size and number
to measure the size and the number of duplicated objects which should
not be pushed.

Change-Id: Iaacd4761ee9366a0a7ec4e26c508eff45c8744de
Signed-off-by: Yunjie Li <yunjieli@google.com>
2020-09-11 16:19:57 -07:00
Matthias Sohn 94aab2f634 Add missing since tag on BundleWriter#addObjectsAsIs
Change-Id: Ia07c0ede31e280831869db77b70b43874d8ba3ac
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-09-05 23:25:02 +02:00
Masaya Suzuki 9d2055152c jgit: Add DfsBundleWriter
DfsBundleWriter writes out the entire repository to a Git bundle file.
It packs all objects included in the packfile by concatenating all pack
files. This makes the bundle creation fast and cheap. Useful for backing
up a repository as-is.

Change-Id: Iee20e4b1ab45b2a178dde8c72093c0dd83f04805
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
2020-09-03 22:58:37 +00:00
Terry Parker 957419610a Merge changes from topic "fix_ui"
* changes:
  ResolveMerger: do not content-merge gitlinks on del/mod conflicts
  ResolveMerger: Adding test cases for GITLINK deletion
  ResolveMerger: choose OURS on gitlink when ignoreConflicts
  ResolveMerger: improving content merge readability
  ResolveMerger: extracting createGitLinksMergeResult method
  ResolveMerger: Adding test cases for GITLINK merge
2020-09-03 18:35:24 -04:00
Matthias Sohn 50e81a52c0 [errorprone] DirCacheEntry: make clear operator precedence
See https://errorprone.info/bugpattern/OperatorPrecedence

Change-Id: I96f2c844a7b3d9d5605122118d9420bde31ec2f0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-09-01 17:51:45 +02:00
Matthias Sohn 1d95f9cb8f [errorprone] PackWriter#parallelDeltaSearch: avoid suppressed exception
See https://errorprone.info/bugpattern/Finally

Change-Id: Ic2ad0d1e1ba7552b5a5c6238f83c0e13a94254d0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-09-01 17:51:45 +02:00
Matthias Sohn 04622016ea [errorprone] Declare DirCache#version final
Enums values should be immutable, see
https://errorprone.info/bugpattern/ImmutableEnumChecker.

Change-Id: Ib0a358d3a5f1560ca73ec3153ca8088fe7a35eb6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-09-01 17:51:44 +02:00
Demetr Starshov 214c4afc2c ResolveMerger: do not content-merge gitlinks on del/mod conflicts
Previously ResolveMerger tried to make a fulltext merge entry in case
one of sides got deleted regardless of file mode. This is not
applicable for GITLINK type of entry. After this change it is
rendering appropriate merge result.

Signed-off-by: Demetr Starshov <dstarshov@google.com>
Change-Id: Ibdb4557bf8781bdb48bcee6529e37dc80582ed7e
2020-08-26 18:40:28 -07:00
Demetr Starshov c084729f79 ResolveMerger: choose OURS on gitlink when ignoreConflicts
Option ignoreConflicts is used when a caller want to create a virtual
commit and use it in a future merge (recursive merge) or show it on
UI (e.g. Gerrit). According to contract in case of ignoreConflicts
ResolveMerger should populate only stage 0 for entries with merge
conflicts as if there is no conflict. Current implementation breaks
this contract for cases when gitlink revision is ambiguous.

Therefore, always select 'ours' when we merge in ignoreConflicts mode.
This will satisfy the contract contract, so recursive merge can
succeed, however it is an arbitrary decision, so it is not guaranteed
to select best GITLINK in all cases.

GITLINK merging is a special case of recursive merge because of
limitations of GITLINK type of entry. It can't contain more than 1 sha-1
so jgit can't write merge conflicts in place like it can with a blob.
Ideally we could signal the conflict with a special value (like
'0000...'), but that must be supported by all tooling (git fsck, c-git)."

Signed-off-by: Demetr Starshov <dstarshov@google.com>
Change-Id: Id4e9bebc8e828f7a1ef9f83259159137df477d89
2020-08-26 18:39:48 -07:00
Demetr Starshov 29e998a0e6 ResolveMerger: improving content merge readability
Separate "GITLINK conflict" and "attributes can't be content merged"
cases.

Signed-off-by: Demetr Starshov <dstarshov@google.com>
Change-Id: I29424e13ea1738af750196e7bf4315256a6095b6
2020-08-26 18:39:48 -07:00
Demetr Starshov 3da7ea50a9 ResolveMerger: extracting createGitLinksMergeResult method
Signed-off-by: Demetr Starshov <dstarshov@google.com>
Change-Id: Ibc8b954266b1b4b9b9f404e3433f0d7cdae107e8
2020-08-26 18:39:48 -07:00
Marc Strapetz 0220f32e5a Fix possible NegativeArraySizeException in PackIndexV1
Due to an integer overflow bug, the current "Index file is too large
for jgit" check did not work properly and subsequently a
NegativeArraySizeException was raised.

Change-Id: I2736efb28987c29e56bc946563b7fa781898a94a
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
2020-08-25 12:42:53 -04:00
Thomas Wolf 2990ad66ad FS: use binary search to determine filesystem timestamp resolution
Previous code used a minimum granularity of 1 microsecond and would
iterate 233 times on a system where the resolution is 1 second (for
instance, Java 8 on Mac APFS).

New code uses a binary search between the maximum we care about (2
seconds) and zero, with a minimum granularity of also 1 microsecond.
This takes at most 19 iterations (guaranteed). For a file system with 1
second resolution, it takes 4 iterations (1s, 0.5s, 0.8s, 0.9s). With
an up-front check at 1 microsecond and at 1 millisecond this performs
equally well as the old code on file systems with a fine resolution.
(For instance, Java 11 on Mac APFS.)

Also handle obscure cases where the file timestamp implementation may
yield bogus values (as observed on HP NonStop). If such an error case
occurs, log a warning and abort the measurement at the last good value.

Bug: 565707
Change-Id: I82a96729b50c284be7c23fbdf3d0df1bddf60e41
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-23 11:39:24 +02:00
Matthias Sohn e9c7ba6fdc Do not prematurely create directory of jgit's XDG config file
LockFile.lock() will create it anyway when the config file is created.

Bug: 565637
Change-Id: I078b89a695193fd76f130f6de7ac1cf26d2f8f0f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-22 22:57:14 +02:00
Thomas Wolf b6ca1993b0 FS: write to JGit config in a background thread
Otherwise locking problems may ensue if the JGit config itself is
on a different file system. Since the internal is already updated,
it is not really important when exactly the value gets persisted.
By queueing up separate Runnables executed by a single thread we
avoid concurrent write access to the JGit config, and nested calls
to getFileStoreAttributes(Path) result in serialized attempts to
write.

The thread for writing the config must not be a daemon thread. If
it were, JVM shutdown might kill it anytime, which may lead to
the config not being written, or worse, a config.lock file being
left behind.

Bug: 566170
Change-Id: I07e3d4c5e029d3cec9ab5895002fc4e0c7948c40
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-21 09:00:06 +02:00
Thomas Wolf eec9b55dcf FS: don't cache fallback if running in background
If the background job is a little late, the true result might
arrive and be cached later. So make sure we don't cache the large
fallback resolution in the per-directory cache. Otherwise we'd work
with the large fallback until the next restart.

Bug: 566170
Change-Id: I7354a6cfddfc0c05144bb0aa41c23029bd4f6af0
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-20 22:17:40 +02:00
Thomas Wolf efd1cc05af Keep line endings for text files committed with CR/LF on text=auto
Git never converts line endings if the version in the repository is a
text file with CR/LF and text=auto. See [1]: "When the file has been
committed with CRLF, no conversion is done."

Because the sentence just before is about converting line endings on
check-in, I had understood that in commit 60cf85a [2] to mean that no
conversion on check-in was to be done. However, as bug 565048 and a
code inspection of the C git code showed it really means no conversion
is done on check-in *or check-out*.

If the text attribute is not set but core.autocrlf = true, this is
the same as text=auto eol=crlf. C git does not convert on check-out
even on text=auto eol=lf if the index version is a text file with
CR/LF.

For check-in, one has to look at the intended target, which is done
in WorkingTreeIterator since commit 60cf85a. For check-out, it can
be done by looking at the source and can thus be done in the
AutoLFOutputStream.

Additionally, provide a constructor for AutoLFInputStream to do
the same; for cases where the equivalent of a check-out is done via
an input stream obtained from a blob. (EGit does that in its
GitBlobStorage for the Eclipse compare framework; it's more efficient
than using a TemporaryBuffer and DirCacheCheckout.getContent(), and
it avoids the need for a temporary file.)

Adapt existing tests, and add new checkout and merge tests to verify
the resulting files have the correct line endings.

EGit's GitBlobStorage will need to call the new version of
EolStreamTypeUtil.wrapInputStream().

[1] https://git-scm.com/docs/gitattributes#Documentation/gitattributes.txt-Settostringvalueauto
[2] https://git.eclipse.org/r/c/jgit/jgit/+/127324

Bug: 565048
Change-Id: If1282ef43e2abd00263541bd10a01fe1f5c619fc
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-17 08:52:55 +02:00
Thomas Wolf 71aeedb6ec Delay WindowCache statistics JMX MBean registration
The WindowCache is configured statically with a default
WindowCacheConfig. The default config says (for backwards
compatibility reasons) to publish the MBean. As a result,
the MBean always gets published.

By delaying the MBean registration until the first call to
getInstance() or get(PackFile, long) we can avoid the forced
registration and do it only if not re-configured in the meantime
not to publish the bean. (As is done by Egit, to avoid a very
early costly access to the user and system config during plug-in
activation.)

Bug: 563740
Change-Id: I8a941342c0833acee2107515e64299aada7e0520
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-16 14:41:21 +02:00
Thomas Wolf e9cb0a8e47 DirCache: support index V4
Index format version 4 was introduced in C git in 2012. It's about
time that JGit can deal with it.

Version 4 added prefix path compression. Instead of writing the full
path for each index entry to disk, only the difference to the previous
entry's path is written: a variable-encoded int telling how many bytes
to remove from the previous entry's path to get the common prefix,
followed by the new suffix.

Also, cache entries in a version 4 index are not padded anymore.

Internally, version 3 and version 4 index entries are identical; it's
only the stored format that changes.

Implement this path compression, and make sure we write an index file
that we read previously in the same format. (Only changing from version
2 to version 3 if there are extended flags.)

Add support for the "feature.manyFiles" and the "index.version" git
configs, and honor them when writing a new index file.

Add tests, including a compatibility test that verifies that JGit can
read a version 4 index generated by C git and write an identical
version 4 index.

Bug: 565774
Change-Id: Id83241cf009e50f950eb42f8d56b834fb47da1ed
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-15 12:47:45 +02:00
Thomas Wolf 72b111ecd7 Update javadoc for RemoteSession and SshSessionFactory
The timeout on RemoteSession.exec() cannot be a timeout for the
whole command. It can only be a timeout for setting up the process;
after that it's the application's responsibility to implement some
timeout for the execution of the command, for instance by calling
Process.waitFor(int, TimeUnit) or through other means.

Sessions returned by an SshSessionFactory are already connected and
authenticated -- they must be, because RemoteSession offers no
operations for connecting or authenticating a session.

Change the implementation of SshdExecProcess.waitFor() to wait
indefinitely. The original implementation used the timeout from
RemoteSession.exec() because of that erroneous javadoc.

Change-Id: I3c7ede24ab66d4c81f72d178ce5012d383cd826e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-10 22:51:34 +02:00
Thomas Wolf 24fdc1d039 Fix JSchProcess.waitFor() with time-out
SshSupport.runSshCommand() had a comment that wait with time-out
could not be used because JSchProcess.exitValue() threw the wrong
unchecked exception when the process was still running.

Fix this and make JSchProcess.exitValue() throw the right exception,
then wait with a time-out in SshSupport.

The Apache sshd client's SshdExecProcess has always used the correct
IllegalThreadStateException.

Add tests for SshSupport.runCommand().

Change-Id: Id30893174ae8be3b9a16119674049337b0cf4381
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-08-10 22:51:34 +02:00
Jonathan Nieder 86aa6deff4 FilterSpec: Use BigInteger.ZERO instead of valueOf(0)
This just simplifies a bit by avoiding an unneeded method call.

Change-Id: I6d8d2fc512d8f8a82da73c355017d0abf833a13b
2020-07-31 19:01:35 -07:00
Jonathan Nieder 3c807e0158 Do not send empty blob in response to blob:none filter
If I create a repository containing an empty file and clone it
with

	git clone --no-checkout --filter=blob:none \
		https://url/of/repository

then I would expect no blobs to be transferred over the wire.  Alas,
JGit rewrites filter=blob:none to filter=blob:limit=0, so if the
repository contains an empty file then the empty blob gets
transferred.

Fix it by teaching JGit about filters based on object type to
complement the existing filters based on object size.  This prepares
us for other future filters such as object:none.

In particular, this means we do not need to look up the size of the
filtered blobs, which should speed up clones.  Noticed by Anna
Pologova and Terry Parker.

Change-Id: Id4b234921a190c108d8be2c87f54dcbfa811602a
Signed-off-by: Jonathan Nieder <jrn@google.com>
2020-07-29 21:04:20 -07:00
Jonathan Nieder dceedbcd6e Add support for tree filters when fetching
Teach the FilterSpec serialization code about tree filters so they can
be communicated over the wire and understood by the server.

While we're here, harden the FilterSpec serialization code to throw
IllegalStateException if we encounter a FilterSpec that cannot be
expressed as a "filter" line.  The only public API for creating a
Filterspec is to pass in a "filter" line to be parsed, so these should
not appear in practice.

Change-Id: I9664844059ffbc9c36eb829e2d860f198b9403a0
Signed-off-by: Jonathan Nieder <jrn@google.com>
2020-07-29 20:52:12 -07:00
Thomas Wolf 9fe5406119 FS_POSIX: avoid prompt to install the XCode tools on OS X
OS X ships with a default /usr/bin/git that is just a wrapper that
at run-time delegates to the selected XCode toolchain, and that
prompts the user to install the XCode command line tools if not
already installed.

This is annoying for people who don't want to do so, since they'll
be prompted on each Eclipse start. Also, since on OS X the $PATH for
applications started via the GUI is not the same as the $PATH as set
via the shell profile, just using /usr/bin/git (which will normally
be found when JGit runs inside Eclipse) may give slightly surprising
results if the user has installed a non-Apple git and changed his
$PATH in the shell such that the non-Apple git is used in the shell.
(For instance by placing /usr/local/bin earlier on the path.) Eclipse
and the shell will use different git executables, and thus different
git system configs.

Therefore, try to find git via bash --login -c 'which git' not only
if we couldn't find it on $PATH but also if we found the default git
/usr/bin/git. If that finds some other git, use that. If the bash
approach also finds /usr/bin/git, double check via xcode-select -p
that an XCode git is present. If not, assume there is no git installed,
and work without any system config.

Bug: 564372
Change-Id: Ie9d010ebd9437a491ba5d92b4ffd1860c203f8ca
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2020-07-26 15:38:48 -04:00
Matthias Sohn 097f01bfb6 Use LinkedBlockingQueue for executor determining filesystem attributes
Using a fixed thread pool with unbounded LinkedBlockingQueue fixes the
RejectedExecutionException thrown if too many threads try to
concurrently determine filesystem attributes.

Comparing that to an alternative implementation using an unbounded
thread pool instead showed similar performance with the reproducer (in
range of 100-1000 threads in reproducer) on my mac:

threads   time

fixed threadpool up to 5 threads with LinkedBlockingQueue of unlimited
queue size
100       1103 ms
200       1602 ms
300       2369 ms
500       4002 ms
1000      11071 ms

unbounded cached threadpool
100       1108 ms
200       1591 ms
300       2299 ms
500       4577 ms
1000      11196 ms

Bug: 564202
Change-Id: I773da7414a1dca8e548349442dca9b56643be946
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2020-07-24 00:31:03 +02:00