Commit Graph

4916 Commits

Author SHA1 Message Date
Shawn Pearce 0bff481d45 Limit receive commands
Place a configurable upper bound on the amount of command data
received from clients during `git push`.  The limit is applied to the
encoded wire protocol format, not the JGit in-memory representation.
This allows clients to flexibly use the limit; shorter reference names
allow for more commands, longer reference names permit fewer commands
per batch.

Based on data gathered from many repositories at $DAY_JOB, the average
reference name is well under 200 bytes when encoded in UTF-8 (the wire
encoding).  The new 3 MiB default receive.maxCommandBytes allows about
11,155 references in a single `git push` invocation.  A Gerrit Code
Review system with six-digit change numbers could still encode 29,399
references in the 3 MiB maxCommandBytes limit.

Change-Id: I84317d396d25ab1b46820e43ae2b73943646032c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-11 00:20:36 +01:00
David Pursehouse 67da5635a4 Update minimum JDK version in README
Change-Id: I655d896b268e946e3492661b08add0ebac22c6f0
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 18:47:39 -04:00
David Pursehouse 32f3cbd44f Upgrade jacoco-maven-plugin to 0.7.9
Change-Id: Ifc0b0f2899f4094f0525021236c3e73658138f7e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 22:45:41 +01:00
David Pursehouse d12f1288b7 Upgrade maven-build-helper-plugin to 3.0.0
Change-Id: Ib354bf3a1c064f54255dc05de9e89e79dbcc9182
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 22:45:40 +01:00
David Pursehouse b2ec099aad Upgrade maven-shade-plugin to 3.0.0
Change-Id: I46bf48657eceefc65b710a054df14dea5b9f15f8
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 22:45:39 +01:00
David Pursehouse 3a74663934 Fix inconsistent versioning of findbugs-maven-plugin
In one place version 3.0.4 is used, and in another place 3.0.3 is
used.

Define the version (3.0.4) in a property and use that in both places,
so it doesn't get inconsistent again next time the version is bumped.

Change-Id: If3a2489cec78c0c9ef76aa6b941fda51b098e04b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 22:45:38 +01:00
David Pursehouse f4cb12206e Upgrade maven-compiler-plugin to 3.6.1
Change-Id: Ia1c21c17ed6cd17c7ee353aa6a0bf1f88de5c045
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 22:45:38 +01:00
David Pursehouse a9a3ce92aa LocalDiskRepositoryTestCase: Add clarifying comment in call to createRepository
Clarify that 'true' means 'auto close'. This makes it consistent with
other calls that have a boolean argument for 'bare'. It also makes it a
bit easier to see what's going on while stepping in the debugger, because
it's not necessary to scroll around to find the method declaration.

Change-Id: Idacd749407dcfd258af3efaaf44d129069925dd3
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 21:04:03 +09:00
David Pursehouse 158d3722d3 IndexDiffSubmoduleTest: Fix negative use count
submoduleStandalone is created by createWorkRepository() which adds
the created repository to the set of repositories to be closed in
the test teardown. It is therefore not necessary to explicitly close
it.

Change-Id: Ib6f525b644fdeaaf1934df39cc2d3583a0d883dc
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-10 19:48:52 +09:00
David Pursehouse 1834421a7f BlameGenerator: Annotate #getRenameDetector as Nullable
The renameDetector member returned by this method will be null when
following file renames has been disabled by previously calling:

  setFollowFileRenames(false).

Annotate it as @Nullable and update the Javadoc to explicitly
document the null return.

Change-Id: I9bdf443a64cf3c45352d3ab023051a2e11f7426d
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-09 22:40:56 +01:00
David Pursehouse 51239129b3 FetchCommandTest: Don't declare specific exceptions in test methods
Change-Id: Ie0f8a0f7a9c2c383be6ae8265353daac7f5a89fa
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-09 15:10:15 +09:00
David Pursehouse 16dc88fe10 PushCommandTest: Remove unused variables to prevent errors in Eclipse
Change-Id: Ie656b18fb151bf1e3c2dcc0438a77e32102991c2
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-09 15:10:15 +09:00
David Pursehouse d9d8c507a4 RefLeaseSpec: Fix Eclipse errors
- Remove unused import

- Remove unused private constructor

- Add Javadoc for public constructor

Change-Id: I1253e9fe863ca0f63182461ee87357fbf726ea2e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-09 15:10:15 +09:00
Shawn Pearce 8fce17a995 Merge "push: support per-ref force-with-lease" 2017-02-08 22:27:06 -05:00
David Turner 46d35a8502 push: support per-ref force-with-lease
When rebasing, force-pushing has a race condition: someone else might
have pushed a commit since the one you just rewrote. The force-with-lease
option prevents this by ensuring that the ref's old value is the one
that you expected.

Change-Id: I97ca9f8395396c76332bdd07c486e60549ca4401
Signed-off-by: David Turner <dturner@twosigma.com>
2017-02-08 19:42:33 -05:00
Shawn Pearce 6450d956bc Assume GC_REST and GC_TXN also attempted deltas during packing
In a DFS repository the DfsGarbageCollector will typically attempt
delta compression while creating the three main pack files: GC,
GC_REST and GC_TXN. Include all of these in the wasDeltaAttempted()
decision so that future packers can bypass delta compression of
non-delta objects.

Change-Id: Ic2330c69fab0c494b920b4df0a290f3c2e1a03d7
2017-02-08 15:34:00 -08:00
Shawn Pearce d67b183537 Prefer smaller GC files during DFS garbage collection
In 8ac65d33ed PackWriter changed its
behavior to always prefer the last object representation presented
to it by the ObjectReuseAsIs implementation. This was a fix to avoid
delta chain cycles.

Unfortunately it can lead to suboptimal compression when concurrent
GCs are run on the same repository. One case is automatic GC running
(with default settings) in parallel to a manual GC that has disabled
delta reuse in order to generate new smaller deltas for the entire
history of the repository.

Running GC with no-reuse generally requires more CPU time, which
also translates to a longer running time.  This can lead to a race
where the automatic GC completes before the no-reuse GC, leaving
the repository in a state such as:

  no-reuse GC:   size 1 GiB, mtime = 18:45
  auto GC:       size 8 GiB, mtime = 17:30

With the default sort ordering, the smaller no-reuse GC pack is
sorted earlier in the pack list, due to its more recent mtime.

During object reuse in a future GC, these smaller representations
are considered first by PackWriter, but are all discarded when the
auto GC file from 17:30 is examined second (due to its older mtime).

Work around this in two ways.

Well formed DFS repositories should have at most 1 GC pack. If
2 or more GC packs exist, break the sorting tie by selecting the
smaller file earlier in the pack list. This allows all normal read
code paths to favor the smaller file, which places less pressure
on the DfsBlockCache. If any GC race happens, readers serving clone
requests will prefer the file that is smaller.

During object reuse, flip this ordering so that the smaller file is
last. This allows PackWriter to see smaller deltas last, replacing
larger representations that were previously considered from other
pack files.

Change-Id: I0b7dc8bb9711c82abd6bd16643f518cfccc6d31a
2017-02-08 14:37:12 -08:00
Shawn Pearce 61d4922928 Fix missing deltas near type boundaries
Delta search was discarding discovered deltas if an object appeared
near a type boundary in the delta search window. This has caused JGit
to produce larger pack files than other implementations of the packing
algorithm.

Delta search works by pushing prior objects into a search window, an
ordered list of objects to attempt to delta compress the next object
against. (The window size is bounded, avoiding O(N^2) behavior.)

For implementation reasons multiple object types can appear in the
input list, and the window. PackWriter commonly passes both trees and
blobs in the input list handed to the DeltaWindow algorithm. The pack
file format requires an object to only delta compress against the same
type, so the DeltaWindow algorithm must stop doing comparisions if a
blob would be compared to a tree.

Because the input list is sorted by object type and the window is
recently considered prior objects, once a wrong type is discovered in
the window the search algorithm stops and uses the current result.

Unfortunately the termination condition was discarding any found
delta by setting deltaBase and deltaBuf to null when it was trying
to break the window search.

When this bug occurs, the state of the DeltaWindow looks like this:

                                 current
                                  |
                                 \ /
  input list:  tree0 tree1 blob1 blob2

  window:      blob1 tree1 tree0
                / \
                 |
              res.prev

As the loop iterates to the right across the window, it first finds
that blob1 is a suitable delta base for blob2, and temporarily holds
this in the bestDelta/deltaBuf fields. It then considers tree1, but
tree1 has the wrong type (blob != tree), so the window loop must give
up and fall through the remaining code.

Moving the condition up and discarding the window contents allows
the bestDelta/deltaBuf to be kept, letting the final file delta
compress blob1 against blob0.

The impact of this bug (and its fix) on real world repositories is
likely minimal. The boundary from blob to tree happens approximately
once in the search, as the input list is sorted by type. Only the
first window size worth of blobs (e.g. 10 or 250) were failing to
produce a delta in the final file.

This bug fix does produce significantly different results for small
test repositories created in the unit test suite, such as when a pack
may contains 6 objects (2 commits, 2 trees, 2 blobs).  Packing test
cases can now better sample different output pack file sizes depending
on delta compression and object reuse flags in PackConfig.

Change-Id: Ibec09398d0305d4dbc0c66fce1daaf38eb71148f
2017-02-08 14:36:24 -08:00
Shawn Pearce 12c8462602 Merge "Reintroduce garbage pack coalescing when ttl > 0." 2017-02-08 00:23:40 -05:00
Thirumala Reddy Mutchukota 006f4d4d29 Reintroduce garbage pack coalescing when ttl > 0.
Disabling the garbage pack coalescing when garbageTtl > 0 can result in
lot of garbage packs if they are created within the garbageTtl time.

To avoid a large number of garbage packs, re-introducing garbage pack
coalescing for the packs that are created within a single calendar day
when the garbageTtl is more than one day or one third of the garbageTtl.

Change-Id: If969716aeb55fb4fd0ff71d75f41a07638cd5a69
Signed-off-by: Thirumala Reddy Mutchukota <thirumala@google.com>
2017-02-07 20:34:31 -08:00
David Pursehouse 5336a07386 Merge "Branch normalizer should not normalize already valid branch names" 2017-02-07 07:31:06 -05:00
Matthias Sohn 08480c948c [infer] Fix ObjectWalk leak in PackWriter.preparePack()
Change-Id: I5d2455404e507faa717e9d916e9b6cd80aa91473
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-07 00:50:09 +01:00
Matthias Sohn f8d232213c Branch normalizer should not normalize already valid branch names
Change-Id: Ib746655e32a37c4ad323f1d12ac0817de8fa56cf
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-07 00:24:39 +01:00
Bo Zhang d4bd09b78d Follow redirects in transport
Bug: 465167
Change-Id: I6da19c8106201c2a1ac69002bd633b7387f25d96
Signed-off-by: Bo Zhang <zhangbodut@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-02 21:20:23 -04:00
Matthias Sohn 566794d001 Merge branch 'stable-4.6'
* stable-4.6:
  GC: delete empty directories after purging loose objects
  GC.prune(Set<ObjectId>): return early if objects directory is empty

Change-Id: I3d6cacf80d3b4c69ba108e970855963bd9f6ee78
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-02 23:36:28 +01:00
Matthias Sohn 18cda3888c GC: delete empty directories after purging loose objects
In order to limit the number of directories we check for emptiness only
consider fanout directories which contained unreferenced loose objects
we deleted in the same gc run.

Change-Id: Idf8d512867ee1c8ed40bd55752122ce83a98ffa2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-01 23:44:07 +01:00
David Pursehouse b20f7d610e Organize imports
Change-Id: I97044f69d220fc2d3f9fe890fdfec542454f02d2
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-01 14:31:44 +09:00
Hongkai Liu a33663fd4e Detect stale-file-handle error in causal chain
Cover the case where the exception is wrapped up as a
cause, e.g., PackIndex#open(File).

Change-Id: I0df5b1e9c2ff886bdd84dee3658b6a50866699d1
Signed-off-by: Hongkai Liu <hongkai.liu@ericsson.com>
2017-01-30 22:36:59 -04:00
David Pursehouse 62411453f1 Merge branch 'stable-4.6'
* stable-4.6:
  Clean up orphan files in GC

Change-Id: I4fb6b4cd03d032535a9c04ede784bea880b4536b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-31 09:31:10 +09:00
David Pursehouse 25ab5b4d9b Merge "Don't rely on default locale when using toUpperCase() and toLowerCase()" 2017-01-30 07:32:32 -05:00
David Pursehouse c64412e1f9 Merge FileTreeIteratorJava7Test into FileTreeIteratorTest
JGit now requires Java 8, so it is no longer necessary to have a
separate class for Java 7 specific tests. Remove it and merge its
tests into the existing FileTreeIteratorTest.

FileTreeIteratorTest has an @Before annotated method that sets up
some files in the git, which breaks the tests which have assumptions
on the file names. Add adjustments.

Change-Id: I14f88d8e079e1677c8dfbc1fcbf4444ea8265365
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-30 13:23:52 +01:00
Hongkai Liu b198f77c1e Rename FileUtilTest to FileUtilsTest and merge in FileUtils7Test
Rename the test class to match the name of the class under test.

JGit now requires Java 8 so it is no longer necessary to have a
separate class (FileUtils7Test) for Java 7 tests. Merge those into
FileUtilsTest.

Change-Id: I39dd7e76a2e4ce97319c7d52261b0a1546879788
Signed-off-by: Hongkai Liu <hongkai.liu@ericsson.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-30 02:42:56 -04:00
Hector Caballero 27b710c394 Make GC cancellable when called programmatically
Sometimes, it is necessary to cancel a garbage collection operation.
When GC is called using the standalone executable, i.e., from a command
line, Control-Cing the process does the trick. When calling GC
programmatically, though, there is no mechanism to do it.

Add checks in the GC process so that a custom cancellable progress
monitor could be passed in order to cancel the operation at specific
points. In this case, the calling process set the cancel flag in the
progress monitor and the GC process will throw an exception that can
be caught and handled by the caller accordingly.

Change-Id: Ieaecf3dbdf244539ec734939c065735f6785aacf
Signed-off-by: Hector Caballero <hector.caballero@ericsson.com>
2017-01-29 20:14:37 -04:00
Matthias Sohn a11bb03127 GC.prune(Set<ObjectId>): return early if objects directory is empty
Change-Id: Id56b102604c4e0437230e3e7c59c0a3a1b676256
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-30 00:55:38 +01:00
Hongkai Liu 8fd500e20c Clean up orphan files in GC
An orphan file is either a bitmap or an idx file in pack folder,
and its corresponding pack file is missing.

Change-Id: I3c4cb1f7aa99dd7b398bdb8d513f528d7761edff
Signed-off-by: Hongkai Liu <hongkai.liu@ericsson.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-30 00:55:36 +01:00
Matthias Sohn 53ad437382 RepositoryCacheTest: avoid to close already closed repository
The tearDown() of the superclass closed the repository once more which
led to a negative use count warning logged by Repository.close().

Change-Id: I331f85a540c68264a53456276c32f72b79113d61
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-28 21:19:55 +01:00
David Pursehouse ac6353e9e5 RefUpdateTest: Don't call createBareRepository in try-with-resource
createBareRepository adds the created repo to the list of repos to be
closed in the superclass's teardown. Wrapping it in try-with-resource
causes it to be closed too many times, resulting in a corrupt use
count.

Change-Id: I4c70630bf6008544324dda453deb141f4f89472c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 19:16:48 +01:00
David Pursehouse 25b14084c9 ReceivePackAdvertiseRefsHookTest: Don't close repositories in teardown
The repositories get added to the "toClose" set by createBareRepository,
and are then closed in the superclass's tearDown method.

Explicitly closing them in this test class's teardown causes the use
count to go negative when subsequently closed again by the superclass.

Change-Id: Idcbb16b4cf4bf0640d7e4ac15d1926d8a27c1251
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 17:46:28 +01:00
David Pursehouse acc94c475a RepoCommand#readFile: Don't call Git#getRepository() in try-with-resource
Using try-with-resource means that close() will automatically be
called on the Repository object. However, according to the javadoc
of Git#close():

  If the repository was opened by a static factory method in this class,
  then this method calls Repository#close() on the underlying repository
  instance.

This means that Repository#close() is called twice, by Git.close()
and in the outer try-with-resource, leading to a corrupt use count.

Change-Id: I37ba517eb2cc67d1cd36813598772c70208d0bc9
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 17:46:28 +01:00
David Pursehouse 883856110c RepoCommandTest: Don't wrap create{Bare,Work}Directory in t-w-r
These methods add the created Repository into "toClose", and they are
then closed by LocalDiskRepositoryTestCase's tearDown method.

Calling them in try-with-resource causes them to first be closed in
the test method, and then again in tearDown, which results in the use
count going negative and a log message on the console.

While this is not a serious problem, having so many false positives
in the logs will potentially drown out real cases of Repository being
closed too many times.

Change-Id: Ib374445e101dc11cb840957b8b19ee1caf777392
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 17:46:27 +01:00
David Pursehouse 5d0ca36d39 LocalDiskRepositoryTestCase: Only add to toClose through access method
Only using the access method means we only have one place where the
toClose set is modified, making it easier to debug either by adding
log statements or by setting a breakpoint.

Change-Id: I4f9f1774d5f2e10bcab381edfd84bb6ee0499a11
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 17:46:27 +01:00
David Pursehouse a9a3af4af7 GitConstructionTest: Remove unnecessary calls to Repository.close()
The repositories are already closed in the superclass teardown due
to being added to the "toClose" set.

Change-Id: I768ba8a02fc585907687caf37e2e283434016c04
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 17:46:27 +01:00
Matthias Sohn 7dca5dc9f3 Merge "LocalDiskRepositoryTestCase: Prevent duplicates in list of repos to close" 2017-01-28 11:45:21 -05:00
Andrey Loskutov 2845016777 Merge "Remove unused org.apache.http.impl.client.cache requirement" 2017-01-28 10:04:30 -05:00
Matthias Sohn a4feeb0194 Don't rely on default locale when using toUpperCase() and toLowerCase()
Otherwise these methods may produce unexpected results if used for
strings that are intended to be interpreted locale independently.
Examples are programming language identifiers, protocol keys, and HTML
tags. For instance, "TITLE".toLowerCase() in a Turkish locale returns
"t\u0131tle", where '\u0131' is the LATIN SMALL LETTER DOTLESS I
character.

See
https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#toLowerCase--
http://blog.thetaphi.de/2012/07/default-locales-default-charsets-and.html

Bug: 511238
Change-Id: Id8d8f37d84d62239c918b81f8d883ed798d87656
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-28 15:06:15 +01:00
David Pursehouse d69a829bdb LocalDiskRepositoryTestCase: Prevent duplicates in list of repos to close
Change the "toClose" list to a set, which will not allow duplicate
entries. This reduces the number of false positive logs about corrupt
use count due to the same repository being closed more than once during
teardown.

Change-Id: I5ab0ff8b56e7f2b2c7aab5274d957708d26f42c5
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-27 16:00:06 +09:00
David Pursehouse 2eb1bebd60 Repository: Include repository name when logging corrupt use count
Logging the repository name makes it easier to track down what is
incorrectly closing a repository.

Change-Id: I42a8bdf766c0e67f100adbf76d9616584e367ac2
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-27 15:59:09 +09:00
Thirumala Reddy Mutchukota c9f55032a2 Record the estimated size of the pack files.
The Compacter and Garbage Collector will record the estimated size of
the newly going to be created compact, gc or garbage packs. This
information can be used by the clients to better make a call on how to
actually store the pack based on the approximated expected size.

Added a new protected method DfsObjDatabase.newPack(PackSource
packSource, long estimatedPackSize), so that the clients can override
this method to make use of the estimatedPackSize while creating a new
PackDescription object. The default implementation of this method is
equivalent to
newPack(packSource).setEstimatedPackSize(estimatedPackSize). I didn't
make it abstract because that would force all the existing sub classes
of DfsObjDatabase to implement this method. Due to this default
implementation, the estimatedPackSize is added to DfsPackDescription
using a setter instead of a constructor parameter (even though
constructor parameter would be a better choice as this value is set only
during the object creation).

Change-Id: Iade1122633ea774c2e842178a6a6cbb4a57b598b
Signed-off-by: Thirumala Reddy Mutchukota <thirumala@google.com>
2017-01-26 12:01:59 -08:00
Andrey Loskutov 7856867664 Remove unused org.apache.http.impl.client.cache requirement
The package is not used by the plugin and seems to be missing in the
platform anyway under some conditions, see bug 508321 (newer
org.apache.httpcomponents.httpclient_4.5.2 does NOT include the package,
org.apache.httpcomponents.httpclient_4.3.6 does).

Change-Id: Ida5d926a611812b5177af651b3cf22f1b2519e02
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
2017-01-26 15:30:36 -04:00
David Pursehouse 635bff2d50 Merge branch 'stable-4.6'
* stable-4.6:
  Update Orbit to S20170120205402 and com.jcraft.jsch to 0.1.54
  Fix preparation of 4.6.1-SNAPSHOT builds

Change-Id: Ifd1f81cb199a0b5bd35e8652cac116e377136b2d
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-26 11:01:32 +09:00