Commit Graph

3630 Commits

Author SHA1 Message Date
Zhen Chen ff852dad51 Skip first pack if avoid garbage is set and it is a garbage pack
At the beginning of the OBJECT_SCAN loop, the reader first checks
whether the object exists in the last pack. However, it forgot to skip
a garbage pack on that first iteration.

Change-Id: I8a99c0f439218d19c49cd4dae891b8cc4a57099d
Signed-off-by: Zhen Chen <czhen@google.com>
2017-02-13 20:54:35 -04:00
Zhen Chen 8dd5b644dc Refactor skip garbage pack logic into a method
There are multiple places in DfsReader that skip a garbage pack if both
of the following conditions are satisfied:

* AvoidUnreachable flag is set
* The pack is a garbage pack

Refactor them into a shared private method.

Change-Id: I67d6bb601db55f904437c807c6a3c36f0a723265
Signed-off-by: Zhen Chen <czhen@google.com>
2017-02-13 15:33:23 -08:00
Shawn Pearce 0bff481d45 Limit receive commands
Place a configurable upper bound on the amount of command data
received from clients during `git push`.  The limit is applied to the
encoded wire protocol format, not the JGit in-memory representation.
This allows clients to flexibly use the limit; shorter reference names
allow for more commands, longer reference names permit fewer commands
per batch.

Based on data gathered from many repositories at $DAY_JOB, the average
reference name is well under 200 bytes when encoded in UTF-8 (the wire
encoding).  The new 3 MiB default receive.maxCommandBytes allows about
11,155 references in a single `git push` invocation.  A Gerrit Code
Review system with six-digit change numbers could still encode 29,399
references in the 3 MiB maxCommandBytes limit.
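
The arithmetic behind these counts can be sketched as a simple division
of the byte limit by an encoded command size. The per-command sizes used
below (282 and 107 bytes) are not stated in this message; they are
values inferred because they reproduce the quoted counts. The class and
method names are hypothetical.

```java
final class ReceiveLimitEstimate {
    /** 3 MiB, the default receive.maxCommandBytes quoted above. */
    static final long MAX_COMMAND_BYTES = 3L * 1024 * 1024;

    /** Number of whole commands of the given encoded size that fit under the limit. */
    static long commandsThatFit(long limitBytes, long bytesPerCommand) {
        if (bytesPerCommand <= 0) {
            throw new IllegalArgumentException("bytesPerCommand must be positive");
        }
        // Integer division rounds down: a partial command does not fit.
        return limitBytes / bytesPerCommand;
    }
}
```

With these assumed sizes, commandsThatFit(MAX_COMMAND_BYTES, 282) gives
11155 and commandsThatFit(MAX_COMMAND_BYTES, 107) gives 29399.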

Change-Id: I84317d396d25ab1b46820e43ae2b73943646032c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-11 00:20:36 +01:00
David Pursehouse 1834421a7f BlameGenerator: Annotate #getRenameDetector as Nullable
The renameDetector member returned by this method will be null when
following file renames has been disabled by previously calling:

  setFollowFileRenames(false).

Annotate it as @Nullable and update the Javadoc to explicitly
document the null return.

Change-Id: I9bdf443a64cf3c45352d3ab023051a2e11f7426d
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-09 22:40:56 +01:00
David Pursehouse d9d8c507a4 RefLeaseSpec: Fix Eclipse errors
- Remove unused import

- Remove unused private constructor

- Add Javadoc for public constructor

Change-Id: I1253e9fe863ca0f63182461ee87357fbf726ea2e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-09 15:10:15 +09:00
Shawn Pearce 8fce17a995 Merge "push: support per-ref force-with-lease" 2017-02-08 22:27:06 -05:00
David Turner 46d35a8502 push: support per-ref force-with-lease
When rebasing, force-pushing has a race condition: someone else might
have pushed a commit since the one you just rewrote. The force-with-lease
option prevents this by ensuring that the ref's old value is the one
that you expected.
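
The check behaves like a compare-and-swap on the ref. A minimal
in-memory sketch, assuming refs are modelled as a map from ref name to
object id (the class and method names are hypothetical and this is not
the JGit API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LeaseSketch {
    private final Map<String, String> refs = new ConcurrentHashMap<>();

    /** Unconditional update, standing in for somebody else's push. */
    void set(String ref, String id) {
        refs.put(ref, id);
    }

    String get(String ref) {
        return refs.get(ref);
    }

    /** Updates the ref only if it still points at the value the caller expected. */
    boolean forceWithLease(String ref, String expectedOld, String newId) {
        // Atomic: replaces only when the current value equals expectedOld.
        return refs.replace(ref, expectedOld, newId);
    }
}
```

If another push moved the ref after the caller last fetched it,
forceWithLease returns false and the ref is left untouched.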

Change-Id: I97ca9f8395396c76332bdd07c486e60549ca4401
Signed-off-by: David Turner <dturner@twosigma.com>
2017-02-08 19:42:33 -05:00
Shawn Pearce 6450d956bc Assume GC_REST and GC_TXN also attempted deltas during packing
In a DFS repository the DfsGarbageCollector will typically attempt
delta compression while creating the three main pack files: GC,
GC_REST and GC_TXN. Include all of these in the wasDeltaAttempted()
decision so that future packers can bypass delta compression of
non-delta objects.

Change-Id: Ic2330c69fab0c494b920b4df0a290f3c2e1a03d7
2017-02-08 15:34:00 -08:00
Shawn Pearce d67b183537 Prefer smaller GC files during DFS garbage collection
In 8ac65d33ed PackWriter changed its
behavior to always prefer the last object representation presented
to it by the ObjectReuseAsIs implementation. This was a fix to avoid
delta chain cycles.

Unfortunately it can lead to suboptimal compression when concurrent
GCs are run on the same repository. One case is automatic GC running
(with default settings) in parallel to a manual GC that has disabled
delta reuse in order to generate new smaller deltas for the entire
history of the repository.

Running GC with no-reuse generally requires more CPU time, which
also translates to a longer running time.  This can lead to a race
where the automatic GC completes before the no-reuse GC, leaving
the repository in a state such as:

  no-reuse GC:   size 1 GiB, mtime = 18:45
  auto GC:       size 8 GiB, mtime = 17:30

With the default sort ordering, the smaller no-reuse GC pack is
sorted earlier in the pack list, due to its more recent mtime.

During object reuse in a future GC, these smaller representations
are considered first by PackWriter, but are all discarded when the
auto GC file from 17:30 is examined second (due to its older mtime).

Work around this in two ways.

Well formed DFS repositories should have at most 1 GC pack. If
2 or more GC packs exist, break the sorting tie by selecting the
smaller file earlier in the pack list. This allows all normal read
code paths to favor the smaller file, which places less pressure
on the DfsBlockCache. If any GC race happens, readers serving clone
requests will prefer the file that is smaller.

During object reuse, flip this ordering so that the smaller file is
last. This allows PackWriter to see smaller deltas last, replacing
larger representations that were previously considered from other
pack files.
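
The two orderings can be sketched with a pair of comparators. This toy
model only covers the tie-break between GC packs by size; the Pack class
and all names are hypothetical, not the DFS implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GcPackOrder {
    static final class Pack {
        final String name;
        final long sizeBytes;

        Pack(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Normal read paths: the smaller GC pack comes first. */
    static final Comparator<Pack> READ_ORDER =
            Comparator.comparingLong((Pack p) -> p.sizeBytes);

    /** Object reuse: flipped, so the smaller GC pack is seen last. */
    static final Comparator<Pack> REUSE_ORDER = READ_ORDER.reversed();

    /** Returns pack names sorted by the given order, leaving the input untouched. */
    static List<String> names(List<Pack> packs, Comparator<Pack> order) {
        List<Pack> sorted = new ArrayList<>(packs);
        sorted.sort(order);
        List<String> result = new ArrayList<>();
        for (Pack p : sorted) {
            result.add(p.name);
        }
        return result;
    }
}
```

For the 1 GiB no-reuse pack and the 8 GiB auto pack from the example
above, READ_ORDER lists the no-reuse pack first and REUSE_ORDER lists
it last.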

Change-Id: I0b7dc8bb9711c82abd6bd16643f518cfccc6d31a
2017-02-08 14:37:12 -08:00
Shawn Pearce 61d4922928 Fix missing deltas near type boundaries
Delta search was discarding discovered deltas if an object appeared
near a type boundary in the delta search window. This has caused JGit
to produce larger pack files than other implementations of the packing
algorithm.

Delta search works by pushing prior objects into a search window, an
ordered list of objects to attempt to delta compress the next object
against. (The window size is bounded, avoiding O(N^2) behavior.)

For implementation reasons multiple object types can appear in the
input list, and the window. PackWriter commonly passes both trees and
blobs in the input list handed to the DeltaWindow algorithm. The pack
file format requires an object to only delta compress against the same
type, so the DeltaWindow algorithm must stop doing comparisons if a
blob would be compared to a tree.

Because the input list is sorted by object type and the window is
recently considered prior objects, once a wrong type is discovered in
the window the search algorithm stops and uses the current result.

Unfortunately the termination condition was discarding any found
delta by setting deltaBase and deltaBuf to null when it was trying
to break the window search.

When this bug occurs, the state of the DeltaWindow looks like this:

                                 current
                                  |
                                 \ /
  input list:  tree0 tree1 blob1 blob2

  window:      blob1 tree1 tree0
                / \
                 |
              res.prev

As the loop iterates to the right across the window, it first finds
that blob1 is a suitable delta base for blob2, and temporarily holds
this in the bestDelta/deltaBuf fields. It then considers tree1, but
tree1 has the wrong type (blob != tree), so the window loop must give
up and fall through the remaining code.

Moving the condition up and discarding the window contents allows
the bestDelta/deltaBuf to be kept, letting the final file delta
compress blob2 against blob1.
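
The difference between the two termination conditions can be sketched
with a toy window search. This is a simplified model, not the
DeltaWindow code: it assumes every same-type candidate is a suitable
base and keeps the first one found. All names are hypothetical.

```java
import java.util.List;

final class DeltaWindowSketch {
    enum Type { TREE, BLOB }

    static final class Obj {
        final String name;
        final Type type;

        Obj(String name, Type type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Old behavior: a type mismatch clears the base that was already found. */
    static String baseBeforeFix(Obj current, List<Obj> window) {
        String best = null;
        for (Obj candidate : window) {
            if (candidate.type != current.type) {
                best = null; // the bug: discards the delta found so far
                break;
            }
            if (best == null) {
                best = candidate.name; // toy rule: first same-type object wins
            }
        }
        return best;
    }

    /** Fixed behavior: a type mismatch only ends the search. */
    static String baseAfterFix(Obj current, List<Obj> window) {
        String best = null;
        for (Obj candidate : window) {
            if (candidate.type != current.type) {
                break; // stop searching, keep what was found
            }
            if (best == null) {
                best = candidate.name;
            }
        }
        return best;
    }
}
```

With current = blob2 and window = [blob1, tree1, tree0] as in the
diagram, baseBeforeFix returns null while baseAfterFix returns blob1.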

The impact of this bug (and its fix) on real world repositories is
likely minimal. The boundary from blob to tree happens approximately
once in the search, as the input list is sorted by type. Only the
first window size worth of blobs (e.g. 10 or 250) were failing to
produce a delta in the final file.

This bug fix does produce significantly different results for small
test repositories created in the unit test suite, such as when a pack
may contain 6 objects (2 commits, 2 trees, 2 blobs).  Packing test
cases can now better sample different output pack file sizes depending
on delta compression and object reuse flags in PackConfig.

Change-Id: Ibec09398d0305d4dbc0c66fce1daaf38eb71148f
2017-02-08 14:36:24 -08:00
Shawn Pearce 12c8462602 Merge "Reintroduce garbage pack coalescing when ttl > 0." 2017-02-08 00:23:40 -05:00
Thirumala Reddy Mutchukota 006f4d4d29 Reintroduce garbage pack coalescing when ttl > 0.
Disabling garbage pack coalescing when garbageTtl > 0 can result in
a lot of garbage packs if they are created within the garbageTtl time.

To avoid a large number of garbage packs, reintroduce garbage pack
coalescing for packs that are created within a single calendar day
when the garbageTtl is more than one day, or within one third of the
garbageTtl otherwise.

Change-Id: If969716aeb55fb4fd0ff71d75f41a07638cd5a69
Signed-off-by: Thirumala Reddy Mutchukota <thirumala@google.com>
2017-02-07 20:34:31 -08:00
David Pursehouse 5336a07386 Merge "Branch normalizer should not normalize already valid branch names" 2017-02-07 07:31:06 -05:00
Matthias Sohn 08480c948c [infer] Fix ObjectWalk leak in PackWriter.preparePack()
Change-Id: I5d2455404e507faa717e9d916e9b6cd80aa91473
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-07 00:50:09 +01:00
Matthias Sohn f8d232213c Branch normalizer should not normalize already valid branch names
Change-Id: Ib746655e32a37c4ad323f1d12ac0817de8fa56cf
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-07 00:24:39 +01:00
Bo Zhang d4bd09b78d Follow redirects in transport
Bug: 465167
Change-Id: I6da19c8106201c2a1ac69002bd633b7387f25d96
Signed-off-by: Bo Zhang <zhangbodut@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-02 21:20:23 -04:00
Matthias Sohn 566794d001 Merge branch 'stable-4.6'
* stable-4.6:
  GC: delete empty directories after purging loose objects
  GC.prune(Set<ObjectId>): return early if objects directory is empty

Change-Id: I3d6cacf80d3b4c69ba108e970855963bd9f6ee78
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-02 23:36:28 +01:00
Matthias Sohn 18cda3888c GC: delete empty directories after purging loose objects
To limit the number of directories checked for emptiness, only
consider fanout directories that contained unreferenced loose objects
deleted in the same gc run.

Change-Id: Idf8d512867ee1c8ed40bd55752122ce83a98ffa2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-02-01 23:44:07 +01:00
David Pursehouse b20f7d610e Organize imports
Change-Id: I97044f69d220fc2d3f9fe890fdfec542454f02d2
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-02-01 14:31:44 +09:00
Hongkai Liu a33663fd4e Detect stale-file-handle error in causal chain
Cover the case where the exception is wrapped up as a
cause, e.g., PackIndex#open(File).
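
Walking the causal chain can be sketched as follows. The message test
("stale file handle", case-insensitive) is an assumption made for this
illustration, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.Locale;

final class StaleHandleCheck {
    /** Assumed detection rule: the IOException message mentions a stale file handle. */
    static boolean isStaleFileHandle(IOException e) {
        String msg = e.getMessage();
        return msg != null
                && msg.toLowerCase(Locale.ROOT).contains("stale file handle");
    }

    /** Checks the throwable itself and every cause it wraps. */
    static boolean isStaleFileHandleInCausalChain(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof IOException && isStaleFileHandle((IOException) c)) {
                return true;
            }
        }
        return false;
    }
}
```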

Change-Id: I0df5b1e9c2ff886bdd84dee3658b6a50866699d1
Signed-off-by: Hongkai Liu <hongkai.liu@ericsson.com>
2017-01-30 22:36:59 -04:00
David Pursehouse 62411453f1 Merge branch 'stable-4.6'
* stable-4.6:
  Clean up orphan files in GC

Change-Id: I4fb6b4cd03d032535a9c04ede784bea880b4536b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-31 09:31:10 +09:00
David Pursehouse 25ab5b4d9b Merge "Don't rely on default locale when using toUpperCase() and toLowerCase()" 2017-01-30 07:32:32 -05:00
Hector Caballero 27b710c394 Make GC cancellable when called programmatically
Sometimes, it is necessary to cancel a garbage collection operation.
When GC is called using the standalone executable, i.e., from a command
line, pressing Control-C to interrupt the process does the trick. When
calling GC programmatically, though, there is no mechanism to do it.

Add checks in the GC process so that a custom cancellable progress
monitor can be passed in order to cancel the operation at specific
points. In this case, the calling process sets the cancel flag in the
progress monitor and the GC process throws an exception that can
be caught and handled by the caller accordingly.
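
The pattern can be sketched as a loop that polls a cancel flag between
steps. The Monitor class is a hypothetical stand-in for a cancellable
progress monitor, and java.util.concurrent.CancellationException is used
here only for illustration; it is not necessarily the exception the GC
process throws.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancellableWork {
    /** Hypothetical monitor: the caller sets the flag, the worker polls it. */
    static final class Monitor {
        private final AtomicBoolean cancelled = new AtomicBoolean();

        void cancel() {
            cancelled.set(true);
        }

        boolean isCancelled() {
            return cancelled.get();
        }
    }

    /** Runs the steps in order, checking for cancellation before each one. */
    static int runSteps(Monitor monitor, List<Runnable> steps) {
        int done = 0;
        for (Runnable step : steps) {
            if (monitor.isCancelled()) {
                throw new CancellationException("cancelled after " + done + " steps");
            }
            step.run();
            done++;
        }
        return done;
    }
}
```

Cancellation only takes effect at the check points, so a step that is
already running completes before the exception is thrown.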

Change-Id: Ieaecf3dbdf244539ec734939c065735f6785aacf
Signed-off-by: Hector Caballero <hector.caballero@ericsson.com>
2017-01-29 20:14:37 -04:00
Matthias Sohn a11bb03127 GC.prune(Set<ObjectId>): return early if objects directory is empty
Change-Id: Id56b102604c4e0437230e3e7c59c0a3a1b676256
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-30 00:55:38 +01:00
Hongkai Liu 8fd500e20c Clean up orphan files in GC
An orphan file is a bitmap or idx file in the pack folder whose
corresponding pack file is missing.

Change-Id: I3c4cb1f7aa99dd7b398bdb8d513f528d7761edff
Signed-off-by: Hongkai Liu <hongkai.liu@ericsson.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-30 00:55:36 +01:00
David Pursehouse acc94c475a RepoCommand#readFile: Don't call Git#getRepository() in try-with-resource
Using try-with-resource means that close() will automatically be
called on the Repository object. However, according to the javadoc
of Git#close():

  If the repository was opened by a static factory method in this class,
  then this method calls Repository#close() on the underlying repository
  instance.

This means that Repository#close() is called twice, by Git.close()
and in the outer try-with-resource, leading to a corrupt use count.
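
The double close can be reproduced with two small stand-in classes.
Repo and Wrapper below are hypothetical models of the behavior described
above, not the JGit classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class UseCountSketch {
    /** Stand-in for a repository whose close() decrements a use count. */
    static final class Repo implements AutoCloseable {
        final AtomicInteger useCount = new AtomicInteger(1);

        @Override
        public void close() {
            useCount.decrementAndGet();
        }
    }

    /** Stand-in for a wrapper whose close() also closes the repository it opened. */
    static final class Wrapper implements AutoCloseable {
        private final Repo repo = new Repo();

        Repo getRepository() {
            return repo;
        }

        @Override
        public void close() {
            repo.close();
        }
    }

    /** Problematic form: the repository is declared as a resource as well. */
    static int useCountAfterDoubleClose() {
        Wrapper w = new Wrapper();
        try (Wrapper git = w; Repo repo = git.getRepository()) {
            // work with repo
        }
        return w.getRepository().useCount.get();
    }

    /** Corrected form: only the wrapper is a resource. */
    static int useCountAfterSingleClose() {
        Wrapper w = new Wrapper();
        try (Wrapper git = w) {
            Repo repo = git.getRepository(); // plain local, not closed separately
        }
        return w.getRepository().useCount.get();
    }
}
```

In this model the first form leaves the count at -1 and the second
leaves it at 0.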

Change-Id: I37ba517eb2cc67d1cd36813598772c70208d0bc9
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-28 17:46:28 +01:00
Matthias Sohn a4feeb0194 Don't rely on default locale when using toUpperCase() and toLowerCase()
Otherwise these methods may produce unexpected results if used for
strings that are intended to be interpreted locale independently.
Examples are programming language identifiers, protocol keys, and HTML
tags. For instance, "TITLE".toLowerCase() in a Turkish locale returns
"t\u0131tle", where '\u0131' is the LATIN SMALL LETTER DOTLESS I
character.
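
A minimal sketch of the difference, using Locale.ROOT for the
locale-independent conversion (the helper name is hypothetical):

```java
import java.util.Locale;

final class LocaleSafeCase {
    /** Lower-cases identifiers, protocol keys and similar strings independent of locale. */
    static String lowerForProtocol(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    /** Shows what a Turkish locale does to the same input. */
    static String lowerInTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }
}
```

lowerForProtocol("TITLE") yields "title", while lowerInTurkish("TITLE")
yields "t\u0131tle".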

See
https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#toLowerCase--
http://blog.thetaphi.de/2012/07/default-locales-default-charsets-and.html

Bug: 511238
Change-Id: Id8d8f37d84d62239c918b81f8d883ed798d87656
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-28 15:06:15 +01:00
David Pursehouse 2eb1bebd60 Repository: Include repository name when logging corrupt use count
Logging the repository name makes it easier to track down what is
incorrectly closing a repository.

Change-Id: I42a8bdf766c0e67f100adbf76d9616584e367ac2
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-27 15:59:09 +09:00
Thirumala Reddy Mutchukota c9f55032a2 Record the estimated size of the pack files.
The Compacter and Garbage Collector will record the estimated size of
the compact, gc or garbage packs they are about to create. Clients can
use this information to decide how to actually store the pack, based
on its approximate expected size.

Added a new protected method DfsObjDatabase.newPack(PackSource
packSource, long estimatedPackSize), so that the clients can override
this method to make use of the estimatedPackSize while creating a new
PackDescription object. The default implementation of this method is
equivalent to
newPack(packSource).setEstimatedPackSize(estimatedPackSize). I didn't
make it abstract because that would force all the existing sub classes
of DfsObjDatabase to implement this method. Due to this default
implementation, the estimatedPackSize is added to DfsPackDescription
using a setter instead of a constructor parameter (even though
constructor parameter would be a better choice as this value is set only
during the object creation).

Change-Id: Iade1122633ea774c2e842178a6a6cbb4a57b598b
Signed-off-by: Thirumala Reddy Mutchukota <thirumala@google.com>
2017-01-26 12:01:59 -08:00
Lars Vogel 71edc8bd6f Fixes Javadoc error in org.eclipse.jgit created with I59539ac
Adds the param information to the private method. These are generated
via tooltip to resolve the compile errors.

Bug: 511043
Change-Id: I9ba551978eab750326d1a067b296e3ae93925871
Signed-off-by: Lars Vogel <Lars.Vogel@vogella.com>
2017-01-25 12:40:59 -04:00
Jonathan Nieder 061d24f6d5 Remove @since tags from internal packages
These packages don't use @since tags because they are not part of the
stable public API.  Some @since tags snuck in, though.  Remove them to
make the convention easier to find for new contributors and the
expectations clearer for users.

Change-Id: I6c17d3cfc93657f1b33cf5c5708f2b1c712b0d31
2017-01-24 14:41:24 -08:00
David Turner 8bec98cec0 gc: loosen unreferenced objects
An unreferenced object might appear in a pack.  This could only happen
because it was previously referenced, and then later that reference
was removed.  When we gc, we copy the referenced objects into a new
pack, and delete the old pack.  This would remove the unreferenced
object.  Now we first create a loose object from any unreferenced
object in the doomed pack.  This kicks off the two-week grace period
for that object, after which it will be collected if it's not
referenced.

This matches the behavior of regular git.

Change-Id: I59539aca1d0d83622c41aa9bfbdd72fa868ee9fb
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
2017-01-24 14:22:45 -08:00
Matthias Sohn d3c4c0622f [infer] Mark ManifestParse.getFilteredProjects non-null
Change-Id: I05653df7a0337443d2c8e53f47f4e95ec9ca1a9c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-23 19:55:20 +01:00
Matthias Sohn b686c8468c [infer] Fix potential NPE in DiffFormatter
Change-Id: Ia33e2af9ce3393d9173ca0dc7efefd86c965d8c8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-23 19:55:18 +01:00
Matthias Sohn 423a583fcc [infer] Fix potential NPE in CloneCommand
Change-Id: Ie7eeba3ae719ff207c7535d535a9e0bd6c9e99e6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-23 19:55:12 +01:00
David Pursehouse dd5e500a57 Format Bazel files with buildifier
Change-Id: I934114315d2c7cab917f1011b8e55c52367d429f
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-22 22:34:11 +01:00
Shawn Pearce 131b09106f Change StreamGobbler to Runnable to avoid unused Future
It can be considered a programming error to create a Future<T>
but do nothing with that object. There is an async computation
happening and without holding and checking the Future for done
or exception the caller has no idea if it has completed.

FS doesn't really care about these StreamGobblers finishing.
Instead use Runnable with execute(Runnable), which doesn't
return a Future.

Change-Id: I93b66d1f6c869e66be5c1169d8edafe781e601f6
2017-01-21 09:44:14 +01:00
Matthias Sohn f503a9f5b7 Add missing @since tags on new API constants
Change-Id: Ia8b861da07fba99644ccc9eb5578a46cc39600a1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-19 23:56:25 +01:00
James Melvin 91132bb05e gc: Add options to preserve and prune old pack files
The new --preserve-oldpacks option moves old pack files into the
preserved subdirectory instead of deleting them after repacking.

The new --prune-preserved option prunes old pack files from the
preserved subdirectory after repacking, but before potentially
moving the latest old packfiles to this subdirectory.

These options are designed to prevent stale file handle exceptions
during git operations, which can affect users of NFS repos when the
repos are repacked. The strategy is to keep old pack files around
until the next repack, in the hope that they will be unreferenced by
then and will not cause exceptions in running processes when they are
finally deleted (pruned).

Change-Id: If3f729f0d9ce920ee2c3e6acdde46f2068be61d2
Signed-off-by: James Melvin <jmelvin@codeaurora.org>
2017-01-19 11:00:18 +01:00
David Ostrovsky e92a0c3adc Implement initial framework of Bazel build
The initial implementation only builds the packages consumed by
Gerrit Code Review.

Test build and execution is not implemented.

We prefer to consume maven_jar custom rule from bazlets repository,
for the same reasons as in the Gerrit project:

* Caching artifacts across different clones and projects
* Exposing source classifiers and neverlink artifact

TEST PLAN:

  $ bazel build :all
  $ unzip -t bazel-genfiles/all.zip
  Archive: bazel-genfiles/all.zip
    testing: libjgit-archive.jar      OK
    testing: libjgit-servlet.jar      OK
    testing: libjgit.jar              OK
    testing: libjunit.jar             OK
  No errors detected in compressed data of bazel-genfiles/all.zip.

Change-Id: Ia837ce95d9829fe2515f37b7a04a71a4598672a0
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-18 19:13:16 -04:00
Wim Jongman b667c182cb Normalizer creating a valid branch name from a string
Generic normalization method for a possibly invalid branch name.
The method compresses dividers between spaces, then replaces spaces
and non word characters with underscores.

This method is needed in preparation for subsequent EGit changes.

Bug: 509878
Change-Id: Ic0d12f098f90f912a45bcc5693d6accf751d4e58
Signed-off-by: Wim Jongman <wim.jongman@remainsoftware.com>
2017-01-18 22:05:28 +01:00
Christian Halstrick 8a46b60371 Merge "Fix StashApplyCommand for stashes containing untracked changes." 2017-01-16 03:45:00 -05:00
Thomas Wolf 46af7192a2 Fix StashApplyCommand for stashes containing untracked changes.
If there are untracked changes, apply only the untracked tree
after a successful merge. The merge tree from merging untracked
with HEAD would also contain files already reset before (changes
in tracked files) and try to reset those again, leading to false
checkout conflicts.

Bug: 505804
Change-Id: Iaced4d277623334d11e3d1cca5969590d7c5093e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2017-01-15 21:54:12 +01:00
Marc Strapetz 1c4b3f8c45 Fix possible InvalidObjectIdException in ObjectDirectory
ObjectDirectory.getShallowCommits should throw an IOException
instead of an InvalidObjectIdException if invalid SHAs are present
in .git/shallow (as this file is usually edited by a human).

Change-Id: Ia3a39d38f7aec4282109c7698438f0795fbec905
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2017-01-15 15:05:51 +01:00
Zhen Chen d6b354f60f Skip pack header bytes in DfsPackFile
The 12-byte `PACK...` header is written in PackWriter before reading
CachedPack files. In DfsPackFile#copyPackBypassCache, the header was not
skipped when the first block is not in cache.

Change-Id: Ibbe2e564d36b79922a936657f286addb1044d237
Signed-off-by: Zhen Chen <czhen@google.com>
2017-01-13 22:10:42 -08:00
Dariusz Luksza 0e187f1484 Add LfsPointerFilter TreeFilter
Add new variation of TreeFilter in order to detect LFS pointer files in
the repository.

Additionally, update LfsPointer to support the legacy version URL [1] as
described in [2], and to allow arbitrary fields in the pointer file.

[1] https://hawser.github.com/spec/v1
[2] https://github.com/git-lfs/git-lfs/blob/master/docs/spec.md

Change-Id: I621eb058619fb1b78888a54c4b60bb110a722fc3
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2017-01-10 00:13:24 +01:00
Shawn Pearce db77610256 Pack refs/tags/ with refs/heads/
This fixes a nasty performance issue for repositories that have many
objects referenced through refs/tags/, but not in refs/heads/.
Situations like this can arise when a project has made releases like
refs/tags/v1.0, and then decides to orphan history and start over for
version 2. The v1.0 objects are not reachable from master anymore,
but are still live due to the v1.0 tag.

When tags are packed in the GC_OTHER pack, bitmaps are not able to
cover the repository's contents. This may cause very slow counting
times during git clone, as the server must enumerate the ancient
history under refs/tags/ to respond to the client.

Clients by default always ask for all tags when asking for all heads
during clone. This has been true since git-core commit 8434c2f1afedb
(Apr 27 2008), when clone was converted to a builtin. Including tags
in the main GC pack should still allow servers to benefit from the
fast full pack reuse path when serving a clone to a client.

Change-Id: I22e29517b5bc6fa3d6b19a19f13bef0c68afdca3
2017-01-03 14:46:41 -08:00
Marc Strapetz 6087031469 Get rid of javax.servlet API dependency for core org.eclipse.jgit
Change-Id: I57d5d4fab7e0b1bd4cf5f1850e8569c8ac5def88
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
2017-01-03 18:50:55 +01:00
Matthias Sohn 5dc30db56e [findBugs] PackWriter.NONE should be final
Change-Id: I4b5621bcb4db82e0560408b3cde6f18b0cc55b29
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-30 01:19:58 +01:00
Matthias Sohn 29ddbf7fcd [findBugs] Remove reliance on default encoding in Base64
Change-Id: I6901da975a86c460ce7c783a519669d8be8e23bb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-29 19:50:29 +01:00
Matthias Sohn f63267134f [findBugs] Fix potential NPE in GC
Change-Id: I59cda76b2c5039e08612f394ee4f7f1788578c49
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-29 00:59:33 +01:00
James Melvin d980a3fa85 Fix keep pack filename
Previously it was looking for a keep file with the name of a pack file
(extension included) appended with a '.keep'. However, the keep file
name should be the pack file name with a '.keep' extension.
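
The two naming rules can be sketched as plain string handling. The
class, the method names and the example file name are hypothetical.

```java
final class KeepFileName {
    private static final String PACK_EXT = ".pack";

    /** Old rule: '.keep' appended to the full pack file name. */
    static String before(String packFileName) {
        return packFileName + ".keep";
    }

    /** New rule: the '.pack' extension is replaced by '.keep'. */
    static String after(String packFileName) {
        String base = packFileName.endsWith(PACK_EXT)
                ? packFileName.substring(0, packFileName.length() - PACK_EXT.length())
                : packFileName;
        return base + ".keep";
    }
}
```

For a pack named pack-1234.pack, before() gives pack-1234.pack.keep and
after() gives pack-1234.keep.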

Change-Id: I9dc4c7c393ae20aefa0b9507df8df83610ce4d42
Signed-off-by: James Melvin <jmelvin@codeaurora.org>
2016-12-27 14:08:56 -07:00
Matthias Sohn 5fee071f6a Prepare 4.7.0-SNAPSHOT builds
Change-Id: I20754d13007e6591d36aae5766f3a9a82b24e120
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-27 01:45:50 +01:00
Matthias Sohn 3857c3168f Prepare 4.6.1-SNAPSHOT builds
Change-Id: I6b05a6f6c3f92365c272e1bdaf76093ca01f2d58
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-24 15:51:54 +01:00
Matthias Sohn 73a4325149 JGit v4.6.0.201612231935-r
Change-Id: Iaa88fe1b195dfe6be99a7b4cb064684e75563715
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-24 01:42:38 +01:00
Matthias Sohn 5274da3c3c Merge branch 'stable-4.5'
* origin/stable-4.5:
  Fix one case of missing object

Change-Id: Ia6384f4be71086d5a0a8c42c7521220f57dfd086
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-24 00:30:00 +01:00
Matthias Sohn 1fb2319c18 [infer] Fix resource leak in IndexDiff
We only need the tree id to add it to a TreeWalk so change tree's type
to AnyObjectId.

Bug: 509385
Change-Id: I98dd5fef15cd173fe1fd84273f0f48e64e12e608
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-21 23:51:50 +01:00
Matthias Sohn 325cb35ccd [infer] Fix resource leak in ObjectChecker
Bug: 509385
Change-Id: I6b6ff5b721d959eb0708003a40c8f97d6826ac46
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-21 00:50:21 +01:00
Matthias Sohn f30fe13ac9 [infer] Fix a resource leak in PackWriter
Bug: 509385
Change-Id: Ic8a82895fa39be73f1bd8427cfe9437be6fc4e3e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-19 08:38:41 +01:00
Matthias Sohn 6cbc99d3ee [infer] Fix resource leaks in DfsInserter
Bug: 509385
Change-Id: Id5dc40bb3fb9da97ea0795cca1f2bcdcde347767
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-19 00:02:43 +01:00
Matthias Sohn a498a2865e [infer] Fix resource leak in ManifestParser
Bug: 509385
Change-Id: Icfe58ac2e5344546448a55ad14ec082356be968c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 23:02:47 +01:00
Matthias Sohn e78626f414 [infer] Fix resource leak in RepoCommand
Bug: 509385
Change-Id: I30c427f0dd2fc1fceb6b003dfdee0a05efaefca9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 23:02:45 +01:00
Matthias Sohn 1779fb4a57 [infer] Fix resource leak in DirCache
Bug: 509385
Change-Id: I5f914c910ef3a7583594fb31c7757d3dddf6a05e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 13:49:37 +01:00
Matthias Sohn aa199ff648 [infer] Fix SubmoduleWalk leaks in submodule commands
Bug: 509385
Change-Id: I4cba81d8ea596800a40799dc9cb763fae01fe508
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 13:49:36 +01:00
Matthias Sohn fbcc2cb4ca [infer] Fix resource leaks in SubmoduleAddCommand
Bug: 509385
Change-Id: I9d25cf117cfb19df108f5fe281232193fd898474
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 13:49:33 +01:00
Matthias Sohn 82344bd7a2 [infer] Fix resource leaks in RebaseCommand
Bug: 509385
Change-Id: I9fbdfda59f7bc577aab55dc92ff897b00b5cb050
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 10:38:24 +01:00
Matthias Sohn 05e8cdf563 [infer] Fix resource leak in BlameCommand
Bug: 509385
Change-Id: Ic57fd3bf940752229e35102e7761823f7d3d8732
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-18 10:38:23 +01:00
Hector Oswaldo Caballero 4ddd4a3d1b Fix one case of missing object
When a repository is being GCed and a concurrent push is received, there
is the possibility of having a missing object. This is due to the fact
that after the list of objects to delete is built, there is a window of
time when an unreferenced and ready to delete object can be referenced
by the incoming push. In that case, the object would be deleted because
there is no way to know it is no longer unreferenced. This will leave
the repository in an inconsistent state and most of the operations fail
with a missing tree/object error.

Given that the incoming push changes the last modified date of the now
referenced object, verify that the object is still a candidate for
deletion before actually performing the delete operation.
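
The re-check can be sketched with an in-memory model in which
modification times are held in maps rather than read from the file
system. This is a simplification for illustration, and all names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PruneRecheck {
    /**
     * Returns the candidates whose modification time is unchanged since
     * the deletion list was built; anything touched in between is kept.
     */
    static List<String> stillDeletable(Map<String, Long> mtimeWhenListed,
            Map<String, Long> mtimeNow) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : mtimeWhenListed.entrySet()) {
            Long now = mtimeNow.get(e.getKey());
            if (now != null && now.equals(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```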

Change-Id: Iadcb29b8eb24b0cb4bb9335b670443c138a60787
Signed-off-by: Hector Oswaldo Caballero <hector.caballero@ericsson.com>
2016-12-13 10:47:05 -05:00
Christian Halstrick 11d24e6844 Fix FileSnapshot.isModified
FileSnapshot.isModified may have reported a file to be clean although it
was actually dirty.

Imagine you have a FileSnapshot on file f. lastModified and lastRead are
both t0. Now time is t1 and you
1) modify the file
2) update the FileSnapshot of the file (lastModified=t1, lastRead=t1)
3) modify the file again
4) wait 3 seconds
5) ask the FileSnapshot whether the file is dirty or not. It erroneously
answered it's clean.

Any file which had been modified more than 2.5 seconds ago was
reported to be clean. As the test shows, that's not always correct.
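
One way to express a safer rule is to trust an unchanged modification
time only when the snapshot was read well after the last write. The
sketch below is an illustration of that idea under the assumption of a
2.5 second margin; it is not the FileSnapshot implementation, and all
names are hypothetical.

```java
final class SnapshotSketch {
    /** Assumed margin, taken from the 2.5 seconds mentioned above. */
    static final long RACY_WINDOW_MILLIS = 2500;

    final long lastModified;
    final long lastRead;

    SnapshotSketch(long lastModified, long lastRead) {
        this.lastModified = lastModified;
        this.lastRead = lastRead;
    }

    /** Reports whether the file must be treated as modified. */
    boolean isModified(long currentLastModified) {
        if (currentLastModified != lastModified) {
            return true; // timestamp moved: definitely modified
        }
        // Same timestamp: only trust it if the snapshot was taken long
        // enough after the write that a later write would have moved it.
        return lastRead - lastModified <= RACY_WINDOW_MILLIS;
    }
}
```

In the scenario above (lastModified = lastRead = t1), this rule keeps
reporting the file as modified no matter how much wall-clock time
passes, because the snapshot itself was taken inside the racy window.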

The real-world problem fixed by this change is the following:
* A gerrit server using JGit to serve git repositories is processing
fetch requests while simultaneously a native git garbage collection
runs on the repo.
* At time t1 native git writes temporary files in the pack folder
setting the mtime of the pack folder to t1.
* A fetch request causes JGit to search for new packfiles and JGit
remembers this scan in a FileSnapshot on the pack folder. Since the gc
is not finished JGit doesn't see any new packfiles.
* The fetch is processed and the gc ends while the filesystem timer is
still t1. GC writes a new packfile and deletes the old packfile.
* 3 seconds later another request arrives. JGit does not yet know about
the new packfile but is also not rescanning the pack folder because it
cached that the last scan happened at time t1 and pack folder's mtime is
also t1. Now JGit will not be able to resolve any object contained in
this new pack. This behavior may be persistent if objects referenced by
the refs/meta/config branch are affected so gerrit can't read permissions
stored in the refs/meta/config branch anymore and will not allow any
pushes anymore. The pack folder will not change its mtime and therefore
no rescan will take place.

Change-Id: I3efd0ccffeb97b01207dc3e7a6b85c6b06928fad
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-12-13 11:28:12 +01:00
Zhen Chen d621305588 Decide whether to "Accept-Encoding: gzip" on a request-by-request basis
When the reply is already compressed (e.g. a packfile fetched using dumb
HTTP), "Content-Encoding: gzip" wastes bandwidth relative to sending the
content raw. So don't "Accept-Encoding: gzip" for such requests.

Change-Id: Id25702c0b0ed2895df8e9790052c3417d713572c
Signed-off-by: Zhen Chen <czhen@google.com>
2016-12-09 16:24:50 -08:00
David Pursehouse 654ae82970 Replace usage of deprecated EWAHCompressedBitmap.add(long)
The add(long) method was deprecated in favor of addWord(long) in
the 0.8.3 release of JavaEWAH [1].

[1] https://github.com/lemire/javaewah/commit/e443cf5e

Change-Id: I89c397ed02e040f57663d04504399dfdc0889626
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-12-07 22:23:31 -04:00
Christian Halstrick 930cd43553 Fix merge-base calculation
Fix JGit's merge-base calculation in case of inconsistent commit times.
JGit was potentially failing to compute correct merge-bases when the
commit times were inconsistent (a parent commit was younger than a
child commit). The code in MergeBaseGenerator was aware of the fact that
sometimes the discovery of a merge base x can occur after the parents of
x have been seen (see comment in #carryOntoOne()). But in the light of
inconsistent commit times it was possible that these parents of a
merge-base have already been returned as a merge-base.

This commit fixes the bug by buffering all commits generated by
MergeBaseGenerator. It is expected that this buffer will be small
because the number of merge-bases will be small. Additionally a new
flag is used to mark the ancestors of merge-bases. This allows filtering
out the unwanted commits.

Bug: 507584
Change-Id: I9cc140b784c3231b972bd2c3de61a789365237ab
2016-11-28 09:38:19 +01:00
Grace Wang fe329f5db4 Specify RevisionSyntaxException message in Repository#resolve
This does not address all cases where no message is specified, only
cases where Repository#isValidRefName returns false.

Change-Id: Ib88cdabfdcdf37be0053e06949b0e21ad87a9575
Signed-off-by: Grace Wang <gracewang92@gmail.com>
2016-11-24 03:56:01 -04:00
Matthias Sohn c6cfe500b5 Add missing @since tags for new API
Change-Id: I900d745195f58c067fadf209bb92cd3c852c59f4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-11-23 23:29:41 +01:00
Zhen Chen 8803718493 dumb HTTP: Avoid being confused by Content-Length of a gzipped stream
TransportHttp sets 'Accept-Encoding: gzip' to allow the server to
compress HTTP responses. When fetching a loose object over HTTP, it
uses the following code to read the response:

       InputStream in = openInputStream(c);
       int len = c.getContentLength();
       return new FileStream(in, len);

If the content is gzipped, openInputStream decompresses it and produces
the correct content for the object. Unfortunately the Content-Length
header contains the length of the compressed stream instead of the
actual content length. Use a length of -1 instead since we don't know
the actual length.

Loose objects are already compressed, so the gzip encoding typically
produces a longer compressed payload. The value from the Content-Length
is too high, producing EOFException: Short read of block.
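
The length mismatch can be reproduced without any HTTP involved. The
following is a minimal sketch using only java.util.zip; the class and
method names are made up for illustration, and random bytes stand in
for an already-compressed loose object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Random;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo: gzip of incompressible data is longer than the data. */
final class GzipLengthSketch {
    /** What a server would send with "Content-Encoding: gzip". */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(raw);
        }
        return buf.toByteArray();
    }

    /** What the client sees after transparent decompression. */
    static byte[] gunzip(byte[] wire) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(wire))) {
            byte[] tmp = new byte[4096];
            int n;
            while ((n = in.read(tmp)) != -1) {
                buf.write(tmp, 0, n);
            }
        }
        return buf.toByteArray();
    }

    /** Random bytes as a stand-in for data that is already compressed. */
    static byte[] incompressible(int size) {
        byte[] data = new byte[size];
        new Random(42).nextBytes(data);
        return data;
    }
}
```

Here gzip(payload).length plays the role of the Content-Length header,
while gunzip(...) yields fewer bytes than that header announces. A
reader that expects exactly Content-Length bytes therefore runs short.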

Change-Id: I8d5284dad608e3abd8217823da2b365e8cd998b0
Signed-off-by: Zhen Chen <czhen@google.com>
Helped-by: Jonathan Nieder <jrn@google.com>
2016-11-22 21:21:24 -04:00
Zhen Chen 5af3f9bd63 Close input stream after use
The InputStream in FileStream in downloadPack is never closed.

Change-Id: I59975d0b8d51f4b3e3ba9d4496b254d508cb936d
Signed-off-by: Zhen Chen <czhen@google.com>
2016-11-22 12:00:42 -08:00
Shawn Pearce 81f9c18433 Define MonotonicClock interface for advanced timestamps
MonotonicClock can be implemented to provide more certainty about
time than the standard System.currentTimeMillis() can provide. This
can be used by classes such as PersonIdent and Ketch to rely on
more certainty about time moving in a strictly ascending order.

Gerrit Code Review can also leverage this interface through its
embedding of JGit and use MonotonicClock and ProposedTimestamp to
provide stronger assurance that NoteDb time is moving forward.
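
As an illustration of what "strictly ascending" can mean in practice,
here is a minimal sketch. It does not reproduce the MonotonicClock
interface or ProposedTimestamp; the class and method names below are
hypothetical, and the approach (never return a value less than or equal
to the previous one) is only one possible strategy.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical clock whose readings never repeat or go backwards. */
final class StrictlyAscendingClock {
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    /** Next timestamp based on the system wall clock. */
    long nextMillis() {
        return nextMillis(System.currentTimeMillis());
    }

    /**
     * Next timestamp for a given wall-clock reading. If the wall clock
     * stalled or stepped backwards, advance by one millisecond past the
     * previously returned value instead.
     */
    long nextMillis(long wallClockMillis) {
        return last.updateAndGet(prev -> Math.max(prev + 1, wallClockMillis));
    }
}
```

Feeding the readings 100, 100, 50, 500 into nextMillis(long) yields
100, 101, 102, 500.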

Change-Id: I1a3cbd49a39b150a0d49b36d572da113ca83a786
2016-11-21 11:34:14 -08:00
Dave Borowitz 5bb434e01f Update JavaEWAH to 1.1.6
Use Oxygen M3 Orbit repository which provides the bundles built using
the new orbit-recipe based build.

CQ: 11658
Change-Id: I7f3dcc966732b32830c75d5daa55383bd028d182
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-11-17 00:26:44 +01:00
Matthias Sohn 52fa09c8d4 Add missing @since tags for new API
Change-Id: Iaf83f66637d6a13e4a6d096ba8529553af7e30ed
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-11-14 16:10:40 -08:00
Shawn Pearce 2685f4b101 Fix cryptoVer constant name to CRYPTO_VER
Change-Id: I46c39f2eceb4d58e49bd6273b87711f35250ab5c
2016-11-14 15:52:43 -08:00
Shawn Pearce 3b2248c5cf RepositoryCache: simplify code
The type parameters can now be inferred when creating
ConcurrentHashMap.

A for loop over the keys of a ConcurrentHashMap doesn't
need to use an Iterator<Map.Entry>; loop syntax handles
this just fine over keySet().
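
The two simplifications look roughly like the sketch below. The class,
the String/Integer types and the evictAll name are placeholders chosen
for illustration, not the RepositoryCache code.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache showing the diamond operator and a keySet() loop. */
final class KeyLoopSketch {
    // Type parameters on the right-hand side are inferred.
    private final ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();

    void put(String key, int value) {
        cache.put(key, value);
    }

    int size() {
        return cache.size();
    }

    /** Removes every entry and returns how many were removed. */
    int evictAll() {
        int removed = 0;
        // ConcurrentHashMap iterators are weakly consistent, so removing
        // entries while looping over keySet() does not throw
        // ConcurrentModificationException.
        for (String key : cache.keySet()) {
            if (cache.remove(key) != null) {
                removed++;
            }
        }
        return removed;
    }
}
```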

Change-Id: I1f85bb81b77f7cd1caec77197f2f0bf78e4a82a1
2016-11-14 15:51:55 -08:00
Shawn Pearce ca4ef2d24b Add missing @Override found by ErrorProne
Change-Id: I585242507815aad4aa0103fd55a6c369e342ab33
2016-11-14 15:46:28 -08:00
Shawn Pearce 8208da2f59 Deprecate SafeBufferedOutputStream
Java 8 fixed the silent flush during close issue by
FilterOutputStream (base class of BufferedOutputStream)
using try-with-resources to close the stream, getting a
behavior matching what JGit's SafeBufferedOutputStream
was doing:

  try {
    flush();
  } finally {
    out.close();
  }

With Java 8 as the minimum required version to run JGit
it is no longer necessary to override close() or have
this class. Deprecate the class, and use the JRE's version
of close.

Change-Id: Ic0584c140010278dbe4062df2e71be5df9a797b3
2016-11-14 15:33:54 -08:00
David Pursehouse 3e52252622 Merge "Support {get,set}GitwebDescription on InMemoryRepository" 2016-11-14 17:54:17 -05:00
Shawn Pearce 71ea0fe567 Support {get,set}GitwebDescription on InMemoryRepository
This simplifies testing for Gerrit Code Review where
application code is updating the repository description
and the test harness uses InMemoryRepository.

Change-Id: I9fbcc028ae24d90209a862f5f4f03e46bfb71db0
2016-11-14 14:40:21 -08:00
David Pursehouse a45cfee7a3 Organize imports
Change-Id: I7c545d06b1bced678c020fab9af1382bc4416b6e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-11-14 13:25:20 -08:00
Shawn Pearce 1c70dd6d21 Add {get,set}GitwebDescription to Repository
This method pair allows the caller to read and modify the description
file that is traditionally used by gitweb and cgit when rendering a
repository on the web.

Gerrit Code Review has offered this feature for years as part of
its GitRepositoryManager interface, but it's fundamentally a feature
of JGit and its Repository abstraction.

git-core typically initializes a repository with a default value
inside the description file. During getDescription() this string
is converted to null as it is never a useful description.

Change-Id: I0a457026c74e9c73ea27e6f070d5fbaca3439be5
2016-11-14 11:14:35 -08:00
Shawn Pearce eeb0705ef3 Merge "Don't serialize internal hash collision chain link" 2016-11-14 13:27:11 -05:00
Shawn Pearce a0bac65233 Merge "Reduce synchronized scope around ConcurrentHashMap" 2016-11-14 13:25:38 -05:00
Jonathan Nieder 7b8a0a28bf Merge "StreamCopyThread: Remove unused AtomicInteger import" 2016-11-13 19:05:50 -05:00
Jonathan Nieder f21233fd0e StreamCopyThread: Remove unused AtomicInteger import
I forgot to do this in 97f3baa0d3
(StreamCopyThread: Remove unnecessary flushCount, 2016-11-13).

Change-Id: Iaed9f345848cf0f854c9d0debcf94bc831d53054
2016-11-13 16:01:16 -08:00
Matthias Sohn 707e4538c2 Merge "Extract insecure Cipher factory" 2016-11-13 19:00:27 -05:00
Jonathan Nieder 2185d84c1a Merge "Get rid of SoftReference in RepositoryCache" 2016-11-13 18:43:57 -05:00
Shawn Pearce 9df75b755f Extract insecure Cipher factory
Bazel runs ErrorProne by default and ErrorProne rightly complains that
allowing the user to specify any Cipher can lead to insecure code
(in particular, getCipher("AES") operates in ECB mode). Unfortunately
this is required to support existing repositories insecurely stored
on S3.
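
Why ECB is a concern can be shown with a small sketch. It assumes the
stock JDK provider, where the bare "AES" transformation resolves to
AES/ECB/PKCS5Padding; the class and method names are made up for this
example.

```java
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical demo: identical plaintext blocks leak through ECB. */
final class EcbSketch {
    /** Encrypts with the provider default mode for "AES" (assumed ECB). */
    static byte[] encryptWithDefaultAes(byte[] key, byte[] plaintext) throws Exception {
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return cipher.doFinal(plaintext);
    }

    /** Compares the first two 16-byte ciphertext blocks. */
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(
                Arrays.copyOfRange(ciphertext, 0, 16),
                Arrays.copyOfRange(ciphertext, 16, 32));
    }
}
```

Encrypting 32 identical plaintext bytes produces two identical
ciphertext blocks, so repeated content is visible to anyone who can
read the stored data.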

Extract the insecure factory code to its own class so this can be built
as a java_library() with this check disabled.

Change-Id: I34f381965bdaa25d5aa8ebf6d8d5271b238334e0
2016-11-13 19:28:45 -04:00
Jonathan Nieder 96941550de StreamCopyThread: flush cannot interrupt a write
Because flush calls interrupt with writeLock held, it cannot interrupt
a write.  Simplify by no longer defending against that.

Change-Id: Ib0b39b425335ff7b0ea1b1733562da5392576a15
2016-11-13 13:35:16 -08:00
Jonathan Nieder 97f3baa0d3 StreamCopyThread: Remove unnecessary flushCount
StreamCopyThread#run consistently interrupts itself whenever it
discovers it has been interrupted by StreamCopyThread#flush while not
reading.  The flushCount is not needed to avoid lost flushes.

All in-tree users of StreamCopyThread never flush.  As a nice side
benefit, this avoids the expense of atomic operations that have no
purpose for those users.

Change-Id: I1afe415cd09a67f1891c3baf712a9003ad553062
2016-11-13 13:32:08 -08:00
Shawn Pearce 6aa126ec42 Merge "Switch JSchSession to simple isolated OutputStream" 2016-11-13 16:13:04 -05:00
Hugo Arès dea47b9363 Get rid of SoftReference in RepositoryCache
Now that RepositoryCache has a time based eviction strategy, get rid
of the strategy to evict cache entries if heap memory is running low,
i.e. soft references. Main reason why time based eviction was
implemented was to offer an alternative to the unpredictable soft
references.

Relying on soft references is not working, especially with a large
heap. The JVM GC will consider collecting soft references as a last
resort before throwing an out of memory error. For example, in an
application like Gerrit configured with a 128GB heap, GC will wait until
all 128GB is filled before collecting the soft references, so the
application will already have been suffering long GC pauses for a long
time. In other words, you will have to restart the application because
it's unusable before JVM eviction kicks in.

Keeping the SoftReference in RepositoryCache is causing more harm than
good. If you use the time based eviction (which is the default strategy)
and want to tune JVM to release soft references more aggressively, it
will release repositories from the cache even though they are not
expired which defeats the purpose of the repository cache.

Gerrit uses the Lucene library, which uses soft references, and this
causes a "memory leak" unless you configure the JVM to release soft
references more aggressively, which has the nasty side effect of
evicting non-expired repositories from the cache.

Change-Id: I9940bd800464c7f007696d0ccde52ea617b2ebce
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
2016-11-13 16:03:02 -04:00
Shawn Pearce 659cd813a9 Switch JSchSession to simple isolated OutputStream
Work around issues with JSch not handling interrupts by
isolating the JSch interactions onto another thread.

Run write and flush on a single threaded Executor using
simple Callable operations wrapping the method calls,
waiting on the future to determine the outcome before
allowing the caller to continue.

If any operation was interrupted the state of the stream
becomes fuzzy at close time. The implementation tries to
interrupt the pending write or flush, but this is very
likely to corrupt the stream object, so exceptions are
ignored during such a dirty close.
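
The approach can be sketched as follows. This is not the JGit
implementation: the class name is hypothetical, and the dirty-close
handling described above is reduced to a plain shutdown of the worker.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stream that runs all I/O on a dedicated worker thread. */
final class IsolatedOutputStreamSketch extends OutputStream {
    private final OutputStream dst;
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    IsolatedOutputStreamSketch(OutputStream dst) {
        this.dst = dst;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] { (byte) b }, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        run(() -> { dst.write(buf, off, len); return null; });
    }

    @Override
    public void flush() throws IOException {
        run(() -> { dst.flush(); return null; });
    }

    @Override
    public void close() throws IOException {
        try {
            run(() -> { dst.close(); return null; });
        } finally {
            worker.shutdownNow();
        }
    }

    /** Submits the operation and blocks the caller until it completes. */
    private void run(Callable<Void> op) throws IOException {
        try {
            worker.submit(op).get();
        } catch (InterruptedException e) {
            // The caller was interrupted, not the thread touching dst.
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while waiting for stream operation");
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        }
    }
}
```

The point of the design is that an interrupt delivered to the calling
thread only aborts the wait on the future; the wrapped stream is
touched by the worker thread alone.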

Change-Id: I42e3ba3d8c35a2e40aad340580037ebefbb99b53
2016-11-13 11:02:29 -08:00
Shawn Pearce 92eab1867d WalkEncryption: Cleanup Java 8 support
Java 8 is now the minimum for JGit, so Java 7-only
code paths are no longer necessary.

Change-Id: I0151625fed4d0da95321ebed5cca648b8c29d5f1
2016-11-13 12:17:20 -04:00