The dash signing plugin has been retired hence we need to update our
build to use the CBI jarsigner plugin for signing build results.
Pack test classes to enable signing them.
Also re-enable pack200 for bundle org.eclipse.jgit.
WORKAROUND: there is no easy way to run tests with maven-surefire-plugin
from signed test-jar so for a quick workaround we will have to add a
build step on Hudson so that we can run tests before signing:
- first step will do "clean, verify" to compile and run tests
- second step will do "install, deploy" with profile "eclipse-sign" and
use -DskipTests=true to skip tests since they would hit a
SecurityException when unsigned test classes are in same package as
signed classes under test
- third step will do "clean, install, deploy" on packaging reactor to
build features and p2 repository with profile "eclipse-sign" to sign
and pack200 all bundles.
TODO: Tycho doesn't support picking up pack200 artifacts via
pomDependencies hence we need to find a way to copy them manually and
use tycho-extra's tycho-p2-extras-plugin:publish-features-and-bundles
to generate the missing p2 metadata.
Change-Id: Iec2c5ab3027a3e3f9ecc0d2f99193385177d9025
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
There was a chance that jgit deletes symbolic links which point to the
folder on top of the working tree. Make sure not to touch these
resources.
Thanks to Cedric Darloy who reported this bug on
http://www.eclipse.org/forums/index.php/m/776910/#msg_776910 and to
Ondrej Vrabec who reported bug 412489.
Bug: 412489
Change-Id: I81735ba0394ef6794e9b2b8bdd8bd7e8b9c6460f
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Robin Stocker <robin@nibor.org>
* It didn't check the first character in the pattern due to an off-by-one
error. Spotted by James Roper.
* It returned true even when pattern was longer than current path, e.g.
it returned that ".txt" is suffix of "txt".
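The two fixes can be illustrated with a minimal sketch. The class and
method names are hypothetical and this is not the actual filter code;
it only shows a suffix comparison that checks every pattern character,
including the first, and rejects patterns longer than the path.

```java
final class SuffixMatchSketch {
    /** Returns true if path ends with pattern (hypothetical helper, not JGit API). */
    static boolean endsWith(String path, String pattern) {
        // A pattern longer than the path can never be a suffix of it.
        if (pattern.length() > path.length())
            return false;
        int offset = path.length() - pattern.length();
        // Compare every character of the pattern, starting at index 0.
        for (int i = 0; i < pattern.length(); i++) {
            if (path.charAt(offset + i) != pattern.charAt(i))
                return false;
        }
        return true;
    }
}
```

With this check, "atxt" no longer matches ".txt" (first character is
compared) and "txt" no longer matches ".txt" (length is compared).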
Bug: 411999
Change-Id: I9fbcd68a11fb57cc49956b70c387a47271a0424f
Signed-off-by: Robin Stocker <robin@nibor.org>
parentFile becomes null when f is relative path, such as ".".
This patch avoids NullPointerException in such case.
Change-Id: I4752674b1daab6eedd7c3650c7749462810eaffd
Signed-off-by: Hiroshi Tomita <tomykaira@gmail.com>
Without update, index is wrongly detected to be dirty
when picking the second commit.
Change-Id: Idf47ecb33e8bd38340d760806d629f67be92d2d5
Signed-off-by: Hiroshi Tomita <tomykaira@gmail.com>
Bug: 411963
The original code could process only one WWW-Authenticate header per
HTTP response. If that header was not one of the two supported
schemes, authentication failed even when the response carried other
WWW-Authenticate headers.
Now all WWW-Authenticate headers in an HTTP response are inspected to
find a supported scheme, i.e. Basic or Digest. If both are present,
Digest is used because it is preferable.
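The selection rule can be sketched as follows. This is a simplified
illustration under stated assumptions, not the transport code itself:
the class name, the method name and the plain list of header values
are hypothetical.

```java
import java.util.List;

final class AuthSchemeSketch {
    /**
     * Scans all WWW-Authenticate header values and returns "Digest" if it
     * is offered, else "Basic" if it is offered, else null.
     */
    static String selectScheme(List<String> wwwAuthenticateValues) {
        String chosen = null;
        for (String value : wwwAuthenticateValues) {
            String trimmed = value.trim();
            int sp = trimmed.indexOf(' ');
            // The scheme is the first token of the header value.
            String scheme = sp < 0 ? trimmed : trimmed.substring(0, sp);
            if (scheme.equalsIgnoreCase("Digest"))
                return "Digest"; // preferred, stop looking
            if (scheme.equalsIgnoreCase("Basic"))
                chosen = "Basic"; // keep looking for Digest
        }
        return chosen;
    }
}
```

Unsupported schemes are skipped instead of aborting the search, which
is the behavioural change described above.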
Bug: 357719
Change-Id: Icf601a41fec63f7d40308f3c85aaa4f71a7c095b
Signed-off-by: Alex Rukhlin <arukhlin@microsoft.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
It's supported by C Git and can be useful.
Bug: 413388
Change-Id: I12c6c10e791cc09ee271d89eb8b8d32f53e385db
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I9754e2124c0fe6ad2dbde5597c3ed10f1c3efef5
Signed-off-by: Lars Vogel <Lars.Vogel@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
With reference hiding, it is possible for a repository to appear
empty when all refs are hidden. This causes capabilities to not be
advertised either, since they are published with the first reference,
breaking fetch by SHA1 support.
Always advertise the capabilities by publishing the symbolic capabilities
reference when the repository has no references to advertise (similar to
the receive service).
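A rough sketch of the advertisement logic, assuming refs are given as
a name-to-id map and lines are returned as plain strings without
pkt-line framing. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class AdvertisementSketch {
    // 40 zeros: the all-zero object id used when there is no real ref.
    static final String ZERO_ID = "0000000000" + "0000000000"
            + "0000000000" + "0000000000";

    /** Capabilities ride on the first line, after a NUL byte. */
    static List<String> advertise(Map<String, String> refs,
            List<String> capabilities) {
        String caps = "\0" + String.join(" ", capabilities);
        List<String> lines = new ArrayList<>();
        if (refs.isEmpty()) {
            // Nothing to advertise: publish the symbolic capabilities
            // reference so the capabilities are still sent.
            lines.add(ZERO_ID + " capabilities^{}" + caps);
            return lines;
        }
        for (Map.Entry<String, String> e : refs.entrySet()) {
            String line = e.getValue() + " " + e.getKey();
            lines.add(lines.isEmpty() ? line + caps : line);
        }
        return lines;
    }
}
```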
Change-Id: I8060e430ee03571dc51239e702864c85e888505c
UploadPack can be invoked with no capabilities selected by the
client if the client is an ancient version of Git that nobody in
their right mind should still be using. Or if the client is very
broken and does not want to use any of the newer features added to
the protocol since its inception.
Change-Id: I3baa6f90e6a41a37a8eab8449a3cc41f4efcb91a
The NullProgressMonitor does not report progress anywhere. Inform the
server not to send progress by enabling the no-progress capability.
Change-Id: Id18dbc754c814d1a5534a284c947030bf201c569
Instead of a RevObject list, this allows a custom request validator to be
called on SHA-1s corresponding to objects that may not exist in repository
storage.
Change-Id: I19bb667beff0d0c144150a61d7a1dc6c9703be7f
Signed-off-by: Greg Hill <greghill@google.com>
This is useful if Git.status() is a long running command.
Change-Id: I6bdbf347a688043d549c1f091fb4a264a6c7024e
Signed-off-by: Christian Trutz <christian.trutz@gmail.com>
Signed-off-by: Robin Stocker <robin@nibor.org>
Can be used for listing remote refs for a repository on the file system
without having a local repository.
Bug: 413400
Change-Id: I397f5092c5eafb62236e9f9e74d9183f56903cc6
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Setting the walk and other fields to null will result in NPEs when the
user e.g. calls fetch on the connection, but at least the advertised
refs can be read like that without having a local repository.
Bug: 413389
Change-Id: I39c8363e81a1c7e6cb3412ba88542ead669e69ed
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Change-Id: I1077dbb1f10c7cc687c0d1b8a8e8f763ca96977c
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Allow filtering of the status. Only files which match given paths are
inspected and only their state is reported.
Change-Id: I3c4b1b46bf297cd4ebdb4997cfa14c8752a36411
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
It had a typo (commited) and was not in the style of the others.
Change-Id: Ia1be1c70b13bb2f3da80c8e8239c5f254070fe60
Signed-off-by: Robin Stocker <robin@nibor.org>
DirCacheCheckout had a bug when the parent directory of a worktree was a
symlink. DirCacheCheckout was deleting those symlinks under certain
conditions. This was fixed in I81735ba0394ef6794e9b2b8bdd8bd7e8b9c6460f
without a test because previously it was hard to set up tests containing
symlinks.
BUG: 412489
Change-Id: I2513166af519d6fc01d1eae3976ad6cff6f98530
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Make the existing concrete implementations public as well so custom
implementations may delegate to them where appropriate. Treat all custom
implementations as providing allow-tip-sha1 in want.
Change-Id: If386fe25c0d3b4551a97c16a22350714453b03e9
Associate each RequestPolicy with an implementation of a
RequestValidator interface that contains the validation logic. The
checkWants method is only called if there are wants that were not
advertised, since clients may always request any advertised want
according to the git protocol. Calling the method only once at the
end of parsing the want list also means policy implementations can be
stateful, unlike the previous switch statement inside a loop.
For the special handling of unidirectional pipes, simply check
isBiDirectional() and delegate to other implementations if necessary.
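The control flow can be sketched like this. The interface below is a
simplified stand-in using strings for object ids; its name and
signature are assumptions for illustration, not the real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class RequestValidatorSketch {
    /** Simplified stand-in for the validator interface (assumed shape). */
    interface Validator {
        void checkWants(List<String> notAdvertisedWants);
    }

    /** Policy that rejects every want that was not advertised. */
    static final Validator ADVERTISED_ONLY = wants -> {
        throw new IllegalArgumentException(
                "want " + wants.get(0) + " not valid");
    };

    /**
     * Collects the wants that were not advertised and calls the validator
     * once, after the whole want list has been parsed. Advertised wants
     * are always allowed, so the validator is skipped if none are left.
     */
    static void parseWants(List<String> wants, Set<String> advertised,
            Validator validator) {
        List<String> notAdvertised = new ArrayList<>();
        for (String want : wants) {
            if (!advertised.contains(want))
                notAdvertised.add(want);
        }
        if (!notAdvertised.isEmpty())
            validator.checkWants(notAdvertised);
    }
}
```

Because the validator sees the complete list in one call, an
implementation can keep state across wants.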
Change-Id: I52a174999ac3a5aca46d3469cb0b81edd1710580
C git 1.8.2 supports setting the equivalent of RequestPolicy.TIP with
uploadpack.allowtipsha1. Parse this into TransportConfig and use it
from UploadPack. An explicitly set RequestPolicy overrides the config,
and the policy may still be upgraded on a unidirectional connection to
avoid races.
Defer figuring out the effective RequestPolicy to later in the
process. This is a minor semantic change to fix a bug: previously,
calling setRequestPolicy(ADVERTISED) _after_ calling
setBiDirectionalPipe(true) would have reintroduced the race condition
otherwise fixed by 01888db892.
Change-Id: I264e028a76574434cecb34904d9f5944b290df78
This protocol capability, new in C git 1.8.2, corresponds to
RequestPolicy.TIP, so advertise it if that request policy was set.
Change-Id: I0d52af8a7747e951a87f060a5124f822ce1b2b26
Users of UploadPack may set a custom RefFilter or AdvertisedRefsHook
that limits which refs are advertised, but clients may learn of a
SHA-1 that the server should have as a ref tip through some
alternative means. Support serving such objects from the server side
with a new RequestPolicy.
As with ADVERTISED, we need a special relaxed RequestPolicy to allow
commits reachable from the set of valid tips for unidirectional
connections.
Change-Id: I0d0cc4f8ee04d265e5be8221b9384afb1b374315
Previously it took 1200ms to create a reverse index (sorted by offset).
Using a simple bucket sort algorithm, that time is reduced to 450ms.
The bucket index into the offset array is kept, in order to decrease
the binary search window.
Don't keep a copy of the offsets. Instead, use nth position
to lookup the offset in the PackIndex.
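A minimal sketch of the idea, assuming a plain long[] stands in for
the PackIndex lookup by nth position. The class, the bucket sizing and
the insertion sort inside each bucket are illustrative choices, not
the actual implementation.

```java
final class ReverseIndexSketch {
    private final long[] offsets;    // stand-in for PackIndex lookup by position
    private final int[] nth;         // index positions, sorted by offset
    private final int[] bucketStart; // bucketStart[b] = first slot of bucket b
    private final long bucketSize;

    ReverseIndexSketch(long[] offsetsByPosition) {
        offsets = offsetsByPosition;
        int n = offsets.length;
        long max = 0;
        for (long o : offsets)
            max = Math.max(max, o);
        int buckets = Math.max(1, n);
        bucketSize = max / buckets + 1;

        // Count entries per bucket, then turn the counts into start slots.
        bucketStart = new int[buckets + 1];
        for (long o : offsets)
            bucketStart[(int) (o / bucketSize) + 1]++;
        for (int b = 1; b <= buckets; b++)
            bucketStart[b] += bucketStart[b - 1];

        // Scatter the positions into their buckets.
        nth = new int[n];
        int[] next = bucketStart.clone();
        for (int pos = 0; pos < n; pos++)
            nth[next[(int) (offsets[pos] / bucketSize)]++] = pos;

        // Buckets are small, so an insertion sort inside each is enough.
        for (int b = 0; b < buckets; b++) {
            for (int i = bucketStart[b] + 1; i < bucketStart[b + 1]; i++) {
                int p = nth[i];
                int j = i - 1;
                while (j >= bucketStart[b] && offsets[nth[j]] > offsets[p]) {
                    nth[j + 1] = nth[j];
                    j--;
                }
                nth[j + 1] = p;
            }
        }
    }

    /** Returns the index position stored at offset, or -1 if none. */
    int findPosition(long offset) {
        if (offset < 0 || offset / bucketSize >= bucketStart.length - 1)
            return -1;
        int b = (int) (offset / bucketSize);
        // The bucket narrows the binary search window.
        int lo = bucketStart[b];
        int hi = bucketStart[b + 1];
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            long o = offsets[nth[mid]];
            if (o < offset)
                lo = mid + 1;
            else if (o > offset)
                hi = mid;
            else
                return nth[mid];
        }
        return -1;
    }
}
```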
Change-Id: If51ab76752622e04a4430d9a14db95ad02f5329d
Currently, the offset can only be retrieved by ObjectId or iterating all
of the entries. Add a method to lookup the offset by position in the
index sorted by SHA1.
Change-Id: I45e9ac8b752d1dab47b202753a1dcca7122b958e
See change I08bed4275af9ec52aa4d7054067ac82f6a3c9781, where fixing such a
warning led to complaints.
If fixing is not wanted, disable it instead.
Change-Id: If31d4028fa1c6377a11e83ed5688b45701cec68b
Introduce a setFilename() method for ArchiveCommand so callers can
specify the intended filename of the produced archive. If the
filename ends with .tar, the format will default to tar; if .zip, zip;
if .tar.gz, gzip-compressed tar; and so on.
This doesn't affect "jgit archive" because it doesn't support the
--output=<file> option yet. A later patch might do that.
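The suffix-to-format defaulting could look roughly like this. The
format labels ("tar", "zip", "tgz") and the helper name are
assumptions made for the sketch; only the three suffixes named above
are covered.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ArchiveFormatGuessSketch {
    // Hypothetical suffix table; the labels on the right are illustrative.
    private static final Map<String, String> SUFFIXES = new LinkedHashMap<>();
    static {
        SUFFIXES.put(".tar.gz", "tgz");
        SUFFIXES.put(".tar", "tar");
        SUFFIXES.put(".zip", "zip");
    }

    /** Returns the format implied by the filename, or null if unknown. */
    static String guessFormat(String filename) {
        for (Map.Entry<String, String> e : SUFFIXES.entrySet()) {
            if (filename.endsWith(e.getKey()))
                return e.getValue();
        }
        return null;
    }
}
```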
Change-Id: Ic0236a70f7aa7f2271c3ef11083b21ee986b4df5
Document archive formats, the archive format interface, and the
parameters of the GitAPIException constructors. Noticed by eclipse.
Reported-by: Dani Megert <Daniel_Megert@ch.ibm.com>
Change-Id: I22b5f9d4c0358bbe867c1906feec7c279e214273
* stable-3.0:
Prepare post 3.0.0-rc2 builds
JGit v3.0.0.201305281830-rc2
Support refspecs with wildcard in middle (not only at end)
Fix multiple bugs in RawSubStringPattern used by MessageRevFilter
Handle short branch/tag name for setBranch in CloneCommand
Add missing Bundle-Localization header
Apply tree filter marks when pairing DiffEntry for renames
Improve feature names to become understandable by end users
Update kepler orbit version to R20130517111416
Fix BatchRefUpdate progress-monitoring so it doesn't count twice
Fix AnyObjectId's generic type declaration of Comparable
Fix DiffFormatter NPEs for DiffEntry without content change
Fix CommitCommand not to destroy repo
Fix the parameters to an exception
Prepare post 3.0.0 M7 builds
JGit v3.0.0.201305080800-m7
Change-Id: Ia8441c9796f01497e0d90e672c0aaf60520a0098
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The following refspec, which can be used to fetch GitHub pull requests,
is supported by C Git but was not yet by JGit:
+refs/pull/*/head:refs/remotes/origin/pr/*
The reason is that the wildcard in the source is in the middle.
This change also includes more validation (e.g. "refs//heads" is not
valid) and test cases.
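To illustrate what matching a wildcard in the middle means, here is a
small sketch. It is not the RefSpec implementation; the helper is
hypothetical, expects the leading "+" to be stripped already, and does
none of the extra validation mentioned above.

```java
final class WildcardRefSpecSketch {
    /**
     * Maps ref through the pair src:dst, where each side contains one '*'.
     * Returns the destination ref name, or null if ref does not match src.
     */
    static String expand(String src, String dst, String ref) {
        int s = src.indexOf('*');
        int d = dst.indexOf('*');
        if (s < 0 || d < 0)
            throw new IllegalArgumentException("both sides need a wildcard");
        String prefix = src.substring(0, s);
        String suffix = src.substring(s + 1);
        // The ref must carry both the part before and the part after '*'.
        if (ref.length() < prefix.length() + suffix.length()
                || !ref.startsWith(prefix) || !ref.endsWith(suffix))
            return null;
        String match = ref.substring(prefix.length(),
                ref.length() - suffix.length());
        return dst.substring(0, d) + match + dst.substring(d + 1);
    }
}
```

For the pull request refspec above, "refs/pull/42/head" would map to
"refs/remotes/origin/pr/42".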
Bug: 405099
Change-Id: I9bcef7785a0762ed0a98ca95a0bdf8879d5702aa
* Match at end of input was not handled correctly.
* When more than one character matched but not all, the next character
was not considered as a match start (e.g. pattern "abab" didn't match
input "abaabab").
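Both cases can be checked against a naive reference matcher like the
sketch below. It is deliberately simple and is not the pattern class
being fixed; the class name is hypothetical.

```java
final class SubstringSearchSketch {
    /** Returns the index of the first occurrence of pattern, or -1. */
    static int indexOf(CharSequence input, CharSequence pattern) {
        // The bound allows a match that ends exactly at the end of input.
        for (int start = 0; start + pattern.length() <= input.length(); start++) {
            int matched = 0;
            while (matched < pattern.length()
                    && input.charAt(start + matched) == pattern.charAt(matched))
                matched++;
            if (matched == pattern.length())
                return start;
            // After a partial match the search resumes at start + 1, so a
            // character inside the failed attempt can begin the next match.
        }
        return -1;
    }
}
```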
Bug: 409144
Change-Id: Ia44682c618bfbb927f5567c194227421d222a160
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Before, it was not clear from the documentation what kind of branch name
was accepted. Users specifying "branch" (instead of "refs/heads/branch")
got no error message and ended up with a repository without HEAD and no
checkout.
With this, CloneCommand now tries "$branch", then "refs/heads/$branch"
and then "refs/tags/$branch". C Git only does the last two, but for
compatibility we should still allow "refs/heads/branch".
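The lookup order can be sketched as follows, assuming the advertised
ref names are available as a set. The helper name is hypothetical.

```java
import java.util.Set;

final class BranchLookupSketch {
    /** Returns the first candidate that exists remotely, or null. */
    static String resolve(String branch, Set<String> advertisedRefs) {
        String[] candidates = {
                branch,                  // e.g. a full "refs/heads/branch"
                "refs/heads/" + branch,  // short branch name
                "refs/tags/" + branch    // short tag name
        };
        for (String candidate : candidates) {
            if (advertisedRefs.contains(candidate))
                return candidate;
        }
        return null;
    }
}
```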
Bug: 390994
Change-Id: I4be13144f2a21a6583e0942f0c7c40da32f2247a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Make call() release all private resources so instead of using a
pattern like
    ArchiveCommand cmd = git.archive();
    try {
        cmd.setTree(tree)
           . ...
           .call();
    } finally {
        cmd.release();
    }
callers can just use git.archive().setTree(tree)....call() directly.
This involves pushing more work out of parameter setters and into
call() so the ObjectReader is not allocated and potentially leaked
before then.
Change-Id: I699f703c6302696e1cc276d7ab8ee597d82f2c5d
Allow use of ArchiveCommand without depending on the jgit command-line
tools.
To avoid complicating the process of installing and upgrading JGit,
this does not add a dependency by the org.eclipse.jgit bundle on
commons-compress. Instead, the caller is responsible for registering
any formats they want to use by calling ArchiveCommand.registerFormat.
This patch puts functionality that requires an archiver into a
separate org.eclipse.jgit.archive bundle for people who want it. One
can use it by calling ArchiveCommand.registerFormat directly to
register its formats or by relying on OSGi class loading to load
org.eclipse.jgit.archive.FormatActivator, which takes care of
registration automatically.
Once the appropriate formats are registered, you can make a tar or zip
from a git tree object as follows:
    ArchiveCommand cmd = git.archive();
    try {
        cmd.setTree(tree).setFormat(fmt).setOutputStream(out).call();
    } finally {
        cmd.release();
    }
Change-Id: I418e7e7d76422dc6f010d0b3b624d7bec3b20c6e
When using a RenameDetector to generate new DiffEntries after using
DiffEntry.scan, the treeFilterMarks of the original entries were lost.
Now it combines the marks from src and dst.
See EGit bug 335082 where this is used.
Change-Id: I72b34b10ca12e3a6bd10ce44f4fa05b193fc52cc
The stream should not throw IllegalStateException if it is off.
Flush the stream after the hook runs, in case any messages need
to be sent ahead of the pack.
Change-Id: I21c7a0258ab1308406d226293fa0e7da69b4f57b
Before transmitting to the client a hook may want to send along
a text message ahead of the pack, such as a "message of the day".
Enable this usage by mirroring the message sending API from
ReceivePack on the UploadPack instance, using the side band.
Change-Id: I31cd254a4ddb816641397a3e9c2c20212471c37f
I was seeing output like this while running The BFG:
Updating references: 200% (374/187)
...issue sneaked in with 5cf53fda I think.
The update call is also moved to the end of the loop, as update() is
only supposed to be called after work has been done ("Denote that some
work units have been completed").
Change-Id: I1620fa75be16dc80df44745d0e123ea512762e31
Signed-off-by: Robin Stocker <robin@nibor.org>
If you look at any implementation of Comparable in the JDK, you'll see
that the type parameter for Comparable is supposed to be the type of
the implementing class:
http://docs.oracle.com/javase/6/docs/api/java/lang/Comparable.html
The current type signature of Comparable<Object> is pretty awful, at the
very least because you can not, in fact, successfully compare
AnyObjectId with any random subclass of Object. It also causes problems
with type-inference and the scala.math.Ordering trait in Scala.
In order to compile, this change *does* require removing the
AnyObjectId.compareTo(Object) method - which actually only ever cast
to AnyObjectId in any case. Nothing in the JGit test suite requires this
method, but it might constitute a breaking API change, so it would be
best if it can be added in time for JGit 3.0.
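A minimal sketch of the typing difference, using a hypothetical
stand-in class rather than AnyObjectId itself:

```java
final class ObjectIdSketch implements Comparable<ObjectIdSketch> {
    private final String hex;

    ObjectIdSketch(String hex) {
        this.hex = hex;
    }

    // The parameter type is the implementing class, so no cast is needed
    // and comparing against an unrelated type fails at compile time.
    @Override
    public int compareTo(ObjectIdSketch other) {
        return hex.compareTo(other.hex);
    }

    String name() {
        return hex;
    }
}
```

With this signature, generic utilities such as Collections.sort can
infer the element ordering without unchecked warnings.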
Change-Id: I3b549a5519ccd6785f98e444da76d2363bcbe41a
DiffEntry.getOldId() returns null for a diff without an index line (e.g.
only mode changed, rename without content change).
Bug: 407743
Change-Id: I42eac87421f2a53c985af260a253338f578492bc
There was a severe bug in CommitCommand which could corrupt
repos. When merging an annotated tag the JGit MergeCommand writes
correctly the ID of the tag (and not the id of the commit the tag was
pointing to) into MERGE_HEAD. Native git does the same. But
CommitCommand was reading this file and trusting blindly that it will
contain only IDs of commits. Then the CommitCommand created a
commit which has as parent a non-commit object (the tag object). That's
so corrupt that even native git gives up when you call "git log" in
such a repo.
To reproduce that with EGit simply right-click on a tag in the
Repository View and select Merge. The result was a corrupt repo!
Bug: 336291
Change-Id: I24cd5de19ce6ca7b68b4052c9e73dcc6d207b57c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
A parenthesis was in the wrong place passing arguments to the wrong
format call. Also fix formatting of enclosing switch statement.
Change-Id: I4cb9642f08b58c39033c3a81dab4bd56bebf4fd2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The comment about legacy Tag and Object types no longer applies,
though prior to Idb273d5a92849b42935ac14eed73b796b80aad50 the field
was still being used by RewriteTreeFilter.
Change-Id: I9ee5da8f8a3b61c9cf543817c03117ee0609dd8f
The various rename detection options are an inherent part of the
filter, similar to the path being followed.
This fixes a potential NPE when a RevWalk with a FollowFilter is
created without a Repository, since the old code path tried to get
the DiffConfig from the RevWalk's possibly-missing repository.
Change-Id: Idb273d5a92849b42935ac14eed73b796b80aad50
The most important difference is that in Java7 we have symbolic links
and for most operations in the work tree we want to operate on the link
itself rather than the link target, which the old File methods generally
do.
We also add support for the hidden attribute, which only makes sense
on Windows, and for exists, simply because there are claims that
Files.exists is faster than File.exists.
A new bundle is only activated when run with a Java7 execution
environment. It is implemented as a fragment.
Tycho currently has no way to conditionally include optional features
based on the Java version used to run the build. This means that with
this change the jgit packaging build always needs to be run using Java 7.
Change-Id: I3d6580d6fa7b22f60d7e54ab236898ed44954ffd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When either --tags or a tag ref is explicitly specified on fetch, C Git
updates existing local tags if they are different.
Before this change, JGit returned REJECTED in such a case. Now it
updates it and returns FORCED.
Example:
% mkdir a
% cd a
% git init -q
% touch test.txt
% git add test.txt
% git commit -q -m 'Initial'
% git tag v1
% cd ..
% git clone -q a b
% cd a
% echo Test > test.txt
% git commit -q -a -m 'Second'
% git tag -f v1
Updated tag 'v1' (was bc85c08)
% cd ../b
% git fetch --tags
- [tag update] v1 -> v1
Bug: 388095
Change-Id: I5d5494c2ad1a2cdb8e9e614d3de445289734edfe
This corresponds to what C Git does, quoting from the fetch man page:
This is done by first fetching from the remote using the given
<refspec>s, and if the repository has objects that are pointed by
remote tags that it does not yet have, then fetch those missing tags.
Before, JGit would also fetch tags that exist locally but point to a
different object, resulting in REJECTED results for these.
Also add some test cases to cover more cases.
Bug: 388095
Change-Id: Ib03d2d82e9c4b60179d626cfd5174be1da6388b2
Also-by: Stefan Lay <stefan.lay@sap.com>
Depending on the order in which items are traversed for RECURSIVE, an
empty directory may come first before detecting that there is a file and
aborting.
This fixes it by traversing files first.
Bug: 405558
Change-Id: I638b7da58e33ffeb0fee172b96f4c823943d29e9
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If HEAD is not present in a repository, a NullPointerException is
thrown in the delete code. Since this code only exists to verify that
the ref being deleted is not the HEAD reference, skip this check if
HEAD cannot be found.
Bug: 406722
Change-Id: I882497202d986096513a4d791cd07fa935a3f9e4
Signed-off-by: Alex Blewitt <alex.blewitt@gmail.com>
JGit doesn't currently use java.util.logging.Logger. Remove this
never-used Logger introduced in ab99b78ca0 (Implement recursive
merge strategy, 2013-02-21) to make that easier to see.
Change-Id: I92c578e7f3617085a667de7c992174057be3eb71
Adds a new method getConflictingStageStates() which returns a
Map<String, StageState> (path to stage state). StageState is an enum for
all possible stage combinations (BOTH_DELETED, ADDED_BY_US, ...).
This can be used to implement the conflict text for unmerged paths in
output of "git status" or in EGit for decorations/hints.
Bug: 403697
Change-Id: Ib461640a43111b7df4a0debe92ff69b82171329c
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Instead of counting objects processed, count number of bytes added
into the window. This should rescale the progress meter so that 30%
complete means 30% of the total uncompressed content size has been
inflated and fed into the window.
In theory the progress meter should be more accurate about its
percentage complete/remaining fraction than with objects. When
counting objects small objects move the progress meter more rapidly
than large objects, but demand a smaller amount of work than large
objects being compressed.
Change-Id: Id2848c16a2148b5ca51e0ca1e29c5be97eefeb48
Instead of assuming all objects cost the same amount of time to
delta compress, aggregate the byte size of objects in the list
and partition threads with roughly equal total bytes.
Before splitting the list select the N largest paths and assign
each one to its own thread. This allows threads to get through the
worst cases in parallel before attempting smaller paths that are
more likely to be splittable.
By running the largest path buckets first on each thread the likely
slowest part of compression is done early, while progress is still
reporting a low percentage. This gives users a better impression of
how fast the phase will run. On very complex inputs the slow part
is more likely to happen first, making a user realize it's time to
go grab lunch, or even run it overnight.
If the worst sections are earlier, memory overruns may show up
earlier, giving the user a chance to correct the configuration and
try again before wasting large amounts of time. It also makes it
less likely the delta compression phase reaches 92% in 30 minutes
and then crawls for 10 hours through the remaining 8%.
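The byte-weighted split can be sketched as below. This is a simplified
greedy partition under stated assumptions: it ignores path boundaries
and the special handling of the N largest paths, and the class and
method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ByteWeightedPartitionSketch {
    /**
     * Splits sizes into at most `threads` contiguous slices with roughly
     * equal total bytes. Returns the exclusive end index of each slice.
     */
    static List<Integer> sliceEnds(long[] sizes, int threads) {
        long remaining = 0;
        for (long s : sizes)
            remaining += s;
        List<Integer> ends = new ArrayList<>();
        int i = 0;
        for (int t = threads; t > 0 && i < sizes.length; t--) {
            // Fair share of the bytes not yet assigned to a slice.
            long target = remaining / t;
            long taken = 0;
            // The last slice (t == 1) takes everything that is left.
            while (i < sizes.length
                    && (t == 1 || taken == 0 || taken < target))
                taken += sizes[i++];
            remaining -= taken;
            ends.add(i);
        }
        return ends;
    }
}
```

For sizes {100, 1, 1, 1, 1} and two threads this yields slices [0,1)
and [1,5), where a split by object count would have put the large
object together with one or two small ones.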
Change-Id: I7621c4349b99e40098825c4966b8411079992e5f
By excluding objects the compactor can avoid storing objects that
are already well packed in the base GC packs, or any other pack
not being replaced by the current compaction operation.
For deltas the base object is still included even if the base exists
in another exclusion set. This favors keeping deltas for recent
history, to support faster fetch operations for clients.
Change-Id: Ie822fe075fe5072fe3171450fda2f0ca507796a1
Use recursive merge as the default strategy since it can successfully
merge more cases than the resolve strategy can. This is also the default
in native Git.
Change-Id: I38fd522edb2791f15d83e99038185edb09fed8e1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Previously, the code assumed all commits in the old pack would also
be present in the new pack. This assumption caused an
ArrayIndexOutOfBoundsException during remapping of ids. Fix the
iterator to only return entries that may be remapped. Furthermore,
update getBitmap() to return null if commit does not exist in the
new pack.
Change-Id: I065babe8cd39a7654c916bd01c7012135733dddf
This fixes some problems with inputs around the size of the internal
buffer in AutoCRLFOutputStream (8000).
Tests supplied by Robin Stocker.
Bug: 405672
Change-Id: I6147897290392b3bfd4040e8006da39c302a3d49
* changes:
Always attempt delta compression when reuseDeltas is false
Avoid TemporaryBuffer.Heap on very small deltas
Correct distribution of allowed delta size along chain length
Split remaining delta work on path boundaries
Replace DeltaWindow array with circularly linked list
Micro-optimize copy instructions in DeltaEncoder
Micro-optimize DeltaWindow primary loop
Micro-optimize DeltaWindow maxMemory test to be != 0
Mark DeltaWindowEntry methods final
If reuseObjects=true but reuseDeltas=false the caller wants to attempt
a delta for every object in the input list. Test for reuseDeltas
to ensure every object passes through the searchInWindow() method.
If no delta is possible for an object and it will be stored whole
(non-delta format), PackWriter may still reuse its content from any
source pack. This avoids an inflate()-deflate() cycle to recompress
the object contents.
Change-Id: I845caeded419ef4551ef1c85787dd5ffd73235d9
TemporaryBuffer is great when the output size is not known, but must
be bound by a relatively large upper limit that fits in memory, e.g.
64 KiB or 20 MiB. The buffer gracefully supports growing storage by
allocating 8 KiB blocks and storing them in an ArrayList.
In a Git repository many deltas are less than 8 KiB. Typical tree
objects are well below this threshold, and their deltas must be
encoded even smaller.
For these much smaller cases avoid the 8 KiB minimum allocation used
by TemporaryBuffer. Instead allocate a very small OutputStream
writing to an array that is sized at the limit.
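A sketch of such a stream, assuming overflow is reported with an
IOException. The class name and the overflow behaviour are
illustrative assumptions, not the actual implementation.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

final class BoundedArrayOutputStreamSketch extends OutputStream {
    private final byte[] buf; // allocated once, exactly at the limit
    private int cnt;

    BoundedArrayOutputStreamSketch(int limit) {
        buf = new byte[limit];
    }

    @Override
    public void write(int b) throws IOException {
        if (cnt == buf.length)
            throw new IOException("limit exceeded");
        buf[cnt++] = (byte) b;
    }

    @Override
    public void write(byte[] src, int off, int len) throws IOException {
        if (len > buf.length - cnt)
            throw new IOException("limit exceeded");
        System.arraycopy(src, off, buf, cnt, len);
        cnt += len;
    }

    int size() {
        return cnt;
    }

    byte[] toByteArray() {
        return Arrays.copyOf(buf, cnt);
    }
}
```

For a delta limited to a few hundred bytes this allocates only that
many bytes instead of a full 8 KiB block.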
Change-Id: Ie25c6d3a8cf4604e0f8cd9a3b5b701a592d6ffca
Nicolas Pitre discovered a very simple rule for selecting between two
different delta base candidates:
- if based on a whole object, must be <= 50% of target
- if at end of a chain, must be <= 1/depth * 50% of target
The rule penalizes deltas near the end of the chain, requiring them to
be very small in order to be kept by the packer. This favors deltas
that are based on a shorter chain, where the read-time unpack cost is
much lower. Fewer bytes need to be consulted from the source pack
file, and less copying is required in memory to rebuild the object.
Junio Hamano explained Nico's rule to me today, and this commit fixes
DeltaWindow to implement it as described.
When no base has been chosen the computation is simply the statements
denoted above. However once a base with depth of 9 has been chosen
(e.g. when pack.depth is limited to 10), a non-delta source may
create a new delta that is up to 10x larger than the already selected
base. This reflects the intent of Nico's size distribution rule no
matter what order objects are visited in the DeltaWindow.
With this patch and my other patches applied, repacking JGit with:
[pack]
reuseObjects = false
reuseDeltas = false
depth = 50
window = 250
threads = 4
compression = 9
CGit (all) 5,711,735 bytes; real 0m13.942s user 0m47.722s [1]
JGit heads 5,718,295 bytes; real 0m11.880s user 0m38.177s [2]
rest 9,809 bytes
The improved JGit result for the head pack is only 6.4 KiB larger than
CGit's resulting pack. This patch allowed JGit to find an additional
39.7 KiB worth of space savings. JGit now also often runs 2s faster
than CGit, despite also creating bitmaps and pruning objects after the
head pack creation.
[1] time git repack -a -d -F --window=250 --depth=50
[2] time java -Xmx128m -jar jgit debug-gc
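The size rule described above can be sketched as a single function.
The linear scaling between the two stated end points is an assumption
of this sketch, as are the names; DeltaWindow may compute the limit
differently in detail.

```java
final class DeltaSizeLimitSketch {
    /**
     * Largest delta worth keeping for a target of targetSize bytes when
     * the candidate base already sits baseDepth deep in a chain limited
     * to maxDepth. Depth 0 means the base is a whole object.
     */
    static long maxDeltaSize(long targetSize, int maxDepth, int baseDepth) {
        if (baseDepth >= maxDepth)
            return 0; // the chain is full, no delta allowed
        // 50% of the target for a whole-object base, shrinking to
        // 1/maxDepth of that for a base at the end of the chain.
        return (targetSize / 2) * (maxDepth - baseDepth) / maxDepth;
    }
}
```

With a target of 1000 bytes and a depth limit of 10, a whole-object
base may produce a delta of up to 500 bytes, while a base at depth 9
may produce only 50 bytes, which matches the 10x ratio mentioned
above.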
Change-Id: I5caec31359bf7248cabdd2a3254c84d4ee3cd96b
When an idle thread tries to steal work from a sibling's remaining
toSearch queue, always try to split along a path boundary. This
avoids missing delta opportunities in the current window of the
thread whose work is being taken.
The search order is reversed to walk further down the chain from
current position, avoiding the risk of splitting the list within
the path the thread is currently processing.
When selecting which thread to split from use an accurate estimate
of the size to be taken. This avoids selecting a thread that has
only one path remaining but may contain more pending entries than
another thread with several paths remaining.
As there is now a race condition where the straggling thread can
start the next path before the split can finish, the stealWork()
loop spins until it is able to acquire a split or there is only
one path remaining in the siblings.
Change-Id: Ib11ff99f90a4d9efab24bf4a85342cc63203dba5
PackWriter generally chooses the order for objects when it builds the
object lists. This ordering already depends on history information to
guide placing more recent objects first and historical objects last.
Allow PackWriter to make the basic ordering decisions, instead of
trying to override them. The old approach of sorting the list caused
DfsReader to override any ordering change PackWriter might have tried
to make when repacking a repository.
This now better matches with WindowCursor's implementation, where
PackWriter solely determines the object ordering.
Change-Id: Ic17ab5631ec539f0758b962966c3a1823735b814
Typical window sizes are 10 and 250 (although others are accepted).
In either case the pointer overhead of 1 pointer in an array or
2 pointers for a doubly linked list is trivial. A doubly linked
list as used here for window=250 is only another 1024 bytes on a
32 bit machine, or 2048 bytes on a 64 bit machine.
The critical search loops scan through the array in either the
previous direction or the next direction until the cycle is finished,
or some other scan abort condition is reached. Loading the next
object's pointer from a field in the current object avoids the
branch required to test for wrapping around the edge of the array.
It also saves the array bounds check on each access.
When a delta is chosen the window is shuffled to hoist the currently
selected base as an earlier candidate for the next object. Moving
the window entry is easier in a doubly linked list than sliding a
group of array entries.
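The data structure can be sketched as follows. Entry and the helper
names are hypothetical and the real window entries carry object data;
the sketch only shows the ring links and the constant-time move.

```java
final class WindowRingSketch {
    static final class Entry {
        final int id;
        Entry prev;
        Entry next;

        Entry(int id) {
            this.id = id;
            prev = this;
            next = this;
        }
    }

    /** Builds a ring of size entries (size >= 1) and returns entry 0. */
    static Entry createRing(int size) {
        Entry head = new Entry(0);
        Entry tail = head;
        for (int i = 1; i < size; i++) {
            Entry e = new Entry(i);
            insertAfter(tail, e);
            tail = e;
        }
        return head;
    }

    static void insertAfter(Entry pos, Entry e) {
        e.prev = pos;
        e.next = pos.next;
        pos.next.prev = e;
        pos.next = e;
    }

    static void unlink(Entry e) {
        e.prev.next = e.next;
        e.next.prev = e.prev;
        e.prev = e;
        e.next = e;
    }

    /** Moves e directly after pos by relinking; no entries are shifted. */
    static void moveAfter(Entry e, Entry pos) {
        if (e == pos || pos.next == e)
            return;
        unlink(e);
        insertAfter(pos, e);
    }
}
```

Scanning is just following next or prev until the starting entry is
reached again, so there is no wrap-around test and no array bounds
check.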
Change-Id: I9ccf20c3362a78678aede0f0f2cda165e509adff
The copy instruction formatter should not compute the shifts and
masks twice. Instead compute them once and assume there is a register
available to store the temporary "b" for compare with 0.
Change-Id: Ic7826f29dca67b16903d8f790bdf785eb478c10d
javac and the JIT are more likely to understand a boolean being
used as a branch conditional than comparing int against 0 and 1.
Rewrite NEXT_RES and NEXT_SRC constants to be booleans so the
code is clarified for the JIT.
Change-Id: I1bdd8b587a69572975a84609c779b9ebf877b85d
Instead of using a compare-with-0, use a does-not-equal-0 test.
javac bytecode has a special instruction for this, as it
is very common in software. We can assume the JIT knows
how to efficiently translate the opcode to machine code,
and processors can do != 0 very quickly.
Change-Id: Idb84c1d744d2874517fd4bfa1db390e2dbf64eac
This class and all of its methods are only package visible.
Clarify the methods as final for the benefit of the JIT to
inline trivial code.
Change-Id: I078841f9900dbf299fbe6abf2599f0208ae96856
* changes:
Increase PackOutputStream copy buffer to 64 KiB
Tighten object header writing in PackOutputStream
Skip main thread test in ThreadSafeProgressMonitor
Declare members of PackOutputStream final
Always allocate the PackOutputStream copyBuffer
Disable CRC32 computation when no PackIndex will be created
Steal work from delta threads to rebalance CPU load
Colby just pointed out to me the buffer was 16 KiB. This may
be very small for common objects. Increase to 64 KiB.
Change-Id: Ideecc4720655a57673252f7adb8eebdf2fda230d
Most objects are written as OFS_DELTA with the base in the pack;
that is why this case comes first in writeHeader(). Rewrite the
condition to always examine this first and cache the PackWriter's
formatting flag for use of OFS_DELTA headers. In modern Git networks
this is true more often than it is false.
Assume the cost of write() is high, especially due to entering the
MessageDigest to update the pack footer SHA-1 computation. Combine
the OFS_DELTA information as part of the header buffer so that the
entire burst is a single write call, rather than two relatively
small ones. Most OFS_DELTA headers are <= 6 bytes, so this rewrite
transforms 2 writes of 3 bytes each into 1 write of ~6 bytes.
Try to simplify the objectHeader code to reduce branches and use
more local registers. This shouldn't really be necessary if the
compiler is well optimized, but it isn't very hard to clarify data
usage to either javac or the JIT, which may make it easier for the
JIT to produce better machine code for this method.
Change-Id: I2b12788ad6866076fabbf7fa11f8cce44e963f35
update(int) is only invoked from a worker thread, in JGit's case
this is DeltaTask. The Javadoc of TSPM suggests update should only
ever be used by a worker thread.
Skip the main thread check, saving some cycles on each run of the
progress monitor.
Change-Id: I6cb9382d71b4cb3f8e8981c7ac382da25304dfcb
These methods cannot be sanely overridden anywhere. Most methods
are package visible only, or are private. A few public methods do
exist but there is no useful way to override them since creation
of PackOutputStream is managed by PackWriter and cannot be delegated.
Change-Id: I12cd3326b78d497c1f9751014d04d1460b46e0b0