Commit Graph

2945 Commits

Author SHA1 Message Date
Shawn Pearce 6a5019f539 Renumber internal ObjectToPack flags
Now that WANT_WRITE is gone renumber the flags to move the unused
bit next to the type. Recluster AS_IS and DELTA_ATTEMPTED to be
next to each other since these bits are tested as a pair.

Change-Id: I42994b5ff1f67435e15c3f06d02e3b82141e8f08
2013-04-04 19:44:41 -07:00
Shawn Pearce 241eed844d Move wantWrite flag to be special offset 1
Free up the WANT_WRITE flag in ObjectToPack by switching the test
to use the special offset value of 1. The Git pack file format
calls for the first 4 bytes to be 'PACK', which means any object
must start at an offset >= 4. Current versions require another 8
bytes in the header, placing the first object at offset = 12.

So offset = 1 is an invalid location for an object, and can be
used as a marker signal to indicate the writing loop has tried
to write the object, but recursed into the base first. When an
object is visited with offset == 1 it means there is a cycle in
the delta base path, and the cycle must be broken.

Change-Id: I2d05b9017c5f9bd9464b91d43e8d4b4a085e55bc
2013-04-04 17:53:01 -07:00
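The marker technique described in 241eed844d can be sketched roughly as follows. This is a minimal illustration, not JGit's PackWriter code: the class `CycleMarkerSketch`, its nested `Obj` type, the `IN_PROGRESS` constant, and the fixed 10-byte object size are all hypothetical names and values chosen for the example. The only facts taken from the commit message are that offset 1 can never be a real object offset and that the first object starts at offset 12.

```java
import java.util.ArrayList;
import java.util.List;

final class CycleMarkerSketch {
    /** Hypothetical stand-in for an object being packed; not JGit's ObjectToPack. */
    static final class Obj {
        final String name;
        Obj deltaBase; // null when the object is stored whole
        long offset;   // 0 = not visited, 1 = write in progress, >= 12 = written

        Obj(String name) {
            this.name = name;
        }
    }

    /** Offset 1 lies inside the pack header, so it is free to act as a marker. */
    static final long IN_PROGRESS = 1;
    static final long FIRST_OBJECT = 12;

    private long next = FIRST_OBJECT;
    final List<String> written = new ArrayList<>();

    void write(Obj o) {
        if (o.offset >= FIRST_OBJECT)
            return; // already written
        if (o.offset == IN_PROGRESS)
            o.deltaBase = null; // visited again while in progress: break the cycle
        o.offset = IN_PROGRESS;
        if (o.deltaBase != null) {
            write(o.deltaBase); // the base must be written before its delta
            if (o.offset >= FIRST_OBJECT)
                return; // a cycle caused this object to be written already
        }
        o.offset = next;
        next += 10; // pretend every object occupies 10 bytes
        written.add(o.name + (o.deltaBase == null ? ":whole" : ":delta"));
    }

    public static void main(String[] args) {
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        a.deltaBase = b;
        b.deltaBase = a; // deliberate cycle in the delta base path
        CycleMarkerSketch w = new CycleMarkerSketch();
        w.write(a);
        System.out.println(w.written);
    }
}
```

Reusing the offset field as the marker is what frees a flag bit: no separate "write wanted" state has to be stored per object.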
Shawn Pearce 1eed78657f Don't delta compress garbage objects
Garbage is randomly ordered and unlikely to delta compress against
other garbage. Disable delta compression allowing objects to switch
to whole form when moving to the garbage pack.

Because the garbage is not well compressed assume deltas were not
attempted during a normal GC cycle.

Override the reuse settings: garbage that can be reused should be
reused as-is into the garbage pack rather than switching something
like the compression level during a GC. It is intended that garbage
will eventually be removed from the repository so expending CPU
time on a compression switch is not worthwhile.

Change-Id: I0e8e58ee99e5011d375d3d89c94f2957de8402b9
2013-04-04 15:25:56 -07:00
Shawn Pearce 56497be34d Delete broken DFS read-ahead support
This implementation has been proven to deadlock in production server
loads. Google has been running with it disabled for quite a while,
as the bugs have been difficult to identify and fix.

Instead of suggesting it works and is useful, drop the code. JGit
should not advertise support for functionality that is known to
be broken.

In a few of the places where read-ahead was enabled by DfsReader
there is more information about what blocks should be loaded when.
During object representation selection, or size lookup, or sending
object as-is to a PackWriter, or sending an entire pack as-is the
reader knows exactly which blocks are required in the cache, and it
also can compute when those will be needed. The broken read-ahead
code was stupid and just read a fixed amount ahead of the current
offset, which can waste IOs if more precise data was available.

DFS systems are usually slow to respond so read-ahead is still
a desired feature, but it needs to be rebuilt from scratch and
make better use of the offset information.

Change-Id: Ibaed8288ec3340cf93eb269dc0f1f23ab5ab1aea
2013-04-04 15:14:23 -07:00
Robin Rosenberg d90656f536 Fix a possible NPE
String.valueOf is overloaded, and the compiler unfortunately picks
the wrong overload since null contains no type information.

Change-Id: Icd197eaa046421f3cfcc5bf3e7601dc5bc7486b6
2013-04-04 18:08:06 -04:00
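The overload problem behind d90656f536 can be reproduced with the standard library alone. The sketch below is an illustration, not the code that the commit changed; the class and method names are hypothetical. An untyped `null` argument resolves to `String.valueOf(char[])`, the most specific applicable overload, and that overload throws when handed null. A value typed as `Object` selects `String.valueOf(Object)`, which is null-safe.

```java
final class ValueOfSketch {
    /** A parameter typed as Object selects valueOf(Object), which is null-safe. */
    static String viaObjectOverload(Object value) {
        return String.valueOf(value);
    }

    /** Calls the overload that a bare String.valueOf(null) resolves to. */
    static boolean charArrayOverloadThrows() {
        try {
            String.valueOf((char[]) null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaObjectOverload(null));
        System.out.println(charArrayOverloadThrows());
    }
}
```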
Shawn Pearce d72416afbb Optimize DFS object reuse selection code
Rewrite this complicated logic to examine each pack file exactly
once. This reduces thrashing when there are many large pack files
present and the reader needs to locate each object's header.

The intermediate temporary list is now smaller; it is bounded to
the same length as the input object list. In the prior version of
this code the list contained one entry for every representation of
every object being packed.

Only one representation object is allocated, reducing the overall
memory footprint to be approximately one reference per object found
in the current pack file (the pointer in the BlockList). This saves
considerable working set memory compared to the prior version that
made and held onto a new representation for every ObjectToPack.

Change-Id: I2c1f18cd6755643ac4c2cf1f23b5464ca9d91b22
2013-04-04 14:21:34 -07:00
Shawn Pearce 93a27ce728 Simplify size test in PackWriter
Clip the configured limit to Integer.MAX_VALUE at the top of the
loop, saving a compare branch per object considered. This can cut
2M branches out of a repacking of the Linux kernel.

Rewrite the logic so the primary path is to match the conditional;
most objects are larger than BLKSZ (16 bytes) and less than limit.
This may help branch prediction on CPUs if the CPU tries to assume
execution takes the first side of the branch and not the second.

Change-Id: I5133d1651640939afe9fbcfd8cfdb59965c57d5a
2013-04-04 11:25:57 -07:00
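The shape of the simplified test in 93a27ce728 might look like the sketch below. It is written under assumptions: `SizeTestSketch` and `countDeltaCandidates` are hypothetical names, and counting stands in for whatever the real loop does with each object. Only the BLKSZ value of 16 and the idea of clipping the limit once before the loop come from the commit message.

```java
final class SizeTestSketch {
    static final int BLKSZ = 16; // block size stated in the commit message

    /** Counts sizes that fall strictly between BLKSZ and the clipped limit. */
    static int countDeltaCandidates(long[] sizes, long configuredLimit) {
        // Clip once, outside the loop, so the loop body needs no separate
        // comparison against Integer.MAX_VALUE for every object.
        final long limit = Math.min(configuredLimit, Integer.MAX_VALUE);
        int candidates = 0;
        for (long sz : sizes) {
            if (BLKSZ < sz && sz < limit)
                candidates++; // common case: taken on the first side of the branch
            // otherwise the object is too small or too large and is skipped
        }
        return candidates;
    }

    public static void main(String[] args) {
        long[] sizes = { 8, 16, 17, 100, 5000 };
        System.out.println(countDeltaCandidates(sizes, 1000));
    }
}
```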
Shawn Pearce d45277a691 Declare critical exposed methods of ObjectToPack final
There is no reasonable way for a subclass to correctly override and
implement these methods. They depend on internal state that cannot
otherwise be managed.

Most of these methods are also in critical paths of PackWriter.
Declare them final so subclasses do not try to replace them,
and so the JIT knows the smaller ones can be safely inlined.

Change-Id: I9026938e5833ac0b94246d21c69a143a9224626c
2013-04-04 11:18:41 -07:00
Shawn Pearce 1d362e35bc Declare internal flag accessors of ObjectToPack final
None of these methods should ever be overridden at runtime by an
extension class. Given how small they are the JIT should perform
inlining where reasonable. Hint this is possible by marking all
methods final so it's clear no replacement can be loaded later on.

Change-Id: Ia75a5d36c6bd25b24169e2bdfa360c8f52b669cd
2013-04-04 11:16:04 -07:00
Shawn Pearce 876a2ffb21 Remove unused method isDeltaAttempted()
This flag is never checked on its own. It is only checked as part
of a pair through the doNotAttemptDelta() method. Delete the method
so there is less confusion about the flag being used on its own.

Change-Id: Id7088caa649599f4f11d633412c2a2af0fd45dd8
2013-04-04 11:04:32 -07:00
Shawn Pearce 594d4ceb12 Simplify setDoNotDelta() to always set the flag
This method is only invoked with true as the argument.
Remove the unnecessary parameter and branch, making
the code easier for the JIT to optimize.

Change-Id: I68a9cd82f197b7d00a524ea3354260a0828083c6
2013-04-04 10:56:42 -07:00
Tomasz Zarna 5453585773 Add the no-commit option to MergeCommand
Also added tests and the associated option for the command line Merge
command.

Bug: 335091
Change-Id: Ie321c572284a6f64765a81674089fc408a10d059
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-04-04 15:11:49 +02:00
Christian Halstrick 81b601de53 Merge "Fix PathFilterGroup not to throw StopWalkException too early" 2013-04-04 03:42:25 -04:00
Christian Halstrick ac0481039d Merge "Indicate initial commit on a branch in the reflog" 2013-04-04 03:41:56 -04:00
Robin Rosenberg c9a94dc1ee Fix PathFilterGroup not to throw StopWalkException too early
Due to the Git internal sort order a directory is sorted as if it ended
with a '/', this means that the path filter didn't set the last possible
matching entry to the correct value. In the reported issue we had the
following filters.

	org.eclipse.jgit.console
	org.eclipse.jgit

As an optimization we throw a StopWalkException when the walked tree
passes the last possible filter, which was this:
	org.eclipse.jgit.console

Due to the git sorting order, the tree was processed in this order:
	org.eclipse.jgit.console
	org.eclipse.jgit.test
	org.eclipse.jgit

At org.eclipse.jgit.test we threw the StopWalkException preventing the
walk from completing successfully.

A correct last possible match should be:
	org.eclipse.jgit/

For simplicity we define it as:
	org/eclipse/jgit/

This filter would be the maximum if we also had e.g. org and org.eclipse
in the filter, but that would require more work so we simply replace all
characters lower than '/' by a slash.

We believe the possible extra walking does not warrant the extra
analysis.

Bug: 362430
Change-Id: I4869019ea57ca07d4dff6bfa8e81725f56596d9f
2013-04-03 14:07:23 -04:00
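The sort-order issue in c9a94dc1ee can be illustrated with plain string comparison. The sketch below is not the PathFilterGroup implementation; `LastMatchSketch` and its methods are hypothetical, and ASCII path names are assumed so that `String.compareTo` agrees with a byte-wise comparison. It shows why a directory's sort key carries a trailing '/', and how replacing every character lower than '/' with '/' gives a bound that no candidate entry can sort past.

```java
final class LastMatchSketch {
    /** Git sorts a tree entry that is a directory as if its name ended in '/'. */
    static String gitSortKey(String name, boolean isTree) {
        return isTree ? name + "/" : name;
    }

    /** Replaces every character lower than '/' with '/' and appends a final '/'. */
    static String upperBound(String path) {
        StringBuilder b = new StringBuilder(path.length() + 1);
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            b.append(c < '/' ? '/' : c);
        }
        return b.append('/').toString();
    }

    /** True when the walk has passed the bound and could stop early. */
    static boolean pastBound(String entryKey, String bound) {
        return entryKey.compareTo(bound) > 0;
    }

    public static void main(String[] args) {
        String test = gitSortKey("org.eclipse.jgit.test", true);
        // With the raw filter as the bound, the walk stops before org.eclipse.jgit.
        System.out.println(pastBound(test, "org.eclipse.jgit.console"));
        // With the adjusted bound, the walk continues.
        System.out.println(pastBound(test, upperBound("org.eclipse.jgit.console")));
    }
}
```

The adjusted bound is deliberately loose: it may let the walk visit a few extra entries, which is the trade-off the commit message accepts.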
Robin Rosenberg 65027d8bb4 Indicate initial commit on a branch in the reflog
Bug: 393463
Change-Id: I4733d6f719bc0dc694e7a6a6ad2092de6364898c
2013-04-02 21:57:17 +02:00
Arthur Baars 35be98fb8f LogCommand.all(): filter out refs that do not refer to commit objects
 1. I have authored 100% of the content I'm contributing,
 2. I have the rights to donate the content to Eclipse,
 3. I contribute the content under the EDL

Change-Id: I48b1828e0b1304f76276ec07ebac7ee9f521b194
2013-03-31 15:36:47 +01:00
Arthur Baars 2b9c440fd1 LogCommand.all(), peel references before using them
Problem:
LogCommand.all() throws an IncorrectObjectTypeException when
there are tag references, and the repository does not contain
the file "packed-refs". It seems that the references were not properly
peeled before being added to the markStart() method.

Solution:
Call getRepository().peel() on every Ref that has isPeeled()==false
in LogCommand.all().

Added test case for LogCommand.all() on repo with a tag.

 1. I have authored 100% of the content I'm contributing,
 2. I have the rights to donate the content to Eclipse,
 3. I contribute the content under the EDL

Bug: 402025
Change-Id: Idb8881eeb6ccce8530f2837b25296e8e83636eb7
2013-03-31 15:36:47 +01:00
Robin Rosenberg 5cf53fdacf Speed up clone/fetch with large number of refs
Instead of re-reading all refs after each update, execute
the deletes first, then read all refs once and perform
the check for conflicting ref names in memory.

Change-Id: I17d0b3ccc27f868c8497607d8e57bf7082e65ba3
2013-03-30 13:36:44 +01:00
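An in-memory conflict check of the kind 5cf53fdacf describes could be sketched as below. This is an assumption-laden illustration rather than JGit's ref database code: `RefConflictSketch` and `conflicts` are hypothetical names, and the rule applied is the usual Git constraint that one ref name may not be a directory prefix of another (for example `refs/heads/a` and `refs/heads/a/b`).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class RefConflictSketch {
    /** Checks a candidate name against a set of ref names that was read once. */
    static boolean conflicts(Set<String> existing, String candidate) {
        // Case 1: an existing ref occupies a leading directory of the candidate.
        for (int i = candidate.indexOf('/'); i >= 0; i = candidate.indexOf('/', i + 1)) {
            if (existing.contains(candidate.substring(0, i)))
                return true;
        }
        // Case 2: the candidate would occupy a directory that holds existing refs.
        String asDirectory = candidate + "/";
        for (String name : existing) {
            if (name.startsWith(asDirectory))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> refs = new HashSet<>(Arrays.asList(
                "refs/heads/master", "refs/heads/a/b"));
        System.out.println(conflicts(refs, "refs/heads/a"));
        System.out.println(conflicts(refs, "refs/heads/ab"));
    }
}
```

Because the set is built once, each additional ref costs only a lookup over memory instead of another scan of the refs on disk.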
Robin Rosenberg 4796fe7043 Merge "When renaming the lock file succeeds the lock isn't held anymore" 2013-03-28 15:57:38 -04:00
Shawn Pearce 1f51aecf95 Fix CommitCommand amend mode to preserve parent order
Change-Id: I476921ff8dfa6a357932d42ee59340873502b582
2013-03-28 13:58:21 -04:00
Andreas König d9d3439617 Fixed parsing of URI with a IPv6-address
Allowed ipv6-address in a uri like:
  http://[::1]:8080/repo.git

Change-Id: Ia00a20f694b2e9314892df77f9b11f551bb1d34e
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
2013-03-27 10:44:13 -04:00
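For comparison, the bracketed IPv6 form from d9d3439617 parses as shown below with `java.net.URI` from the Java standard library. This does not exercise JGit's own URI parsing, which is what the commit fixes; the class name `Ipv6UriSketch` is hypothetical. Note that `java.net.URI` keeps the square brackets in the host it returns.

```java
import java.net.URI;

final class Ipv6UriSketch {
    static String host(String uri) {
        return URI.create(uri).getHost(); // IPv6 literals keep their brackets
    }

    static int port(String uri) {
        return URI.create(uri).getPort();
    }

    static String path(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String uri = "http://[::1]:8080/repo.git";
        System.out.println(host(uri) + " " + port(uri) + " " + path(uri));
    }
}
```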
François Rey 741ecf56b7 New functions to facilitate the writing of CLI test cases
Writing CLI test cases is tedious because of all the formatting and
escaping subtleties needed when comparing actual output with what's
expected. While creating a test case the two new functions are to be
used instead of the existing execute() in order to prepare the correct
command and expected output and to generate the corresponding test code
that can be pasted into the test case function.

Change-Id: Ia66dc449d3f6fb861c300fef8b56fba83a56c94c
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
2013-03-27 09:26:13 -04:00
Matthias Sohn edd47d10b9 Merge "File.renameTo behaves differently on Unix and Windows" 2013-03-27 09:09:04 -04:00
Matthias Sohn b1d191a155 Merge "Extend FileUtils.rename to common git semantics" 2013-03-27 09:03:23 -04:00
Matthias Sohn d059f85c0b When renaming the lock file succeeds the lock isn't held anymore
This wrong book-keeping caused IOExceptions to be thrown because
LockFile.unlock() erroneously tried to delete the non-existing lock
file. These IOExceptions were hidden since they were silently caught.

Change-Id: If42b6192d92c5a2d8f2bf904b16567ef08c32e89
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-26 21:26:07 +01:00
Shawn Pearce 7f1c2ec1eb Always add FileExt to DfsPackDescription
Instead of forcing the implementation of the DFS backend to handle
making sure the extension bits are set correctly, have the common
callers in JGit set the extension at the same time they supply the
file sizes to the pack description. This simplifies assumptions for
an implementation of the DFS backend.

Change-Id: I55142ad8ea08a3e2e8349f72b3714578eba9c342
2013-03-26 14:00:57 -04:00
Robin Rosenberg edf0da9c6e File.renameTo behaves differently on Unix and Windows
On Windows renameTo will not overwrite a file, so it must be deleted
first. The fix for Bug 402834 did not account for that.

Bug: 403685
Change-Id: I3453342c17e064dcb50906a540172978941a10a6
2013-03-26 00:48:44 +01:00
Robin Rosenberg d0e92885e9 Extend FileUtils.rename to common git semantics
Unlike the OS or Java rename, this method will replace (on *nix) or try
to replace (on Windows) the target with the source, provided the target
does not exist, the target exists and is a file, or the target is a
directory which only contains directories. In the latter case the
directory hierarchy will be deleted.
If the initial rename fails and the target is an existing file, the
target file will be deleted first and then the rename is retried.

Change-Id: Iae75c49c85445ada7795246a02ce02f7c248d956
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2013-03-26 00:48:00 +01:00
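The delete-then-retry behaviour described in d0e92885e9 and edf0da9c6e can be approximated with `java.io.File` as in the sketch below. It is a simplified illustration, not the FileUtils.rename implementation: the class `RenameSketch` is hypothetical, and the case where the target is a hierarchy of empty directories is left out on purpose.

```java
import java.io.File;
import java.io.IOException;

final class RenameSketch {
    /** Renames src to dst, replacing dst when it is an existing regular file. */
    static void rename(File src, File dst) throws IOException {
        if (src.renameTo(dst))
            return; // first attempt succeeded
        // The first attempt can fail when dst already exists (as the commit
        // message reports for Windows): delete the target file, then retry once.
        if (dst.isFile() && dst.delete() && src.renameTo(dst))
            return;
        throw new IOException("cannot rename " + src + " to " + dst);
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("rename-src", ".tmp");
        File dst = File.createTempFile("rename-dst", ".tmp");
        rename(src, dst);
        System.out.println(src.exists() + " " + dst.exists());
        dst.delete();
    }
}
```

Throwing an IOException instead of returning false keeps a failed rename from going unnoticed by the caller.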
Christian Halstrick e7669c44e3 Merge "Add tests for FileUtils.delete and EMPTY_DIRECTORIES_ONLY" 2013-03-24 18:59:31 -04:00
Matthias Sohn 6b3515c3ee Update build to Tycho 0.17
Change-Id: I92c9757a37644ec48ed1d785f4dacd6c44276632
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-24 04:31:23 +01:00
Robin Rosenberg 7aa54967a2 Add tests for FileUtils.delete and EMPTY_DIRECTORIES_ONLY
Change-Id: I54a46c29df5eafc7739a6ef29e5dc80fa2f6d9ba
2013-03-24 00:49:23 +01:00
Matthias Sohn 9b63f32441 SimpleHttpServer API shouldn't expose internals
Change-Id: I5963ae720f33cb148de08b4c64d02c81d6791139
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-22 21:36:10 +01:00
Matthias Sohn a040a8f127 Grant access to jgit internals to junit and http.server bundles
Change-Id: Ib34f9635b4d060f5d17a6c823ec91af1d934a180
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-22 21:35:16 +01:00
Matthias Sohn dd6f41e401 Add missing @since tags
Change-Id: I6b20d78e6bd1f245fdca331554c106f8bae44b9c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-22 21:21:07 +01:00
Tomasz Zarna 48f30b8614 Fix @since tags in JGit, version 2.4 never existed
Change-Id: Iaca88ec28b412e6b58e7b39a0762ba54b25f9471
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-21 18:03:20 -04:00
Shawn Pearce 9aee4e0a26 Merge changes If98b0b97,I7c9c09b4
* changes:
  Add convenience factory method for most used builder pattern
  Don't use internal type FileRepository in public API
2013-03-21 03:52:33 -04:00
André Dietisheim a31920555f Allow users to show server messages while pushing
Allow users to provide their OutputStream (via Transport#
push(monitor, refUpdates, out)) so that server messages can be written
to it (in SideBandInputStream) while they're coming in.

CQ: 7065
Bug: 398404
Change-Id: I670782784b38702d52bca98203909aca0496d1c0
Signed-off-by: Andre Dietisheim <andre.dietisheim@gmail.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-21 00:30:30 +01:00
Matthias Sohn 8fcde4b31b Don't verify host name when sslVerify is false
Native git also doesn't verify host names when http.sslVerify=false.
See native git's commit a5ccc597.

See: http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02047.html
Change-Id: I42f509fea8e4ac89fad646aec3dfbf1753ae7e3d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-19 20:39:38 -04:00
Matthias Sohn 82abba56e5 Fix line endings and whitespace errors in jgit feature
Change-Id: I9fc69fccedf362453f74f1e09d2b50ac705a9cac
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-20 01:27:02 +01:00
Edwin Kempin e02708a8b3 Fix formatting of PackConfig.toString() & GC.RepoStatistics.toString()
Change-Id: I7e0c74ecfd0e0615d10fb582b2897d33be23440a
Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-20 00:48:30 +01:00
Edwin Kempin b37b1c9165 Allow to get repo statistics from GarbageCollectionCommand before gc
When running the garbage collection for a repository it is often
interesting to compare the repository statistics from before and after
the garbage collection to understand the effect of the garbage
collection. This is why it makes sense that the
GarbageCollectionCommand provides a method to retrieve the repository
statistics before running the garbage collection.

So far without running the garbage collection the repository statistics
can only be retrieved by using JGit internal classes. This is what EGit
and Gerrit do at the moment, but it would be better to have an API for
this.

Change-Id: Id7e579157e9fbef5cfd1fc9f97ada45f0ca8c379
Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-20 00:46:27 +01:00
Matthias Sohn 38cac0acf3 Add convenience factory method for most used builder pattern
This will simplify adapting EGit to the removal of FileRepository from
jgit's public API in change I2ab1327c202ef2003565e1b0770a583970e432e9.

Change-Id: If98b0b97e8f13a94d4ea7ba1be0f90d82b0fba4b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-20 00:44:24 +01:00
Matthias Sohn d35586a431 Don't use internal type FileRepository in public API
Change-Id: I7c9c09b4f190fa7cb830563bcdf2071407ee2ce0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-20 00:44:24 +01:00
Matthias Sohn 509c0b58ee Merge "Fix GC for FileRepo in case packfile renames fail" 2013-03-19 13:07:47 -04:00
Shawn Pearce 60f5f46550 Fix location of DfsText.properties
The file was not moved when the package was renamed to internal.

Change-Id: I29a078d6316daa4e4407db9ecedc8b7ed05535cd
2013-03-19 07:16:48 -07:00
Christian Halstrick bd5e4eabc2 Fix GC for FileRepo in case packfile renames fail
Only on Windows, the rename operation which renames temporary Packfiles
(and index-files and bitmap-files) sometimes fails. This happens only
when renaming a temporary Packfile to a Packfile which already exists.
Such situations occur if you run GC twice on a repo without modifying
the repo in between.

In such situations there was a bug in GC which led to a corrupted repo
without any packfiles anymore. This commit fixes the problem by
introducing a utility method which renames a file and throws an
IOException if it fails. This method also takes care to repeat a
failing rename if our FS class has found out we are running on a
platform with an unreliable File.renameTo() method.

I am searching for a better solution because even with this utility
method in hand a GC on an already GC'ed repo will fail on Windows. But
at least with this fix we will not produce corrupted repos anymore.

Bug: 389305
Change-Id: Iac1ab3e0b8c419c90404f2e2f3559672eb8f6d28
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2013-03-19 14:28:24 +01:00
Christian Halstrick 67b98d5d40 Make GC more robust against corrupt reflogs
With JGit it is possible to write reflog entries where new objectid and
old objectid are null. Such reflogs cause FileRepository GC to crash
because it doesn't expect the new objectid to be null. One case where
this happened is in Gerrit's allProjects repo. In the same way as we
expect the old objectid to be potentially null we should also ignore
null values in the new objectid column.

Change-Id: Icf666c7ef803179b84306ca8deb602369b8df16e
2013-03-19 11:23:45 +01:00
Shawn Pearce f32b861243 JGit 3.0: move internal classes into an internal subpackage
This breaks all existing callers once. Applications are not supposed
to build against the internal storage API unless they can accept API
churn and make necessary updates as versions change.

Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
2013-03-18 09:30:43 -07:00
Shawn Pearce 462bbc052e Merge changes I2645d482,Ic81fefb1,Id64ab38d
* changes:
  Remove cached_packs support in favor of bitmaps
  Remove objects before optimization from DfsGarbageCollector
  Simplify caching of DfsPackDescription from PackWriter.Statistics
2013-03-18 10:35:31 -04:00