Commit Graph

1360 Commits

Author SHA1 Message Date
Dave Borowitz dff9d56b94 Expose the list of pack files in the DfsBlockCache
Callers may want to inspect the contents of the cache, which this allows
them to do in a read-only fashion without any locking.

Change-Id: Ifd78e8ce34e26e5cc33e9dd61d70c593ce479ee0
2011-11-04 11:14:32 -07:00
Dave Borowitz 35d72ac806 Add a DFS repository description and reference it in each pack
Just as DfsPackDescription describes a pack but does not imply it is
open in memory, a DfsRepositoryDescription describes a repository at a
basic level without it necessarily being open.

Change-Id: I890b5fccdda12c1090cfabf4083b5c0e98d717f6
2011-11-04 11:14:32 -07:00
Dave Borowitz 5a38e5b440 Clarify the docstring of DfsBlockCache.reconfigure()
The docstring was copied from the local filesystem cache code, which
actually attempted to reconfigure the cache on the fly. The DFS cache is
designed to be "reconfigured" exactly once.

Change-Id: Ia0b01f5d6b6b3d3a68d65a5c229ff67c1cede5bc
2011-11-04 11:14:32 -07:00
Shawn O. Pearce fa4cc2475f DFS: A storage layer for JGit
In practice the DHT storage layer has not been performing as well as
large scale server environments want to see from a Git server.

The performance of the DHT schema degrades rapidly as small changes
are pushed into the repository due to the chunk size being less than
1/3 of the pushed pack size.  Small chunks cause poor prefetch
performance during reading, and require significantly longer prefetch
lists inside of the chunk meta field to work around the small size.

The DHT code is very complex (>17,000 lines of code) and is very
sensitive to the underlying database round-trip time, as well as the
way objects were written into the pack stream that was chunked and
stored on the database.  A poor pack layout (from any version of C Git
prior to Junio reworking it) can cause the DHT code to be unable to
enumerate the objects of the linux-2.6 repository in a completable
time scale.

Performing a clone from a DHT stored repository of 2 million objects
takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row
for each object being cloned. This is very difficult for some DHTs to
scale, even at 5000 rows/second the lookup stage alone takes 6 minutes
(on local filesystem, this is almost too fast to bother measuring).
Some servers like Apache Cassandra just fall over and cannot complete
the 2 million lookups in rapid fire.

On a ~400 MiB repository, the DHT schema has an extra 25 MiB of
redundant data that gets downloaded to the JGit process, and that is
before you consider the cost of the OBJECT_INDEX table also being
fully loaded, which is at least 223 MiB of data for the linux kernel
repository.  In the DHT schema answering a `git clone` of the ~400 MiB
linux kernel needs to load 248 MiB of "index" data from the DHT, in
addition to the ~400 MiB of pack data that gets sent to the client.
This is 193 MiB more data to be accessed than the native filesystem
format, but it needs to come over a much smaller pipe (local Ethernet
typically) than the local SATA disk drive.

I also never got around to writing the "repack" support for the DHT
schema, as it turns out to be fairly complex to safely repack data in
the repository while also trying to minimize the amount of changes
made to the database, due to very common limitations on database
mutation rates..

This new DFS storage layer fixes a lot of those issues by taking the
simple approach for storing relatively standard Git pack and index
files on an abstract filesystem. Packs are accessed by an in-process
buffer cache, similar to the WindowCache used by the local filesystem
storage layer. Unlike the local file IO, there are some assumptions
that the storage system has relatively high latency and no concept of
"file handles". Instead it looks at the file more like HTTP byte range
requests, where a read channel is a simply a thunk to trigger a read
request over the network.

The DFS code in this change is still abstract, it does not store on
any particular filesystem, but is fairly well suited to the Amazon S3
or Apache Hadoop HDFS. Storing packs directly on HDFS rather than
HBase removes a layer of abstraction, as most HBase row reads turn
into an HDFS read.

Most of the DFS code in this change was blatently copied from the
local filesystem code. Most parts should be refactored to be shared
between the two storage systems, but right now I am hesistent to do
this due to how well tuned the local filesystem code currently is.

Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
2011-11-04 11:08:20 -07:00
Shawn Pearce 1783c8a831 Merge "Allow '\' in user names in URI-ish" 2011-11-04 13:10:47 -04:00
Robin Rosenberg afd4f3b0cf Allow '\' in user names in URI-ish
Actually this is not ok according to the RFC, but this implementation is
ment to be Git compatible. A '\' is needed when the authentication
requires or allows authentication to a Windows domain where the
user name can be specified as DOMAIN\user.

Change-Id: If02f258c032486f1afd2e09592a3c7069942eb8b
2011-11-04 17:54:43 +01:00
Shawn Pearce ede88c60a5 Merge "Provide an id for submodule entries." 2011-11-04 10:14:46 -04:00
Carl Myers 85a9ab7410 Fix NPE when PATH environment variable is empty
Change-Id: Ic27d509cd5e2d6c855e7d355fc308399d9dc01c9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-11-04 14:42:12 +01:00
Kevin Sawicki 931b931ee8 Provide an id for submodule entries.
Open a repository for submodule entries that have a child .git
directory and use the resolved HEAD commit as the entry's id.

Change-Id: I68d6e127f018b24ee865865a2dd3011a0e21453c
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-11-04 08:14:53 +01:00
Shawn Pearce 2efbcb7e44 Merge "Implement Config.Entry.toString() to help debugging" 2011-11-03 16:19:17 -04:00
Shawn Pearce c2e828abd6 Merge "DirCacheEntry: accessors for cached creation time (CTIME)" 2011-11-03 16:18:43 -04:00
Kevin Sawicki 5041f738e9 Suppress unused and unchecked warnings
Change-Id: I9f51cc749f5cb9d2e3aa86874e60fca29b779565
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-11-03 11:03:01 +01:00
Marc Strapetz bf81119e62 DirCacheEntry: accessors for cached creation time (CTIME)
Change-Id: I986d5fff63ff1a86cca6bab49c744ea673fe4892
2011-10-31 17:45:51 +01:00
Shawn O. Pearce 9f81c6813a Merge "Ensure the ObjectInserter flushes after a merge" 2011-10-28 21:14:26 -04:00
Robin Rosenberg 3ceb4fac23 Do not resolve path using cygwin unless told to
The system property jgit.cygpath must be set to true in order
for cygwin's cygpath to be used to translate path from cygwin
namespace to Windows namespace.

The cygwin path translation should be considered deprecated.

Bug: 353389
Change-Id: I2b5234c0ab936dac67d1e232f4cd28331bf3226d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-10-28 14:58:32 +02:00
Matthias Sohn a5f72d6b3b Implement Config.Entry.toString() to help debugging
Change-Id: I86f6359d955d39ab033848b87ed39d20378d3c1f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-10-27 22:55:59 +02:00
Shawn Pearce b6281cac01 Merge "Enable full Transport configuration for JGit API commands" 2011-10-27 10:25:18 -04:00
Shawn O. Pearce b24a61272a Ensure the ObjectInserter flushes after a merge
If this does not happen some databases may discard
objects and not make them available.

Change-Id: I347b3c3724db52c8a6c09f4804071497a3a377ab
2011-10-26 20:34:52 -07:00
Matthias Sohn 34f678643c Merge changes I488e9c97,I30f1049f,I1c088dce
* changes:
  Cosmetic adjustment of relative date format, do not display "0 months"
  Make use of the many date formatting options in the log command
  Define a utility class for handling Git date formats
2011-10-26 17:29:23 -04:00
Robin Rosenberg 57bdb04873 Cosmetic adjustment of relative date format, do not display "0 months"
Though it may seem less precise, "0 months" looks bad and the reference
Git implementation also does not display "0 months"

Change-Id: I488e9c97656f9941788ae88d7c5c1562ab6c26f0
2011-10-26 23:15:28 +02:00
Matthias Sohn 66cb4ac902 Merge "Allow detecting which files were renamed during a revwalk" 2011-10-26 16:18:21 -04:00
Carsten Pfeiffer 98d4bd6d36 Allow detecting which files were renamed during a revwalk
The egit history view shows the files associated with a commit by using
a PathFilter. When following renames with a FollowFilter, the PathFilter
cannot be configured anymore because the affected files are simply not
known.

Thus, it should be possible to get to know which files are renamed.

Bug: 302549
Change-Id: I4761e9f5cfb4f0ef0b0e1e38991401a1d5003bea
2011-10-25 09:22:11 +02:00
Robin Rosenberg 63bb6ff06c Fix compatibilty breakage for SystemReader
Introducing a new abstract method is not nice when one
expects other to subclass them. Create default implementations
so old code that implements SystemReader does not break.
The default methods just delegate to the JVM.

Change-Id: I42cdfdcb6b29f7203697a23833dca85185b0b9b3
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-10-23 22:53:17 +02:00
Robin Rosenberg f4460dda97 Define a utility class for handling Git date formats
Besides the formats known by git-log(1) we also add "locale" 
and "localelocal" that formats dates according to the user's locale.
"locale" does not translate into local timezone, while
localelocal does.

Change-Id: I1c088dcec992c107e43f6c17be4ac9ed6eb428bf
2011-10-23 01:51:30 +02:00
Robin Rosenberg 3a4fa52723 Add locale to the properties manageable by SystemReader
Change-Id: I5e9af40d38bb671cb9fcdb0fa3b4eb3af5f36f6c
2011-10-20 23:49:52 +02:00
Robin Rosenberg 06b183f9b7 Add a method to SystemReader to get the time zone
Change-Id: Ifd31f408ed2c5b7869694b715fea3219e74963ef
2011-10-20 23:49:51 +02:00
Robin Rosenberg fb68c7a4cd Use the SystemReader to get system time
Change-Id: Ib79c0cc964bfe799b204419e552b9aa6243966ce
2011-10-20 23:49:51 +02:00
Robin Rosenberg 2e43dcd645 Fix bad checkout behaviour when a file is removed
We deleted the entry if there was a file and an index
entry, but not when there was just an index entry. Now
delete the file in both cases since the missing file
just means our worktree is dirty. This affected the
implementation of reset --hard.

Bug: 347574
Change-Id: Ie66fa61303472422830f5e33614e93ad65094e5d
2011-10-18 22:32:47 +02:00
Kevin Sawicki 86e96b41e2 Correct typo in RevWalk.parseBody comment
Change-Id: I0e65a5a6809a8d32d256322dbcae94b6aa603e5e
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-10-14 08:49:17 -07:00
Christian Halstrick a66dea1451 Merge "Extend IndexDiff to calculate ignored files and folders" 2011-10-10 09:01:49 -04:00
Shawn O. Pearce cc03e27093 Merge changes I7cdb563b,I7f60ae68,I7bd1e769,I92683805,I0e51a8e6
* changes:
  UploadPack: Fix races in smart HTTP negotiation
  PackWriter: Export more statistics
  Do not requeue state vector in stateless RPC fetch
  Wrap excessively long line in BasePackFetchConnection
  Fix smart HTTP client stream alignment errors
2011-10-07 16:00:21 -04:00
Jens Baumgart 6befabcb15 Extend IndexDiff to calculate ignored files and folders
IndexDiff was extended to calculate ignored files and folders.
The calculation only considers files that are NOT in the index.
This functionality is required by the new EGit decorator implementation.

Bug: 359264
Change-Id: I8f09d6a4d61b64aeea80fd22bf3a2963c2bca347
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2011-10-05 13:56:23 +02:00
Christian Halstrick f99ce8d6ff Merge "Fix DirCacheEdtor.DeleteTree for empty string argument" 2011-10-05 06:21:01 -04:00
Kevin Sawicki 565e4a06ef Add missing comment text for mergeCommitTree parameter
Change-Id: I35cef13d8be4f06515668f710fd508700b90f44d
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-10-04 11:39:51 +02:00
Robin Rosenberg 602c869d7a Do not attempt to resolve describe-labels with less than four digits
Change-Id: I21dcd3cca3b41102fd898238d8d640dea25e0caf
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-10-03 00:39:50 +02:00
Robin Rosenberg 1570aa9e5c Fix DirCacheEdtor.DeleteTree for empty string argument
Change-Id: I7425da91c0752ae82484e3c29d21b57402d30c61
2011-10-01 16:35:00 +02:00
Kevin Sawicki 654f7235ec Add varargs version of PathFilterGroup.createFromStrings
This allows the following usage pattern:

  PathFilterGroup.createFromStrings("path1", "path2");

Change-Id: I589e758cc55873ce75614602e017ac793435e24d
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-09-30 15:00:53 -07:00
Manuel Doninger 458b5a4042 New config constant for default start-point
This constant determine the default start-point, if the user
don't want to create a branch from the current HEAD.

Change-Id: Iea944e11e80134fbafc4c47383457d5ed11a4164
Signed-off-by: Manuel Doninger <manuel.doninger@googlemail.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-09-30 14:42:44 -07:00
Matthias Sohn 0db0476542 Fire IndexChangedEvent on DirCache.commit()
Since we replaced GitIndex by DirCache JGit didn't fire
IndexChangedEvents anymore. For EGit this still worked with a high
latency since its RepositoryChangeScanner which is scheduled to
run each 10 seconds fires the event in case the index changes.
This scanner is meant to detect index changes induced by a different
process e.g. by calling "git add" from native git.

When the index is changed from within the same process we should fire
the event synchronously. Compare the index checksum on write to index
checksum when index was read earlier to determine if index really
changed. Use IndexChangedListener interface to keep DirCache decoupled
from Repository.

Change-Id: Id4311f7a7859ffe8738863b3d86c83c8b5f513af
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-30 00:00:22 +02:00
Kevin Sawicki e630f91305 Remove TODO for generated constructor.
Change-Id: Ie405f6de99b8fa632d7462400e647a37f30e2e31
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-09-28 14:10:24 -07:00
Christian Halstrick 1230d353d8 Fix status in index entries after checkout of paths
The checkout command was producing an inconsistent state of the index
which even confuses native git. The content sha1 of the touched index
entries was updated, but the length and the filemode was not updated.
Later in coding the index entries got automatically corrected (through
Dircache.checkoutEntry()) but the correction was after persisting the
index to disk. So, the correction was lost and we ended up with an index
where length and sha1 don't fit together.
A similar problem is fixed with "lastModified" of DircacheEntry. When
checking out a path without specifying an explicit commit (you want to
checkout what's in the index) the index was not updated regarding
lastModified. Readers of the index will think the checked-out
file is dirty because the file has a younger lastmodified then what's
in the index.

Change-Id: Ifc6d806fbf96f53c94d9ded0befcc932d943aa04
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Bug: 355205
2011-09-28 12:34:32 +02:00
Christian Halstrick d306c35cea Merge "Fix DirCache,getEntriesWithin for empty string arguemnt" 2011-09-26 04:27:54 -04:00
Robin Rosenberg 23bba6fb48 Merge "Remove duplicate calls to DirCache.unlock on checkout" 2011-09-25 04:44:45 -04:00
Robin Rosenberg b4112c1748 Fix DirCache,getEntriesWithin for empty string arguemnt
Change-Id: I0bea130df611de3ef8c9251093b11c62b5442cd1
2011-09-25 00:17:19 +02:00
Robin Rosenberg a3d7056b45 Merge "Remove use of GitIndex to detect index changes" 2011-09-22 12:24:37 -04:00
Robin Rosenberg 39ad503fcb Append merge strategy to reflog message
Change-Id: Ia0e73208b86c45a3d96698e973f6e70ec5cb7303
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-21 23:36:46 +02:00
Robin Rosenberg 4f4e468f6f Fix the reflog prefix for cherry-pick, revert and merge commands
We should see whether the commit was a regular commit or something
else.

Change-Id: I82d8300cf3c53cb2bdcb6495386aadb803e0c6f7
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-21 23:36:26 +02:00
Roberto Tyley 791a9fd691 Enable full Transport configuration for JGit API commands
Add a TransportConfigCallback parameter to JGit API commands, to allow
consumers of the JGit command API to perform custom Transport configuration
that would be otherwise difficult to anticipate & expose on the API command
builders.


My specific use-case is configuring additional properties on SshTransport
- I need to take over the SshSessionFactory used by the transport. Using
TransportConfigCallback I can simply do this (rather than reimplement the
API command classes):

public void configure(Transport tn) {
  if (tn instanceof SshTransport) {
    ((SshTransport) tn).setSshSessionFactory(factoryProvider.get());
  }
}

Adding an explicit setSshSessionFactory() method to the JGit command
classes would bloat the API. Also, creating the replacement
SshSessionFactory is unnecessary if the transport is not SSH, but the type
of the Transport is only known once the remote has been resolved and the
URI parsed - consequently it makes sense to perform this step in a
callback, where the transport instance can be inspected to determine if
it's of a relevant type.


A note about where this leaves the API - there are now 4 commands:

CloneCommand
PullCommand
FetchCommand
PushCommand

-that share 3 identical transport-related parameters:

timeout
credentialsProvider
transportConfigurator

I think there's potential for introducing an interface or val-object to
identify/encapsulate this repetition, which I'd be happy to do in a
subsequent commit.

Change-Id: I8983c3627cdd7d7b2aeb0b6a3dadee553378b951
Signed-off-by: Roberto Tyley <roberto.tyley@gmail.com>
2011-09-16 16:04:35 +01:00
Matthias Sohn 46771e9e88 Remove use of GitIndex to detect index changes
We can detect index changes using FileSnapshot. This is more efficient
and removes usage of a deprecated class.

Change-Id: I4a679102c9a1bd8e82b9ca93eb9dbbde445e9be4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-16 00:19:39 +02:00
Matthias Sohn 19a366d532 Prepare 1.2.0 builds
Change-Id: I9ec247135d93ef28d732e94f18d0ec1d0e2e6d44
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-15 22:51:46 +02:00
Shawn O. Pearce 01888db892 UploadPack: Fix races in smart HTTP negotiation
Clients cache the set of advertised references at the start of a
negotiation, and keep replaying the same "want SHA1" list to the
server on each negotiation step.  If another client pushes into
a branch and moves it by fast-forward, any request to obtain that
branch's prior SHA-1 is still valid, the commit is reachable from
the new position of the reference.  Unfortunately the fast-forward
causes smart HTTP negotations to fail, as the server no longer is
advertising that prior SHA-1.

Instead of causing clients to fail out with a "want invalid" error
and forcing the end-user retry, possibly getting into a never ending
try-fail-retry race while other clients are pushing into the same
busy repository, allow the slightly stale want request so long as
it is still reachable.

C Git implemented this same change recently to fix races on the
smart HTTP protocol when the C Git git-http-backend is used.

The new RequestPolicy feature also allows server authors to make
an even more lenient configuration that exports any SHA-1 to the
client. This might be useful in certain settings where a server
has authenticated the client as the "repository owner" and wants
to allow them to grab any content from the server as a complete
unbroken history chain.

The new setAdvertisedRefs() method allows server authors to manually
fix the references that are advertised, possibly bypassing the
getAllRefs() call on the Repository object.

Change-Id: I7cdb563bf9c55c83653f217f6e53c3add55a0541
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-09-14 15:34:55 -07:00
Shawn O. Pearce 1b6a549ff3 PackWriter: Export more statistics
Export the shallow pack information, and also a handy function to
sum up the total times.  Include the time writing out the index file,
if it was created.

Change-Id: I7f60ae6848455a357b25feedb23743bbf6c153cf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-09-14 15:34:55 -07:00
Shawn O. Pearce 38b3816d65 Do not requeue state vector in stateless RPC fetch
If the no-done capability was enabled on the connection, don't
queue up the state vector again once the ACK %s ready message
is observed from the remote. The pack will be following in this
response stream, so the state vector is no longer required.

Change-Id: I7bd1e76957cb58c7ff1cdaeef227f1b02a7e5d24
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-09-14 15:34:55 -07:00
Shawn O. Pearce 575a80ac44 Wrap excessively long line in BasePackFetchConnection
Change-Id: I926838058c1de2146e22faa08570406600457acb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-09-14 15:34:55 -07:00
Shawn O. Pearce c1a9b2ae8b Fix smart HTTP client stream alignment errors
The client's use of UnionInputStream was broken when combined with a
8192 byte buffer used by PackParser. A smart HTTP client connection
always pushes in the execute stateless RPC input stream after the
data stream has ended from the remote peer. At the end of the pack,
PackParser asked to fill a 8192 byte buffer, but if only e.g. 1000
bytes remained UnionInputStream went to the next stream and asked
it for input, which triggered a new RPC, and failed because there
was nothing pending in the request buffer.

Change UnionInputStream to only return what it consumed from a
single InputStream without invoking the next InputStream, just in
case that second InputStream happens to be one of these magical
ones that generates an RPC invocation.

Change-Id: I0e51a8e6fea1647e4d2e08ac9cfc69c2945ce4cb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-09-14 15:34:55 -07:00
Kevin Sawicki 4005f3c693 Remove duplicate calls to DirCache.unlock on checkout
Calls to unlock the DirCache before throwing an exception
were not needed since checkout calls doCheckout wrapped
in a try block that calls DirCache.unlock in a finally
block.

Change-Id: I2b249a784f9e363430e288aad67fcefb7fac0a6e
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-09-13 15:29:55 -07:00
Matthias Sohn cc4e6109e4 Merge branch 'stable-1.1'
* stable-1.1:
  Allow commit when submodule changes are present
  Ignore submodule on checkout instead of deleting it
  cleanup: Reuse local variable for current DirCacheEntry
  Prepare post v1.1.0.201109071825-rc3 builds
  JGit v1.1.0.201109071825-rc3
  Use commit message best practices for Mylyn Commit template

Change-Id: I6ab9e5cb48c036d2ee2e548f5ec040d93672d8ad
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-11 22:43:41 +02:00
Robin Rosenberg a7d3c68015 Allow commit when submodule changes are present
We do not yet check or validate submodules, but can accept that
someone staged a change in a submodule with other tools.

Change-Id: I642ede382314bfbd1892dd509a2222885cc5350a
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-09-08 16:46:20 +02:00
Robin Rosenberg 576abf64d1 Ignore submodule on checkout instead of deleting it
The purpose of this commit is to prevent destruction of
submodules on checkout from a tree with a submodule to
another. For consistency we handle the reverse case too,
when we checkout a branch that has a submodule and the
submodule directory exists. And finally we ignore the
case where the submodule changes.

We do not update the submodules, we just try to ignore
them harder.

Bug: 356664
Change-Id: I202c695a57af99b13d0d7220803fd08def3d9b5e
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-09-08 16:46:20 +02:00
Robin Rosenberg 2bb8da0405 cleanup: Reuse local variable for current DirCacheEntry
Since we already have assigned i.getDirCacheEntry() to dce,
use dce instead.

Change-Id: I107713ad0b356516d75c29203f945b056bad3ac7
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-09-08 16:46:19 +02:00
Matthias Sohn b09d21b6eb Prepare post v1.1.0.201109071825-rc3 builds
Change-Id: I1244f6639263d156a6f9e4530167e5eb1826a535
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-08 01:50:41 +02:00
Matthias Sohn 75611a8314 JGit v1.1.0.201109071825-rc3
Change-Id: I1b989d3101272632eacabe25a0b111ad0ff5bb3b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-08 00:54:27 +02:00
Dariusz Luksza 570d862ef3 Fix IOOBE in Repository.resolveSimple()
IndexOutOfBoundException is thrown from Repository.resolveSimple() when
'-g' string is located less then 4 characters from the end of this
string.

Change-Id: I1128c2cdfec9db3023d4d0f1f40d863e84b75950
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
2011-09-06 10:12:39 +02:00
Matthias Sohn cfdb09e9db Use commit message best practices for Mylyn Commit template
We should use a template for Mylyn commit messages that matches with our
guidelines for commit messages.

http://wiki.eclipse.org/EGit/Contributor_Guide#Commit_message_guidelines

Bug: 337401
Change-Id: I05812abf0eb0651d22c439142640f173fc2f2ba0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-05 23:57:21 +02:00
Robin Rosenberg b695f66487 Fix the names in the reflog for checkouts
We were diverging from the reference implementation. Always use the
ref we checkout to as the to-branch the reflog and avoid the
refs/heads both in the from-name and to-name.

Change-Id: Id973d9102593872e4df41d0788f0eb7c7fd130c4
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-05 17:02:16 +02:00
Robin Rosenberg eadc26c0a0 Add a helper for parsing branch switch info out of a reflog entry
Change-Id: I91c7e08c4afd2562df2226887a933d93c78a0371
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-05 17:01:56 +02:00
Kevin Sawicki d110fbc300 [blame] Reset rename detector before computing renames.
Bug: 354507
Change-Id: I5e9c65a082d9dee1e87536c5cf2a8de75efa6a33
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-09-01 13:29:43 -07:00
Matthias Sohn df117d3da9 Prepare post-v1.1.0.201109011030-rc2 builds
Change-Id: I8dda83cdbe88beba4a480df9846848bf3aceb9e2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-01 17:36:10 +02:00
Matthias Sohn 384ffa7ee9 JGit v1.1.0.201109011030-rc2
Change-Id: Ie6d65fe45ad92c813ce3a227729aa43681922249
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-09-01 16:38:13 +02:00
Roberto Tyley 602b2f0d45 Tolerate zlib deflation with window size < 32Kb
JGit currently identifies loose objects as 'corrupt' if they've been
deflated using a window size less than 32Kb, because the
isStandardFormat() function doesn't recognise the header
byte as a zlib header. This patch makes the method tolerant of
all valid window sizes (15-bit to 8-bit) - but doesn't sacrifice
it's accuracy in distingushing the standard loose-object format
from the experimental (now abandoned) format. It's based on a patch
which has been merged into C-Git master branch:

https://github.com/git/git/commit/7f684a2aff636f44a506

On memory constrained systems zlib may use a much smaller window
size - working on Agit, I found that Android uses a 4KB window;
giving a header byte of 0x48, not 0x78. Consequently all loose
objects generated by the Android platform appear 'corrupt' :(

It might appear that this patch changes isStandardFormat() to the
point where it could incorrectly identify the experimental format as
the standard one, but the two criteria (bitmask & checksum) can only
give a false result for an experimental object where both of the
following are true:

1) object size is exactly 8 bytes when uncompressed (bitmask)
2) [single-byte in-pack git type&size header] * 256
   + [1st byte of the following zlib header] % 31 = 0 (checksum)

As it happens, for all possible combinations of valid object type
(1-4) and window bits (0-7), the only time when the checksum will be
divisible by 31 is for 0x1838 - ie object type *1*, a Commit - which,
due the fields all Commit objects must contain, could never be as
small as 8 bytes in size.

Given this, the combination of the two criteria (bitmask & checksum)
always correctly determines the buffer format, and is more tolerant
than the previous version.

References:

Android uses a 4KB window for deflation:
http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb2583409c28;hb=refs/heads/gingerbread#l53

Code snippet searching for false positives with the zlib checksum:
https://gist.github.com/1118177

Change-Id: Ifd84cd2bd6b46f087c9984fb4cbd8309f483dec0
2011-08-25 22:25:10 +01:00
Stefan Lay 28a4992524 Merge "Unwind loop that iterates over fetch connection refs." 2011-08-25 11:09:52 -04:00
Tomasz Zarna c35c23db8d Use JGitText.refAlreadyExists instead of "ref exists"
Change-Id: I113bcf82c6292db5269271f799d09c80acc40bcd
2011-08-24 10:17:32 +02:00
Kevin Sawicki b127fa19eb Unwind loop that iterates over fetch connection refs.
Create separate loops based on whether the ref specs
collection is empty or not.

Change-Id: I2861b45fe083675e12ff47f591e104f8cf6dd216
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-08-22 13:10:31 -07:00
Christian Halstrick 930875a81a Throw JGit exception when ResetCommand got wrong ref
If the ResetCommand should reset to a invalid ref (e.g. HEAD in a repo
whithout a single commit) it was throwing an NPE. This is fixed now by
throwing a JGitInternalExcpeption. It would be nicer if we could throw
a InvalidRefException, but this would modify our API.

Bug: 339610
Change-Id: Iffcb4f2cca9f702176471d93c3a71e5cb3e700b1
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-08-21 14:11:00 -07:00
Matt Fischer 9952223e06 Implement server support for shallow clones
This implements the server side of shallow clones only (i.e.
git-upload-pack), not the client side.

CQ: 5517
Bug: 301627
Change-Id: Ied5f501f9c8d1fe90ab2ba44fac5fa67ed0035a4
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-08-21 14:04:23 -07:00
Shawn O. Pearce a1a8c6d77e PackWriter: support excluding objects already in other packs
This can be useful when implementing garbage collection and there
are packs that should not be copied, such as huge packs that have
a sibling ".keep" file alongside of them.

Callers driving PackWriter need to initialize the list of packs not
to include objects from by passing each index to excludeObjects().

Change-Id: Id7f34df69df97be406bcae184308e92b0e8690fd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-08-21 13:59:26 -07:00
Denys Digtiar c580c56c4d Fix ClassCastException in MergeCommand
Test was added which reproduce the ClassCastException when ours or
theirs merge strategy is set to MergeCommand. Merger and MergeCommand
were updated in order to avoid exception.

Change-Id: I4c1284b4e80d82638d0677a05e5d38182526d196
Signed-off-by: Denys Digtiar <duemir@gmail.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-08-21 13:53:02 -07:00
Ketan Padegaonkar e38cf2078d Add ListTagCommand to JGit API
Bug: 355246
Change-Id: I11e019f3c19b4340ac7160ac8fcbadd52499d322
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-08-21 13:43:50 -07:00
Chris Aniszczyk bd691f162c Merge "Add DeleteTagCommand to JGit API" 2011-08-21 16:24:23 -04:00
Tomasz Zarna 5f787bfd62 Add DeleteTagCommand to JGit API
Bug: 353226
Change-Id: I54ae237cab792742333a249eb5a774d5e1775af8
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-08-21 13:18:52 -07:00
Christian Halstrick 12bb76380f Merge "PackWriter: Make want/have actual sets" 2011-08-18 03:43:38 -04:00
Kevin Sawicki a7aaf88b2e Use HEAD as default ref for RefLogCommand.
This mirrors the default command-line behavior.

Change-Id: I4f819410fa6df3064c560beb3184b61fd7bb1f15
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-08-17 08:56:16 -07:00
Matthias Sohn 148595fb54 Merge "Adds DiffEntry.scan(TreeWalk, boolean) method" 2011-08-17 07:31:54 -04:00
Dariusz Luksza 679cab9b32 Adds DiffEntry.scan(TreeWalk, boolean) method
Adds method into DiffEntry class that allows to specify whether changed
trees are included in scanning result list. By default changed trees
aren't added, but in some cases having changed tree would be useful.

Also adds check for tree count in TreeWalk and when it is different from
two it will thrown an IllegalArgumentException.

This change is required by egit
I7ddb21e7ff54333dd6d7ace3209bbcf83da2b219

Change-Id: I5a680a73e1cffa18ade3402cc86008f46c1da1f1
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-08-17 12:43:35 +02:00
Kevin Sawicki ac909ec89d Fix typo in IndexDiff.getModified() comment.
Removes an extra instance of the word 'on'.

Change-Id: Ie5f137f0dda440f5879f6d5c62ebce0431530ad7
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-08-16 13:22:42 -07:00
Shawn O. Pearce 74333e63b6 PackWriter: Make want/have actual sets
During parsing these are used with contains(). If they are a List
type, the contains operation is not efficient. Some callers such
as UploadPack often pass a List here, so convert to Set when the
type isn't efficient for contains().

Change-Id: If948ae3bf1f46e756bd2d5db14795e12ba7a6207
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-08-16 12:18:39 -07:00
Christian Halstrick 100e9429b5 Merge "Add DiffCommand to JGit API" 2011-08-16 08:25:46 -04:00
Tomasz Zarna 714a3ee151 Add DiffCommand to JGit API
Bug: 334766
Change-Id: Iea74c599a956a058608e424d0274f879bc2f064a
2011-08-16 11:47:55 +02:00
Shawn O. Pearce 2610eaf386 Revert "PackWriter: Do not delta compress already packed objects"
This reverts commit 67b064fc9f.

The "tiny optimization" introduced by 67b0 turns out to have a big
savings on wall-clock time when the object store is very slow (e.g.
the DHT support in JGit), but comes with a much bigger penalty in
space used by the output stream.  CGit packed with 67b0 enabled is
7 MiB larger than it should be (36 MiB rather than 28/29 MiB).  The
much bigger Linux kernel repository gained over 200 MiB, though some
of this may have been caused by a smaller window setting.

Revert this patch as PackWriter should be optimizing for space used
rather than time spent, since its primary use is network transfer, and
that isn't free.

Change-Id: I7413a9ef89762208159b4a1adc5a22a4c9245611
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-08-13 17:01:29 -07:00
Shawn O. Pearce 53db854185 Speed up ObjectWalk by 6235 objects/sec
The "Counting objects" phase of packing is the most time consuming
part for any server providing access to Git repositories. Scanning
through the entire project history, including every revision of
every tree that has ever existed is expensive and takes an incredible
amount of CPU time.

Inline the tree parsing logic, unroll a number of loops, and setup
to better handle the common case of seeing another occurrence of
an object that was already marked SEEN.

This change boosts the "Counting objects" phase when JGit is acting
as a server and is packing the linux-2.6 repository for its client.
Compared to CGit on the same hardware, a JGit daemon server is now
21883 objects/sec faster:

CGit:
  Counted 2058062 objects in 38981 ms at 52796.54 objects/sec
  Counted 2058062 objects in 38920 ms at 52879.29 objects/sec
  Counted 2058062 objects in 39059 ms at 52691.11 objects/sec

JGit (before):
  Counted 2058062 objects in 31529 ms at 65275.21 objects/sec
  Counted 2058062 objects in 30359 ms at 67790.84 objects/sec
  Counted 2058062 objects in 30033 ms at 68526.69 objects/sec

JGit (this commit):
  Counted 2058062 objects in 28726 ms at 71644.57 objects/sec
  Counted 2058062 objects in 27652 ms at 74427.24 objects/sec
  Counted 2058062 objects in 27528 ms at 74762.50 objects/sec

Above the first run was a "cold server". For JGit the JVM had just
started up with `jgit daemon`, and for CGit we hadn't touched the
repository "recently" (but it was certainly in kernel buffer cache).
The second and third runs were against the running JGit JVM, allowing
timing tests to better reflect the benefits of JGit's pack and index
caching, as well as any optimizations the JIT may have performed.

The timings are fair.  CGit is opening, checking and mmap'ing both
the pack and index during the timer.  JGit is opening, checking
and malloc+read'ing the pack and index data into its Java heap
during the timer. Both processes are walking the same graph space,
and are computing the "path hash" necessary to sort objects in the
object table for delta compression.  Since this commit only impacts
the "Counting objects" phase, delta compression was obviously not
included in the timings and JGit may still be performing delta
compression slower than CGit, resulting in an overall slower server
experience for clients.

Change-Id: Ieb184bfaed8475d6960a494b1f3c870e0382164a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-08-13 14:53:18 -07:00
Shawn O. Pearce 82e735c3aa Merge "PackWriter: Only search for base objects on thin packs" 2011-08-13 17:33:18 -04:00
Shawn Pearce a2eadf0b67 Merge "Ignore missing MERGE_MSG when deleting MERGE_MSG" 2011-08-13 17:08:40 -04:00
Shawn Pearce ac9d9ee96d Merge "Correct comment on CloneCommand.setRemote method." 2011-08-13 17:00:29 -04:00
Robin Rosenberg 5a3a2b53ee Merge "Fix reading of ref names containing characters that sort before /" 2011-08-10 14:40:25 -04:00
Robin Rosenberg d4db26bb57 Merge "Fix offset64 creation for objects at 2 GiB" 2011-08-10 14:39:30 -04:00
Robin Stocker 8db5b6a013 Add isSuccessful to MergeStatus, RebaseResult.Status and PullResult
This is useful when the result needs to be displayed and it's only of
interest if the operation was successful or not (in egit, it could be
used in MultiPullResultDialog).

Change-Id: Icfc9a9c76763f8a777087a1262c8d6ad251a9068
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-08-09 23:31:50 +02:00
Shawn O. Pearce 3bac5b1d7d Fix offset64 creation for objects at 2 GiB
The offset32 format is used for objects <= 2^31-1, while the offset64
format is used for all other objects.  This condition was missing
the = needed to ensure an object placed exactly at 2^31 would have
its 64 bit offset in the index.

Change-Id: I293fac0e829c9baa12cb59411dffde666051d6c5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-08-08 16:36:56 -07:00
Shawn O. Pearce 99e6cfb131 PackWriter: Only search for base objects on thin packs
A non-thin pack does not need to worry about preferred bases, the pack
will be self-contained and all required delta base objects will appear
within the pack itself. Obtaining the path buffer and length from the
ObjectWalk to build the preferred base table is "expensive", so avoid
the cost unless a thin pack is being constructed.

Change-Id: I16e30cd864f4189d4304e7957a7cd5bdb9e84528
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-08-08 15:10:11 -07:00
Shawn O. Pearce 86ecf141b6 Merge changes I58110f17,I440baa64,Ic77dcac5
* changes:
  PackWriter: Skip progress messages on fast operations
  IndexPack: Defer the "Resolving deltas" progress meter
  IndexPack: Fix "Resolving deltas" progress meter
2011-08-05 21:21:10 -04:00
Kevin Sawicki 4ae9d47130 Correct comment on CloneCommand.setRemote method.
The previous comment stated that the value set was used
to keep track of the branch in the remote repository
which was incorrect.

Updated the method comment to match the format used
for the PushCommand.setRemote and FetchCommand.setRemote
methods.

Change-Id: I11b81eb3125958af29247b485da56fd88c3bfdf5
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-08-05 11:02:10 -07:00