Commit Graph

1711 Commits

Author SHA1 Message Date
Matthias Sohn c0780bcb99 [findBugs] Silence returning null for StringUtils.toBooleanOrNull()
As the method name and its javadoc clearly state that this method can
return null we can ignore this FindBugs warning.

Change-Id: I366435e26eda5d910f5d1a907db51f08efd4bb8c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-11-16 20:55:34 +01:00
Matthias Sohn afebe7880d [findBugs] Prefer short-circuit logic as it's more performant
Change-Id: I64577f8fd19ee0d2d407479cc70e521adc367f37
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-11-16 20:54:40 +01:00
Christian Halstrick e7ec5e1473 Merge "Implement DirCacheEntry.toString() to ease debugging" 2011-11-11 03:21:29 -05:00
Robin Rosenberg ebd0a3af54 Clean up tab usage in Directory/File conflict table
Change-Id: I394fc1ef714c8465cbd5af9c73338b9a324ad9c4
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
2011-11-10 11:29:08 -08:00
Jens Baumgart 53ef3e5114 Add detection of untracked folders to IndexDiffFilter
Decorators need to know whether folders in the working tree contain only
untracked files. This change enhances IndexDiffFilter to report such
folders. This works only together with treewalks which operate in
default traversal mode. For treewalks which process entries in
postorder mode (files are walked before their parent folder is walked)
this detection doesn't work.

Bug: 359264
Change-Id: I9298d1e3ccac0aec8bbd4e8ac867bc06a5c89c9f
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
2011-11-10 11:15:26 -08:00
Carsten Pfeiffer 92752f6b50 [blame] Fix blame following renames in non-toplevel directories
Mark the treeWalk as recursive; otherwise following renames only works
for toplevel files.

Bug: 302549
Change-Id: I70867928eadf332b0942f8bf6877a3acb3828c87
Signed-off-by: Carsten Pfeiffer <carsten.pfeiffer@gebit.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
2011-11-10 11:06:03 -08:00
Kevin Sawicki da901c4968 Support a configured credentials provider in LsRemoteCommand
Refactored the three common transport configuration options:
credentials provider, timeout, and transport config callback
into a new TransportCommand base class which is now extended
by all commands that use a Transport object during execution.

Bug: 349188
Change-Id: I90c2c14fb4e3cc4712905158f9047153a0c235c2
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
2011-11-10 10:57:47 -08:00
Matthias Sohn 899a3ccf6d Implement DirCacheEntry.toString() to ease debugging
Change-Id: I9aa1b5817a18fb340411f47b25b6711d533590fd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-11-10 13:34:59 +01:00
Robin Rosenberg c0392381ee Merge changes Ibb3467f7,I2af99903
* changes:
  Always use try/finally around DfsBlockCache.clockLock
  DfsBlockCache: Fix NPE when evicting empty cell
2011-11-10 02:07:04 -05:00
Christian Halstrick 45c714456b Merge "Do not use the deprecated Tree class internally" 2011-11-09 03:40:46 -05:00
Robin Rosenberg a1c614433c Do not use the deprecated Tree class internally
Replace it with DirCache, like we did to remove GitIndex.

Change-Id: Ia354770cee5c68f19945279b34aef6de54697435
2011-11-09 09:30:54 +01:00
Robin Rosenberg 6e9fdce9b9 Kill GitIndex
A few places were still using GitIndex. Replacing it was fairly
simple, but there is a difference in test outcome in
ReadTreeTest.testUntrackedConflicts. I believe the new behavior
is good, since we update neither the index nor the worktree.

Change-Id: I4be5357b7b3139dded17f77e07a140addb213ea7
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-11-09 09:16:50 +01:00
Robin Rosenberg 83c172f0f7 Deprecate GitIndex more by using only DirCache internally.
This includes merging ReadTreeTest into DirCacheCheckoutTest and
converting IndexDiffTest to use DirCache only. The GitIndex specific
T0007GitIndex test remains.

GitIndex is deprecated. Let us speed up its demise by focusing the
DirCacheCheckout tests to using DirCache instead.

This also adds explicit deprecation comments to methods that depend
on GitIndex in Repository and TreeEntry. The latter is deprecated in
itself.

Change-Id: Id89262f7fbfee07871f444378f196ded444f2783
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-11-09 09:05:24 +01:00
Shawn O. Pearce 9652f16a47 Always use try/finally around DfsBlockCache.clockLock
Any RuntimeException or Error in this block will leave the lock
held by the caller thread, which can later result in deadlock or
just cache requests hanging forever because they cannot get to
the lock object.

Wrap everything in try/finally to prevent the lock from hanging,
even though a RuntimeException or Error should never happen in
any of these code paths.
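
The pattern this message describes can be sketched as follows. This is a minimal illustration, not the actual DfsBlockCache code: the class ClockLockSketch, its evict() method and the eviction counter are hypothetical names chosen for the example; only the lock/try/finally shape is taken from the text above.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a cache guarded by one lock; not JGit's DfsBlockCache.
class ClockLockSketch {
    private final ReentrantLock clockLock = new ReentrantLock();
    private int evictions;

    // Runs one eviction step. The lock is released even if the step throws.
    void evict(Runnable step) {
        clockLock.lock();
        try {
            step.run();
            evictions++;
        } finally {
            clockLock.unlock();
        }
    }

    boolean isLocked() {
        return clockLock.isLocked();
    }

    int evictions() {
        return evictions;
    }

    public static void main(String[] args) {
        ClockLockSketch cache = new ClockLockSketch();
        try {
            cache.evict(() -> { throw new IllegalStateException("unexpected"); });
        } catch (IllegalStateException expected) {
            // The failure still propagates, but the lock is no longer held.
        }
        System.out.println("locked after failure: " + cache.isLocked());
        cache.evict(() -> { });
        System.out.println("evictions: " + cache.evictions());
    }
}
```

Without the finally block, the first call would leave clockLock held and any other thread calling evict() would wait forever.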

Change-Id: Ibb3467f7ee4c06f617b737858b4be17b10d936e0
2011-11-08 12:24:30 -08:00
Shawn O. Pearce a6677ef28a DfsBlockCache: Fix NPE when evicting empty cell
The cache starts with a single empty Ref that has no data, as the
clock list does not support being empty. When this Ref is removed,
the size has to be decremented from the associated DfsPackKey,
which was previously null. Make it always be non-null.

Change-Id: I2af99903e8039405ea6d67f383576ffa43839cff
2011-11-08 12:23:44 -08:00
Robin Rosenberg 790ddb2983 Don't throw away the stack trace when tests fail
Most unexpected exceptions are completely useless, yielding messages
like "null" or "3" or, in the best cases, something reasonable but
still out of context.

Just declare the test as throwing an exception. That will retain
the full stack trace leading to the point of failure without using
a debugger or changing the code.
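
The difference can be sketched in plain Java. This is an illustration under assumptions: no test framework is used so that the example stays self-contained, and parseCount(), testSwallowing(), testPropagating() and the other names are hypothetical, not taken from the JGit test suite.

```java
// Hypothetical test helpers; no test framework is involved.
class StackTraceSketch {
    // Code under test that fails somewhere inside.
    static int parseCount(String s) {
        return Integer.parseInt(s);
    }

    // Old style: catch and report only the message, discarding the stack trace.
    static void testSwallowing() {
        try {
            parseCount("three");
        } catch (Exception e) {
            throw new AssertionError(e.getMessage());
        }
    }

    // New style: declare the exception and let it propagate with its full trace.
    static void testPropagating() throws Exception {
        parseCount("three");
    }

    // Reports whether the failure's stack trace still names the given method.
    static boolean traceMentions(Throwable t, String method) {
        for (StackTraceElement el : t.getStackTrace()) {
            if (el.getMethodName().equals(method)) {
                return true;
            }
        }
        return false;
    }

    // Runs one of the two tests and returns whatever it threw.
    static Throwable failureOf(boolean propagate) {
        try {
            if (propagate) {
                testPropagating();
            } else {
                testSwallowing();
            }
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        System.out.println(traceMentions(failureOf(false), "parseCount")); // false
        System.out.println(traceMentions(failureOf(true), "parseCount"));  // true
    }
}
```

In the swallowing variant the reported failure starts at the catch block, so the frame where the problem actually occurred is gone.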

Change-Id: Id2454d328d1aa665606ae002de2c3805fe7baa8e
2011-11-06 10:00:06 +01:00
Shawn Pearce 00235c77d6 Merge "Do not resolve path using cygwin unless told to" 2011-11-04 18:09:56 -04:00
Shawn Pearce 2f2c018819 Merge changes Icea2572d,I2633e472,I207c0c93,I10cee76c,Ifd78e8ce,I890b5fcc,Ia0b01f5d,Iec524abd
* changes:
  DfsBlockCache: Update hits to not include contains()
  Add a listener for changes to a DfsObjDatabase's pack files
  Expose the reverse index size in the DfsPackDescription
  Add a DfsPackFile method to get the number of cached bytes
  Expose the list of pack files in the DfsBlockCache
  Add a DFS repository description and reference it in each pack
  Clarify the docstring of DfsBlockCache.reconfigure()
  DFS: A storage layer for JGit
2011-11-04 18:06:30 -04:00
Shawn O. Pearce f3e37b5530 Merge "Refactor HTTP server stack to use Filter as base" 2011-11-04 18:05:09 -04:00
Colby Ranger f70ecabb30 DfsBlockCache: Update hits to not include contains()
Also expose the underlying hit and miss counters, in
addition to the hit ratio.

Change-Id: Icea2572d62e59318133b0a88848019f34ad70975
2011-11-04 11:14:32 -07:00
Dave Borowitz 0f8e486a4d Add a listener for changes to a DfsObjDatabase's pack files
Intended for cross-request use, so only refers to
DfsRepositoryDescriptions rather than DfsRepositorys.

Change-Id: I2633e472c9264d91d632069f608d53d4bdd0fc09
2011-11-04 11:14:32 -07:00
Dave Borowitz d55eb35106 Expose the reverse index size in the DfsPackDescription
This is analogous to the getPackSize() and getIndexSize() methods.

Change-Id: I207c0c93f9145826d84b3610eb4319fca074ee0d
2011-11-04 11:14:32 -07:00
Dave Borowitz 4fc1af6850 Add a DfsPackFile method to get the number of cached bytes
The counter is actually stored in the DfsPackKey so it can be
manipulated by the cache.

Change-Id: I10cee76c92d65c68d1aa1a9dd0c4fd7173c4cede
2011-11-04 11:14:32 -07:00
Dave Borowitz dff9d56b94 Expose the list of pack files in the DfsBlockCache
Callers may want to inspect the contents of the cache, which this allows
them to do in a read-only fashion without any locking.

Change-Id: Ifd78e8ce34e26e5cc33e9dd61d70c593ce479ee0
2011-11-04 11:14:32 -07:00
Dave Borowitz 35d72ac806 Add a DFS repository description and reference it in each pack
Just as DfsPackDescription describes a pack but does not imply it is
open in memory, a DfsRepositoryDescription describes a repository at a
basic level without it necessarily being open.

Change-Id: I890b5fccdda12c1090cfabf4083b5c0e98d717f6
2011-11-04 11:14:32 -07:00
Dave Borowitz 5a38e5b440 Clarify the docstring of DfsBlockCache.reconfigure()
The docstring was copied from the local filesystem cache code, which
actually attempted to reconfigure the cache on the fly. The DFS cache is
designed to be "reconfigured" exactly once.

Change-Id: Ia0b01f5d6b6b3d3a68d65a5c229ff67c1cede5bc
2011-11-04 11:14:32 -07:00
Shawn O. Pearce fa4cc2475f DFS: A storage layer for JGit
In practice the DHT storage layer has not been performing as well as
large scale server environments want to see from a Git server.

The performance of the DHT schema degrades rapidly as small changes
are pushed into the repository due to the chunk size being less than
1/3 of the pushed pack size.  Small chunks cause poor prefetch
performance during reading, and require significantly longer prefetch
lists inside of the chunk meta field to work around the small size.

The DHT code is very complex (>17,000 lines of code) and is very
sensitive to the underlying database round-trip time, as well as the
way objects were written into the pack stream that was chunked and
stored on the database.  A poor pack layout (from any version of C Git
prior to Junio reworking it) can cause the DHT code to be unable to
enumerate the objects of the linux-2.6 repository in a completable
time scale.

Performing a clone from a DHT stored repository of 2 million objects
takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row
for each object being cloned. This is very difficult for some DHTs to
scale; even at 5000 rows/second the lookup stage alone takes 6 minutes
(on local filesystem, this is almost too fast to bother measuring).
Some servers like Apache Cassandra just fall over and cannot complete
the 2 million lookups in rapid fire.
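
The arithmetic behind these figures can be checked with a few lines. The class and method names below are hypothetical; the inputs (2 million lookups, 5000 rows/second, 25 MiB and 223 MiB) are the numbers quoted in this message. The lookup stage works out to 400 seconds, which is between 6 and 7 minutes, and the two index figures add up to the 248 MiB mentioned in the next paragraph.

```java
// Hypothetical helper that only reproduces the arithmetic quoted in the message.
class LookupCostSketch {
    // Seconds needed to perform `lookups` row reads at `rowsPerSecond`.
    static double lookupSeconds(long lookups, long rowsPerSecond) {
        return (double) lookups / rowsPerSecond;
    }

    // Total "index" data: redundant chunk data plus the OBJECT_INDEX table.
    static long indexMiB(long redundantMiB, long objectIndexMiB) {
        return redundantMiB + objectIndexMiB;
    }

    public static void main(String[] args) {
        double seconds = lookupSeconds(2_000_000L, 5_000L);
        System.out.println("lookup stage: " + seconds + " s");
        System.out.println("lookup stage: " + (seconds / 60.0) + " min");
        System.out.println("index data: " + indexMiB(25, 223) + " MiB");
    }
}
```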

On a ~400 MiB repository, the DHT schema has an extra 25 MiB of
redundant data that gets downloaded to the JGit process, and that is
before you consider the cost of the OBJECT_INDEX table also being
fully loaded, which is at least 223 MiB of data for the linux kernel
repository.  In the DHT schema answering a `git clone` of the ~400 MiB
linux kernel needs to load 248 MiB of "index" data from the DHT, in
addition to the ~400 MiB of pack data that gets sent to the client.
This is 193 MiB more data to be accessed than the native filesystem
format, but it needs to come over a much smaller pipe (local Ethernet
typically) than the local SATA disk drive.

I also never got around to writing the "repack" support for the DHT
schema, as it turns out to be fairly complex to safely repack data in
the repository while also trying to minimize the amount of changes
made to the database, due to very common limitations on database
mutation rates.

This new DFS storage layer fixes a lot of those issues by taking the
simple approach for storing relatively standard Git pack and index
files on an abstract filesystem. Packs are accessed by an in-process
buffer cache, similar to the WindowCache used by the local filesystem
storage layer. Unlike the local file IO, there are some assumptions
that the storage system has relatively high latency and no concept of
"file handles". Instead it looks at the file more like HTTP byte range
requests, where a read channel is simply a thunk to trigger a read
request over the network.
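
A rough sketch of that idea follows. It is not the DFS code from this change: RangeReader, BlockCache and the in-memory "remote" store are hypothetical names invented for the illustration. It only shows a reader with no file handle, where every fetch is a positional range request, sitting behind a small in-process block cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class RangeReadSketch {
    // Hypothetical "read channel": no file handle, only positional range reads.
    interface RangeReader {
        byte[] read(long position, int length);
    }

    // Stand-in for remote storage; each call models one byte-range request.
    static RangeReader inMemory(byte[] file) {
        return (pos, len) -> {
            int start = (int) Math.min(pos, file.length);
            int end = (int) Math.min(pos + len, file.length);
            return Arrays.copyOfRange(file, start, end);
        };
    }

    // In-process cache of fixed-size blocks, keyed by block index.
    static final class BlockCache {
        private final RangeReader reader;
        private final int blockSize;
        private final Map<Long, byte[]> blocks = new HashMap<>();
        int remoteReads;

        BlockCache(RangeReader reader, int blockSize) {
            this.reader = reader;
            this.blockSize = blockSize;
        }

        // Returns the byte at `position`, fetching its block at most once.
        byte get(long position) {
            long index = position / blockSize;
            byte[] block = blocks.get(index);
            if (block == null) {
                block = reader.read(index * blockSize, blockSize);
                blocks.put(index, block);
                remoteReads++;
            }
            return block[(int) (position % blockSize)];
        }
    }

    public static void main(String[] args) {
        BlockCache cache = new BlockCache(inMemory("0123456789".getBytes()), 4);
        System.out.println((char) cache.get(1)); // 1
        System.out.println((char) cache.get(2)); // 2, served from the cached block
        System.out.println((char) cache.get(9)); // 9, needs a second range request
        System.out.println("remote reads: " + cache.remoteReads); // remote reads: 2
    }
}
```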

The DFS code in this change is still abstract; it does not store on
any particular filesystem, but is fairly well suited to Amazon S3
or Apache Hadoop HDFS. Storing packs directly on HDFS rather than
HBase removes a layer of abstraction, as most HBase row reads turn
into an HDFS read.

Most of the DFS code in this change was blatantly copied from the
local filesystem code. Most parts should be refactored to be shared
between the two storage systems, but right now I am hesitant to do
this due to how well tuned the local filesystem code currently is.

Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
2011-11-04 11:08:20 -07:00
Shawn Pearce 1783c8a831 Merge "Allow '\' in user names in URI-ish" 2011-11-04 13:10:47 -04:00
Robin Rosenberg afd4f3b0cf Allow '\' in user names in URI-ish
Actually this is not ok according to the RFC, but this implementation is
meant to be Git compatible. A '\' is needed when the authentication
requires or allows authentication to a Windows domain where the
user name can be specified as DOMAIN\user.
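
As an illustration only, a pattern that accepts a '\' in the user part might look like the sketch below. The regular expression and the class UserNameSketch are hypothetical and much simpler than a real URI-ish parser; they are not the pattern used by this change.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified parser; not JGit's implementation.
class UserNameSketch {
    // scheme://[user@]host[/path], where the user part may contain '\'.
    private static final Pattern URI = Pattern.compile(
            "^([a-z][a-z0-9+-]*)://(?:([^/@:]+)@)?([^/:]+)(/.*)?$");

    // Returns the user name, or null if the URI has none or does not match.
    static String userOf(String uri) {
        Matcher m = URI.matcher(uri);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(userOf("http://DOMAIN\\user@example.com/repo.git")); // DOMAIN\user
        System.out.println(userOf("http://example.com/repo.git"));              // null
    }
}
```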

Change-Id: If02f258c032486f1afd2e09592a3c7069942eb8b
2011-11-04 17:54:43 +01:00
Shawn Pearce ede88c60a5 Merge "Provide an id for submodule entries." 2011-11-04 10:14:46 -04:00
Carl Myers 85a9ab7410 Fix NPE when PATH environment variable is empty
Change-Id: Ic27d509cd5e2d6c855e7d355fc308399d9dc01c9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-11-04 14:42:12 +01:00
Kevin Sawicki 931b931ee8 Provide an id for submodule entries.
Open a repository for submodule entries that have a child .git
directory and use the resolved HEAD commit as the entry's id.

Change-Id: I68d6e127f018b24ee865865a2dd3011a0e21453c
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-11-04 08:14:53 +01:00
Shawn Pearce 2efbcb7e44 Merge "Implement Config.Entry.toString() to help debugging" 2011-11-03 16:19:17 -04:00
Shawn Pearce c2e828abd6 Merge "DirCacheEntry: accessors for cached creation time (CTIME)" 2011-11-03 16:18:43 -04:00
Kevin Sawicki 5041f738e9 Suppress unused and unchecked warnings
Change-Id: I9f51cc749f5cb9d2e3aa86874e60fca29b779565
Signed-off-by: Kevin Sawicki <kevin@github.com>
2011-11-03 11:03:01 +01:00
Marc Strapetz bf81119e62 DirCacheEntry: accessors for cached creation time (CTIME)
Change-Id: I986d5fff63ff1a86cca6bab49c744ea673fe4892
2011-10-31 17:45:51 +01:00
Shawn O. Pearce 9f81c6813a Merge "Ensure the ObjectInserter flushes after a merge" 2011-10-28 21:14:26 -04:00
Robin Rosenberg 3ceb4fac23 Do not resolve path using cygwin unless told to
The system property jgit.cygpath must be set to true in order
for cygwin's cygpath to be used to translate path from cygwin
namespace to Windows namespace.
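
An opt-in switch of this kind can be sketched as below. The property name jgit.cygpath comes from this message; how the change actually reads it is an assumption, and the class CygpathSketch is hypothetical. Boolean.getBoolean returns true only when the named system property exists and equals "true", ignoring case.

```java
// Hypothetical helper; only the property name is taken from the commit message.
class CygpathSketch {
    // True only when the JVM was started with -Djgit.cygpath=true
    // or the property was set programmatically.
    static boolean useCygpath() {
        return Boolean.getBoolean("jgit.cygpath");
    }

    public static void main(String[] args) {
        System.clearProperty("jgit.cygpath");
        System.out.println(useCygpath()); // false: translation is off by default
        System.setProperty("jgit.cygpath", "true");
        System.out.println(useCygpath()); // true: explicitly enabled
    }
}
```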

The cygwin path translation should be considered deprecated.

Bug: 353389
Change-Id: I2b5234c0ab936dac67d1e232f4cd28331bf3226d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-10-28 14:58:32 +02:00
Matthias Sohn a5f72d6b3b Implement Config.Entry.toString() to help debugging
Change-Id: I86f6359d955d39ab033848b87ed39d20378d3c1f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-10-27 22:55:59 +02:00
Shawn Pearce b6281cac01 Merge "Enable full Transport configuration for JGit API commands" 2011-10-27 10:25:18 -04:00
Christian Halstrick b42293c81d Merge "Close the repo in CloneCommandTest" 2011-10-27 08:57:06 -04:00
Shawn O. Pearce b24a61272a Ensure the ObjectInserter flushes after a merge
If this does not happen some databases may discard
objects and not make them available.

Change-Id: I347b3c3724db52c8a6c09f4804071497a3a377ab
2011-10-26 20:34:52 -07:00
Matthias Sohn 34f678643c Merge changes I488e9c97,I30f1049f,I1c088dce
* changes:
  Cosmetic adjustment of relative date format, do not display "0 months"
  Make use of the many date formatting options in the log command
  Define a utility class for handling Git date formats
2011-10-26 17:29:23 -04:00
Robin Rosenberg 57bdb04873 Cosmetic adjustment of relative date format, do not display "0 months"
Though it may seem less precise, "0 months" looks bad and the reference
Git implementation also does not display "0 months"
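
The adjustment can be sketched like this. The exact wording and separators of the real format are assumptions, and RelativeDateSketch with its yearsAndMonths() method is a hypothetical helper; the only point illustrated is that a zero month count is left out.

```java
// Hypothetical formatter; output wording is an assumption, not JGit's format.
class RelativeDateSketch {
    // Formats an age given in whole months, omitting the month part when it is zero.
    static String yearsAndMonths(int totalMonths) {
        int years = totalMonths / 12;
        int months = totalMonths % 12;
        String y = years + (years == 1 ? " year" : " years");
        if (months == 0) {
            return y + " ago";
        }
        String m = months + (months == 1 ? " month" : " months");
        return y + ", " + m + " ago";
    }

    public static void main(String[] args) {
        System.out.println(yearsAndMonths(24)); // 2 years ago
        System.out.println(yearsAndMonths(26)); // 2 years, 2 months ago
        System.out.println(yearsAndMonths(13)); // 1 year, 1 month ago
    }
}
```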

Change-Id: I488e9c97656f9941788ae88d7c5c1562ab6c26f0
2011-10-26 23:15:28 +02:00
Robin Rosenberg 6baf0cb956 Make use of the many date formatting options in the log command
Change-Id: I30f1049fce086f2cf7e39ba3ad8b335df3a7b827
2011-10-26 23:15:24 +02:00
Robin Rosenberg 96b801f02b Close the repo in CloneCommandTest
The test failed on Windows only

Change-Id: Ibff5308b33deb73570626a08a04e86ad8f418023
2011-10-26 22:59:39 +02:00
Matthias Sohn 66cb4ac902 Merge "Allow detecting which files were renamed during a revwalk" 2011-10-26 16:18:21 -04:00
Carsten Pfeiffer 98d4bd6d36 Allow detecting which files were renamed during a revwalk
The egit history view shows the files associated with a commit by using
a PathFilter. When following renames with a FollowFilter, the PathFilter
cannot be configured anymore because the affected files are simply not
known.

Thus, it should be possible to get to know which files are renamed.

Bug: 302549
Change-Id: I4761e9f5cfb4f0ef0b0e1e38991401a1d5003bea
2011-10-25 09:22:11 +02:00
Robin Rosenberg 63bb6ff06c Fix compatibility breakage for SystemReader
Introducing a new abstract method is not nice when one
expects others to subclass the class. Create default implementations
so old code that implements SystemReader does not break.
The default methods just delegate to the JVM.
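
The compatibility concern can be sketched as follows. ReaderSketch, OldReader and the method names are hypothetical and do not mirror the real SystemReader signatures; the example only shows why a concrete default that delegates to the JVM keeps existing subclasses compiling, where a new abstract method would not.

```java
import java.util.TimeZone;

// Hypothetical base class; not the real SystemReader.
abstract class ReaderSketch {
    // Present from the start: every existing subclass already implements it.
    abstract String getHostname();

    // Added later with a default body that delegates to the JVM,
    // so subclasses written before it existed still compile.
    TimeZone getTimeZone() {
        return TimeZone.getDefault();
    }
}

// "Old" subclass written before getTimeZone() was introduced.
class OldReader extends ReaderSketch {
    @Override
    String getHostname() {
        return "localhost";
    }

    public static void main(String[] args) {
        ReaderSketch reader = new OldReader();
        System.out.println(reader.getHostname());
        System.out.println(reader.getTimeZone().getID());
    }
}
```

Had getTimeZone() been declared abstract, OldReader would no longer compile until its author added an implementation.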

Change-Id: I42cdfdcb6b29f7203697a23833dca85185b0b9b3
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-10-23 22:53:17 +02:00
Robin Rosenberg f4460dda97 Define a utility class for handling Git date formats
Besides the formats known by git-log(1) we also add "locale"
and "localelocal", which format dates according to the user's locale.
"locale" does not translate into the local timezone, while
localelocal does.
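
The distinction between the two formats can be sketched with the standard java.text API. This is not the utility class added by this change: LocaleDateSketch and its format() method are hypothetical, and the choice of DateFormat.DEFAULT styles is an assumption.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not the date utility class from this change.
class LocaleDateSketch {
    // Formats `when` with the conventions of `locale`, expressed in `zone`.
    static String format(Date when, TimeZone zone, Locale locale) {
        DateFormat f = DateFormat.getDateTimeInstance(
                DateFormat.DEFAULT, DateFormat.DEFAULT, locale);
        f.setTimeZone(zone);
        return f.format(when);
    }

    public static void main(String[] args) {
        Date when = new Date(0L);
        TimeZone authorZone = TimeZone.getTimeZone("GMT+09:00");
        Locale user = Locale.getDefault();
        // "locale": user's locale, but the time zone recorded with the date.
        System.out.println(format(when, authorZone, user));
        // "localelocal": user's locale and the viewer's own time zone.
        System.out.println(format(when, TimeZone.getDefault(), user));
    }
}
```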

Change-Id: I1c088dcec992c107e43f6c17be4ac9ed6eb428bf
2011-10-23 01:51:30 +02:00