Commit Graph

456 Commits

Author SHA1 Message Date
Mathias Kinzler b7388637d8 Fix missing Configuration Change eventing
Configuration change events were not being triggered, now they are
forwarded from the FileConfig up to the Repository's listeners.

Change-Id: Ida94a59f5a2b7fa8ae0126e33c13343275483ee5
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-19 11:36:56 -07:00
Christian Halstrick 75c9b24385 Enhance MergeResult to report conflicts, etc
The MergeResult class is enhanced to report more data about a
three-way merge. Information about conflicts and the base, ours,
theirs commits can be retrived.

Change-Id: Iaaf41a1f4002b8fe3ddfa62dc73c787f363460c2
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-19 12:16:39 -05:00
Chris Aniszczyk 94ba9574cd Allow for optional tagger and message in Tag
We should be more lenient when tagging without an
tagger or message. Currently, we will throw an NPE
which is incorrect behavior.

Change-Id: I04e30ce25a9432e4ca56c3f29658ecb24fb18d24
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-18 19:22:42 -07:00
Chris Aniszczyk 6c9d82b4ce Remove getter and setter for author in Tag
There was a duplicated getter and setter for tagger in Tag.
There's no needed to have two getters and setters that represent
the same things. The appropriate tests were updated also.

Change-Id: If46dc00c4c0f31ea4234c6d3bda3c03e6ebbafac
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-18 20:58:25 -05:00
Christian Halstrick e02b68a8b7 added resetIndex() to RepositoryTestCase
Added a utility method to set the reset an index to match exactly
some content in the filesystem. This can be used by tests to prepare
commits in the working-tree and set the index in one shot.

[sp: Cleaned up formatting, added getEntryFile(), released inserter.]

Change-Id: If38b1f7cacaaf769f51b14541c5da0c1e24568a5
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-18 16:45:56 -07:00
Robin Rosenberg a85c08e1c8 Do not trigger RefsChangedEvent on the first attempt to read a ref
Such events make no sense, it has never been visible to this
process so no client can have a stale value of the ref.

Change-Id: Iea3a5035b0a1410b80b09cf53387b22b78b18018
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-08-18 16:32:45 -07:00
Ketan Padegaonkar 376acfb6db Add FileRepository(String) convenience constructor
Add a convenience API in FileRepository to pass in a String that
points to the GIT_DIR location.  This is converted to a File and
sent through the usual constructor.

Change-Id: I588388f37e89b8c690020f110a1bc59f46170c40
2010-08-18 16:30:20 -07:00
Matthias Sohn 2d3a806271 Backout RevObject's object-identity based equals implementation
This restores the transitivity and symmetry properties of the equals
methods on the AnyObjectId type hierarchy as defined in [1]. 
Following [2] we declare these equals methods final to ensure that 
semantics of equals are consistent across AnyObjectId's type hierarchy.

[1] http://download-llnw.oracle.com/javase/6/docs/api/java/lang/Object.html#equals(java.lang.Object)
[2] http://www.angelikalanger.com/Articles/JavaSolutions/SecretsOfEquals/Equals.html

Bug: 321502
Change-Id: Ibace21fa268c4aa15da6c65d42eb705ab1aa24b3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-15 23:12:58 +02:00
Shawn Pearce 8d761febc3 Merge "Fix RevCommitList to work with subclasses of RevWalk" 2010-08-12 19:48:44 -04:00
Matthias Sohn 35b01dac4c Fix RevCommitList to work with subclasses of RevWalk
Bug: 321502
Change-Id: Ic4bc49a0da90234271aea7c0a4e344a1c3620cfc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-13 01:47:17 +02:00
Jens Baumgart cd1141cd45 Improve IndexDiff performance
Exclude ignored files from IndexDiff tree walk.
This makes EGit commit much faster.

Change-Id: I398499510c22c37667b7612db32eac3b31d325f0
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-12 11:43:02 -05:00
Chris Aniszczyk cfe88d32a3 Merge "Hide Maven target directories from Eclipse" 2010-08-11 11:17:18 -04:00
Mathias Kinzler fe76b41038 TransportHttp does not honor timeout setting
This can result in an infinitely hanging IDE.

Change-Id: I669bc8d220a07011a42edf79de31825305ff3763
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-08-10 15:58:21 +02:00
Jens Baumgart 9a6a433576 Fix NPE on commit in empty Repository
NPE occured when committing in an empty repository.

Bug: 321858
Change-Id: Ibddb056c32c14c1444785501c43b95fdf64884b1
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-08-09 11:14:38 +02:00
Robin Rosenberg db4c516f67 Hide Maven target directories from Eclipse
Change-Id: I64f12a35423a90ced9c9bc83f6869d8ed766dd35
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-08-08 13:16:53 +02:00
Shawn O. Pearce 09130b8731 Merge branch 'rename-bug'
* rename-bug:
  Fix ArrayIndexOutOfBounds on non-square exact rename matrix

Conflicts:
	org.eclipse.jgit/src/org/eclipse/jgit/diff/RenameDetector.java

Change-Id: Ie0b8dd3e1ec174f79ba39dc4706bb0694cc8be29
2010-08-06 09:48:50 -07:00
Shawn O. Pearce e2f5716c94 Fix ArrayIndexOutOfBounds on non-square exact rename matrix
If the exact rename matrix for a particular ObjectId isn't square we
crashed with an ArrayIndexOutOfBoundsException because the matrix
entries were encoded backwards.  The encode function accepts the
source (aka deleted) index first, not second.  Add a unit test to
cover this non-square case to ensure we don't have this regression
in the future.

Change-Id: I5b005e5093e1f00de2e3ec104e27ab6820203566
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-06 09:39:10 -07:00
Shawn O. Pearce 8e9cc826e9 Merge changes I39bfefee,I47795987,I70d120fb,I58cc5e01,I96bee7b9
* changes:
  Enable configuration of non-standard pack settings
  Pass PackConfig down to PackWriter when packing
  Simplify UploadPack use of options during writing
  Move PackWriter configuration to PackConfig
  Allow PackWriter callers to manage the thread pool
2010-08-05 21:11:31 -04:00
Shawn O. Pearce 97e93ca1ea Merge "Remove static progress task names from PackWriter" 2010-08-05 21:11:30 -04:00
Chris Aniszczyk b69900a415 Merge "Add "all" parameter to the commit Command" 2010-08-05 15:02:59 -04:00
Chris Aniszczyk ad4274abcc Merge "Add the parameter "update" to the Add command" 2010-08-05 15:01:53 -04:00
Stefan Lay 4b464ed458 Allow to replace existing Change-Id
It is useful to be able to replace an existing Change-Id
in the message, for example if the user decides not to
amend the previous commit.

Bug: 321188
Change-Id: I594e7f9efd0c57d794d2bd26d55ec45f4e6a47fd
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-05 12:23:38 -05:00
Shawn O. Pearce 60c5939b23 Rename getOldName,getNewName to getOldPath,getNewPath
TreeWalk calls this value "path", while "name" is the stuff after the
last slash.  FileHeader should do the same thing to be consistent.
Rename getOldName to getOldPath and getNewName to getNewPath.

Bug: 318526
Change-Id: Ib2e372ad4426402d37939b48d8f233154cc637da
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-04 11:00:07 -07:00
Shawn O. Pearce 7514a6dbdc Merge branch 'js/diff'
* js/diff:
  Fixed bug in scoring mechanism for rename detection
2010-08-04 10:59:35 -07:00
Jeff Schumacher e64cb03065 Fixed bug in scoring mechanism for rename detection
A bug in rename detection would cause file scores to be wrong. The
bug was due to the way rename detection would judge the similarity
between files. If file A has three lines containing 'foo', and file
B has 5 lines containing 'foo', the rename detection phase should
record that A and B have three lines in common (the minimum of the
number of times that line appears in both files). Instead, it would
choose the the number of times the line appeared in the destination
file, in this case file B. I fixed the bug by having the
SimilarityIndex instead choose the minimum number, as it should. I
also added a test case to verify that the bug had been fixed.

Change-Id: Ic75272a2d6e512a361f88eec91e1b8a7c2298d6b
2010-08-04 10:56:19 -07:00
Jens Baumgart 3ba1c7c068 Add gitignore support to IndexDiff and use TreeWalk
IndexDiff was re-implemented and now uses TreeWalk instead
of GitIndex. Additionally, gitignore support and retrieval of
untracked files was added.

Change-Id: Ie6a8e04833c61d44c668c906b161202b200bb509
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-04 10:03:20 -05:00
Stefan Lay ab57af08e8 Add "all" parameter to the commit Command
When the add parameter is set all modified and deleted files
are staged prior to commit.

Change-Id: Id23bc25730fcdd151386cd495a7cdc0935cbc00b
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-08-04 13:53:08 +02:00
Stefan Lay fa7d9ac5b8 Add the parameter "update" to the Add command
This change is mainly done for a subsequent commit
which will introduce the "all" parameter to the Commit
command.

Bug: 318439
Change-Id: I85a8a76097d0197ef689a289288ba82addb92fc9
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-08-04 13:36:45 +02:00
Christian Halstrick 94207f0a43 Make use of Repository.writeMerge...()
The CommitCommand should not use java.io to delete MERGE_HEAD and MERGE_MSG
files since Repository already has utility methods for that.

Change-Id: If66a419349b95510e5b5c2237a91f06c1d5ba0d4
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-07-29 15:12:14 +02:00
Christian Halstrick fba2437111 Merge "Fix tag sorting in PlotWalk" 2010-07-28 17:13:27 -04:00
Shawn O. Pearce 5f5da8b1d4 Enable configuration of non-standard pack settings
For daemons we might want to disable delta compression entirely, or
in some strange case an administrator might need to turn of delta
reuse.  Expose these normally internal pack settings through the pack
configuration section.

Change-Id: I39bfefee8384c864cc04ffac724f197240c8a11a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Shawn O. Pearce 9fbce904e6 Pass PackConfig down to PackWriter when packing
When we are creating a pack the higher level application should be able
to override the PackConfig used, allowing it to control the number of
threads used or how much memory is allocated per writer.

Change-Id: I47795987bb0d161d3642082acc2f617d7cb28d8c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Shawn O. Pearce bb99ec0aa0 Simplify UploadPack use of options during writing
We only use these variables once, so just put them at the proper
use site and avoid assigning the local variable.  The code is a
bit shorter and the intent is a little bit more clear.

Change-Id: I70d120fb149b612ac93055ea39bc053b8d90a5db
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Shawn O. Pearce 1a06179ea7 Move PackWriter configuration to PackConfig
This refactoring permits applications to configure global per-process
settings for all packing and easily pass it through to per-request
PackWriters, ensuring that the process configuration overrides the
repository specific settings.

For example this might help in a daemon environment where the server
wants to cap the resources used to serve a dynamic upload pack
request, even though the repository's own pack.* settings might be
configured to be more aggressive.  This allows fast but less bandwidth
efficient serving of clients, while still retaining good compression
through a cron managed `git gc`.

Change-Id: I58cc5e01b48924b1a99f79aa96c8150cdfc50846
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Mathias Kinzler 6e59e6dab9 Meaningful error message when trying to check-out submodules
Currently, a NullPointerException occurs in this case. We should
instead throw a more meaningful Exception with a proper message.
This is a very "stupid" implementation which simply checks for
the existence of a ".gitmodules" file.

Bug: 300731
Bug: 306765
Bug: 308452
Bug: 314853
Change-Id: I155aa340a85cbc5d7d60da31dba199fc30689b67
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-07-28 11:59:07 -07:00
Christian Halstrick 08c0c5d938 Fix unit tests under windows
the following tests fail under windows because certain inputstreams
are not closed and files cannot be deleted because of that.  The
main problem I found is UnpackedObject.InflaterInputStream.close().
This method may throw exceptions found by checkValidEndOfStream()
but doesn't call super.close() before leaving. It is not clear to me
which resources a close() method should release before it throws an
exception. But those reseources which are not published to the
outside and which therefore cannot be closed by other means have to
be closed in all cases.
I changed the close() method to call super.close() under all
circumstances.

failing tests:
  testStandardFormat_LargeObject_TruncatedZLibStream(org.eclipse.jgit.storage.file.UnpackedObjectTest)
  testStandardFormat_LargeObject_TrailingGarbage(org.eclipse.jgit.storage.file.UnpackedObjectTest)
  testPackFormat_SmallObject(org.eclipse.jgit.storage.file.UnpackedObjectTest)

Change-Id: Id2e609a29e725aad953ff9bd88af6381df38399d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-07-28 11:55:11 -07:00
Shawn O. Pearce d0f8d1e819 Fix tag sorting in PlotWalk
By deferring tag sorting until the commit is produced by the walker
we can avoid an infinite loop that was triggered by trying to sort
tags while allocating a commit.  This also avoids needing to look
at commits which aren't going to be produced in the result.

Bug: 321103
Change-Id: I25acc739db2ec0221a50b72c2d2aa618a9a75f37
Reviewed-by: Mathias Kinzler <mathias.kinzler@sap.com>
Reviewed-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 11:51:17 -07:00
Shawn O. Pearce 21f76c2a69 Remove static progress task names from PackWriter
These need to be dynamic based on the current thread's environment
at time of execution in order to be properly localized for the end
user that will be seeing these messages.

Change-Id: I4976f462cfe606edd2761c0e36b2f6b20f63d53c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 10:50:28 -07:00
Shawn O. Pearce 1b783d0370 Allow PackWriter callers to manage the thread pool
By permitting the caller of PackWriter to select the Executor it
uses for task execution, we give the caller the ability to manage
the lifecycle of the thread pool, including reusing it across
concurrent pack generators.

This is the first step to supporting application thread pools
within Daemon or another managed service like Gerrit Code Review.

Change-Id: I96bee7b9c30ff9885f2bd261d0b6daaac713b5a4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 10:50:28 -07:00
Christian Halstrick 74d279fbf0 Teach NameConflictTreeWalk to report DF conflicts
Add a method isDirectoryFileConflict() to NameConflictTreeWalk which
tells whether the current path is part of a directory/file conflict.

Change-Id: Iffcc7090aaec743dd6f3fd1a333cac96c587ae5d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 17:26:31 +02:00
Mathias Kinzler 51c6f513b0 Stack Overflow in EGit History View
This is caused by a recursion in PlotWalk.getTags().
As a hotfix, the sort was simply removed. The sort
must be re-implemented so that parseAny() is not called
again (currently, this happens in the PlotRefComparator).

Change-Id: I060d26fda8a75ac803acaf89cfb7d3b4317328f3
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-07-28 11:46:05 +02:00
Jeff Schumacher 396fe6da45 Break dissimilar file pairs during diff
File pairs that are very dissimilar during a diff were not being
broken apart into their constituent ADD/DELETE pairs. The leads to
sub-optimal rename detection. Take, for example, this situation:

A file exists at src/a.txt containing "foo". A user renames src/a.txt
to src/b.txt, then adds a new src/a.txt containing "bar".

Even though the old a.txt and the new b.txt are identical, the
rename detection algorithm would not detect it as a rename since
it was already paired in a MODIFY. I added code to split all
MODIFYs below a certain score into their constituent ADD/DELETE
pairs. This allows situations like the one I described above to be
more correctly handled.

Change-Id: I22c04b70581f206bbc68c4cd1ee87a1f663b418e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 18:13:32 -07:00
Christian Halstrick f56a459966 Add methods which write MERGE_HEAD and MERGE_MSG
Add methods to the Repository class which write into MERGE_HEAD
and MERGE_MSG files. Since we have the read methods in the same
class this seems to be the right place.

Change-Id: I5dd65306ceb06e008fcc71b37ca3a649632ba462
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 11:48:23 -07:00
Jens Baumgart db82b8d7eb Fix concurrent read / write issue in LockFile on Windows
LockFile.commit fails if another thread concurrently reads
the base file. The problem is fixed by retrying the rename
operation if it fails.

Change-Id: I6bb76ea7f2e6e90e3ddc45f9dd4d69bd1b6fa1eb
Bug: 308506
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-07-27 10:00:47 -07:00
Robin Stocker a00377a7e2 Fix Javadoc warnings
There were some broken links, incorrect uses of @value, an invalid
tag and an outdated comment.

Change-Id: I22886bcc869a4b62bd606ebed40669f7b4723664
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 09:40:01 -07:00
Shawn O. Pearce 80fe789690 Make forPath(ObjectReader) variant in TreeWalk
This simplifies the logic for those who already have an ObjectReader
on hand want to reuse it to lookup a single path.

Change-Id: Ief17d6b2a0674ddb34bbc9f43121b756eae960fb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 08:36:24 -07:00
Shawn O. Pearce 7ff18f3ec9 Make StoredConfig an abstraction above FileBasedConfig
This exposes a load and save method, allowing a Repository to denote
that it has a persistent configuration of some kind which can be
accessed by the application, without needing to know exact details
of how its stored .

Change-Id: I7c414bc0f975b80f083084ea875eca25c75a07b2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-26 16:50:11 -07:00
Shawn O. Pearce fa9b225e06 Merge branch 'delta'
* delta: (103 commits)
  Discard the uncompressed delta as soon as its compressed
  Honor pack.windowlimit to cap memory usage during packing
  Honor pack.threads and perform delta search in parallel
  Cache small deltas during packing
  Implement delta generation during packing
  debug-show-packdelta:  Dump a pack delta to the console
  Initial pack format delta generator
  Add debugging toString() method to ObjectToPack
  Make ObjectToPack clearReuseAsIs signal available to subclasses
  Correctly classify the compressing objects phase
  Refactor ObjectToPack's delta depth setting
  Configure core.bigFileThreshold into PackWriter
  Add doNotDelta flag to ObjectToPack
  Add more configuration options to PackWriter
  Save object path hash codes during packing
  Add path hash code to ObjectWalk
  Add getObjectSize to ObjectReader
  Allow TemporaryBuffer.Heap to allocate smaller than 8 KiB
  Define a constant for 127 in DeltaEncoder
  Cap delta copy instructions at 64k
  ...

Conflicts:
	org.eclipse.jgit.pgm/src/org/eclipse/jgit/pgm/Diff.java
	org.eclipse.jgit/resources/org/eclipse/jgit/JGitText.properties
	org.eclipse.jgit/src/org/eclipse/jgit/JGitText.java
	org.eclipse.jgit/src/org/eclipse/jgit/revwalk/RewriteTreeFilter.java

Change-Id: I7c7a05e443a48d32c836173a409ee7d340c70796
2010-07-22 14:56:34 -07:00
Stefan Lay ab062caa22 Allow client of Add command to set a WorkingTreeIterator
This is e.g. useful when a client of the AddCommand has
additional rules to ignore files. In Eclipse a resource can
be set to derived or be excluded by preferences.

Change-Id: I6c47e54a1ce26315faf5ed0723298ad2c2db197c
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-22 14:57:00 +02:00
Stefan Lay 88957f6c5a Allow for filepattern "." in AddCommand
Enable adding on repository root level.

Change-Id: I415b10dc74cc9435578424d9f106c972fd703055
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-22 14:27:35 +02:00
Stefan Lay aa86cfc339 Do not add ignored files in Add command
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-22 11:26:04 +02:00
Shawn O. Pearce 09910ffa32 Move ignore node handling into WorkingTreeIterator
The working tree iterator has perfect knowledge of the path structure
as well as immediate information about whether or not an ignore file
even exists at this level.  We can exploit that to simplify the
logic and running time for testing ignored file status by pushing
all of the checks down into the iterator itself.

Change-Id: I22ff534853e8c5672cc5c2d9444aeb14e294070e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Charley Wang <chwang@redhat.com>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Stefan Lay <stefan.lay@sap.com>
CC: Matthias Sohn <matthias.sohn@sap.com>
2010-07-21 10:34:08 -07:00
Shawn Pearce 0ec0e21fdf Merge "Fix concurrent read / write issue in GitIndex on Windows" 2010-07-21 13:08:01 -04:00
Jens Baumgart e99c48a61a Fix concurrent read / write issue in GitIndex on Windows
GitIndex.write fails if another thread concurrently reads
the index file. The problem is fixed by retrying the rename
operation if it fails.

Bug: 311051
Change-Id: Ib243d2a90adae312712d02521de4834d06804944
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-07-21 09:35:15 +02:00
Christian Halstrick 5c94321b47 Check for racy git in WorkingTreeIterator
The WorkingTreeIterator has a method to check whether
the current file differs from the corresponding index
entry. This commit improves this check to also handle
racy git situations.

See http://git.kernel.org/?p=git/git.git;a=blob;f=Documentation/technical/racy-git.txt;hb=HEAD

Change-Id: I3ad0897211dcbb2eac9eebcb19d095a5052fb06b
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 21:55:18 +02:00
Christian Halstrick c98d97731b Smudge racily clean index entries by truncating length (like git.git)
To mark an entry racily clean we set its length to 0 (like native git
does). Entries which are not racily clean and have zero length can be
distinguished from racily clean entries by checking P_OBJECTID
against the SHA1 of empty content. When length is 0 and P_OBJECTID is
different from SHA1 of empty content we know the entry is marked
racily clean.

See http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg00488.html

Change-Id: I689552931441ab51964b430b303160c9126b66af
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 21:54:36 +02:00
Shawn O. Pearce 938943d674 Use proper constants for .gitignore and .git directory
We have a constant for .gitignore, so use it.  While we are in
the same method, correct the reference of ".git" to be the actual
GIT_DIR given.  This might not be within the work tree if the
GIT_DIR and GIT_WORK_TREE environment variables were used.

Change-Id: I38e1cec13405109b9c347858b38dd9fb2f1f2560
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Charley Wang <chwang@redhat.com>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Stefan Lay <stefan.lay@sap.com>
CC: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 09:11:39 -07:00
Shawn O. Pearce c59db09bc5 Remove gitIgnoreTimestamp from abstract iterator API
This never should have been exposed on the top of the
AbstractTreeIterator type hierarchy.  There is no concept of a
timestamp in a canonical tree read from the object database, and
the time in the DirCache isn't what we want here either.

Actually all that we need is to find the files whose names are
".gitignore" and are below the root directory.  We can accomplish
that with a suffix filter, and process them immediately.

Change-Id: Ib09cbf81a9e038452ce491385c65498312e2916b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Charley Wang <chwang@redhat.com>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Stefan Lay <stefan.lay@sap.com>
CC: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 09:09:01 -07:00
Shawn O. Pearce 395d236058 Fix NPE in RenameDetector
If we have two adds of the same object but no deletes the detector
threw an NPE because the entry that came back from the deleted map
was null (no matching objects).  In this case we need to put the
adds all back onto the list of left over additions since they did
not match a delete.

Change-Id: Ie68fbe7426b4dc0cb571a08911c7adbffff755d5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Jeffrey Schumacher" <jeffschu@google.com>
2010-07-20 07:52:35 -07:00
Shawn O. Pearce b518189b5c IndexPack: Fix spurious pack file corruption errors
We didn't correctly handle the zlib trailer for an object.  If the
trailer bytes were outside of the current buffer window but we had
fully inflated the object itself, we broke out of the loop (as we had
our target size) but inflate wasn't finished (as it did not yet get
the trailer) so we failed the test and threw a corruption exception.

Use an infinite loop and only break out when the inflater is done.

Change-Id: I7c9bbbeb577a990d9bc56a50ebd485935460f6c8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-20 07:40:48 -07:00
Shawn O. Pearce 12fe0f2d1e Discard the uncompressed delta as soon as its compressed
The DeltaCache will most likely need to copy the compressed delta
into a new buffer in order to compact away the wasted space at the
end caused by over allocation.  Since we don't need the uncompressed
format anymore, null out our only reference to it so the GC can
reclaim this memory if it needs to perform a collection in order
to satisfy the cache's allocation attempt.

Change-Id: I50403cfd2e3001b093f93a503cccf7adab43cc9d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-16 10:41:09 -07:00
Shawn O. Pearce 6e155d5f41 Merge branch 'js/rename'
* js/rename:
  Implemented file path based tie breaking to exact rename detection
  Added more test cases for RenameDetector
  Added very small optimization to exact rename detection
  Fixed Misleading Javadoc
  Added file path similarity to scoring metric in rename detection
  Fixed potential div by zero bug
  Added file size based rename detection optimization
  Create FileHeader from DiffEntry
  log: Implement --follow
  Cache the diff configuration section
  log: Add whitespace ignore options
  Format submodule links during differences
  Redo DiffFormatter API to be easier to use
  log, diff: Add rename detection support
  Implement similarity based rename detection
  Added a preliminary version of rename detection
  Refactored code out of FileHeader to facilitate rename detection
2010-07-16 10:22:15 -07:00
Shawn O. Pearce 0b46e70155 Fix infinite loop in IndexPack
A programming error using the Inflater API led to an infinite
loop within IndexPack, caused by the Inflater returning 0 from
the inflate() method, but it didn't want more input.  This happens
when it has reached the end of the stream, or has reached a spot
asking for an external dictionary.  Such a case is a failure for us,
and we should abort out.

Thanks to Alex for pointing out that we had 3 implementations of
the inflate rountine, which should be consolidated into one and
use a switch to determine where to load data from.

Bug: 317416
Change-Id: I34120482375b687ea36ed9154002d77047e94b1f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-16 10:12:04 -07:00
Jeff Schumacher 31311cacfd Implemented file path based tie breaking to exact rename detection
During the exact rename detection phase in RenameDetector, ties were
resolved on a first-found basis. I added support for file path based
tie breaking during that phase. Basically, there are four situations
that have to be handled:

One add matching one delete:
	In this simple case, we pair them as a rename.

One add matching many deletes:
	Find the delete whos path matches the add the closest, and
	pair them as a rename.

Many adds matching one delete:
	Similar to the above case, we find the add that matches the
	delete the closest, and pair them as a rename. The other adds
	are marked as copies of the delete.

Many adds matching many deletes:
	Build a scoring matrix similar to the one used for content-
	based matching, scoring instead by file path. Some of the
	utility functions in SimilarityRenameDetector are used in
	this case, as we use the same encoding scheme. Once the
	matrix is built, scan it for the best matches, marking them
	as renames. The rest are marked as copies.

I don't particularly like the idea of using utility functions right
out of SimilarityRenameDetector, but it works for the moment. A later
commit will likely refactor this into a common utility class, as well
as bringing exact rename detection out of RenameDetector and into a
separate class, much like SimilarityRenameDetector.

Change-Id: I1fb08390aebdcbf20d049aecf402a36506e55611
2010-07-16 09:56:42 -07:00
Christian Halstrick b840ed0121 Added dirty-detection to WorkingTreeIterator
Added possibility to compare the current entry of a WorkingTreeIterator
to a given DirCacheEntry. This is done to detect whether an entry
in the index is dirty or not. 'Dirty' means that the file in the working tree
is different from what's in the index. Merge algorithms will make use of
this to detect conflicts.

Change-Id: I3ff847f4bf392553dcbd6ee236c6ca32a13eedeb
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-16 10:08:52 +02:00
Shawn Pearce 19473b1dbc Merge "Handle the tilde notation (~user) of git url" 2010-07-15 17:29:21 -04:00
Robin Rosenberg 845714158a Handle the tilde notation (~user) of git url
When the path is prefixed with ~ the URI parser thought about this
as /~. Strip the / if the next character is the tilde.

Bug: 307017
Change-Id: I58203e5617956b46d83e8987d1f8042beddffac3
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-07-15 01:16:09 +02:00
Stefan Lay 233e0130b5 Git Porcelain API: Add Command
The new Add command adds files to the Git Index. 
It  uses the DirCache to access the git index. It 
works also in case of an existing conflict. 

Fileglobs (e.g. *.c) are not yet supported. 

The new Add command does add ignored files because
there is no gitignore support in jgit yet.

Bug: 318440
Change-Id: If16fdd4443e46b27361c2a18ed8f51668af5d9ff
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-14 11:24:58 +00:00
Shawn Pearce 0ef99921fa Merge changes I104cd62f,I1d0238b4
* changes:
  Internationalize RepositoryState descriptions
  Say that commit is allowed during bisect
2010-07-13 20:36:25 -04:00
Charley Wang b878cdcf6b Add compatibility with gitignore specifications
This patch adds ignore compatibility to jgit. It encompasses
exclude files as well as .gitignore. Uses TreeWalk and
FileTreeIterator to find nodes and parses .gitignore
files when required. The patch includes a simple cache that
can be used to save results and avoid excessive gitignore
parsing.

CQ: 4302
Bug: 303925
Change-Id: Iebd7e5bb534accca4bf00d25bbc1f561d7cad11b
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-13 00:34:15 +02:00
Jeff Schumacher bc08fafb41 Added very small optimization to exact rename detection
Optimized a small loop in findExactRenames. The loop would go through
all the items in a list of DiffEntries even after it already found
what it was looking for. I made it break out of the loop as soon as
a good match was found.

Change-Id: I28741e0c49ce52d8008930a87cd1db7037700a61
2010-07-12 12:54:01 -07:00
Jeff Schumacher a20e6f6fec Fixed Misleading Javadoc
The javadoc for the setRenameLimit method in RenameDetector said
that you could only have limits in the range (0,100), implying
that 0 and 100 were illegal inputs. The code, however, allowed 0 and
100. I changed the javadoc to say that the range [0,100] was legal.
I also documented the IllegalArgumentException that is thrown if the
limit is outside that range.

Change-Id: I916838f254859f6f0e1516bb55b8e7dc87e57dc2
2010-07-12 12:54:01 -07:00
Jeff Schumacher 9a48de86d8 Added file path similarity to scoring metric in rename detection
The scoring method was not taking into account the similarity of
the file paths and file names. I changed the metric so that it is 99%
based on content (which used to be 100% of the old metric), and 1%
based on path similarity. Of that 1%, half (.5% of the total final
score) is based on the actual file names (e.g. "foo.java"), and half
on the directory (e.g. "src/com/foo/bar/").

Change-Id: I94f0c23bf6413c491b10d5625f6ad7d2ecfb4def
2010-07-12 12:52:05 -07:00
Jeff Schumacher 4c14b7869d Fixed potential div by zero bug
The scoring logic in SimilarityIndex was dividing by the max file
size. If both files are empty, this would cause a div by zero
error. This case cannot currently happen, since two empty files
would have the same SHA1, and would therefore be caught in the
earlier SHA1 based detection pass. Still, if this logic eventually
gets separated from that pass, a div by zero error would occur.

I changed the logic to instead consider two empty files to have a
similarity score of 100.

Change-Id: Ic08e18a066b8fef25bb5e7c62418106a8cee762a
2010-07-12 12:24:42 -07:00
Jeff Schumacher 64b9458640 Added file size based rename detection optimization
Prior to this change, files that were very different in size (enough
so that they could not have enough in common to be detected as
renames) were still having their scores calculated. I added an
optimization to skip such files. For example, if the rename detection
threshold is 60%, the larger file is 200kb, and the smaller file is
50kb, the pair cannot be counted as a rename since they cannot
possibly share 60% of their content in common. (200*.6=120, 120>50)

Change-Id: Icd8315412d5de6292839778e7cea7fe6f061b0fc
2010-07-12 12:24:42 -07:00
Robin Rosenberg d787a82e50 Internationalize RepositoryState descriptions
Change-Id: I104cd62f3e89acf010b1d40a2b08e7f68f63bb85
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-07-10 10:24:37 +02:00
Shawn O. Pearce 9734194917 Honor pack.windowlimit to cap memory usage during packing
The pack.windowlimit configuration parameter places an upper bound
on the number of bytes used by the DeltaWindow class as it scans
through the object list.  If memory usage would exceed the limit
the window is temporarily decreased in size to keep memory used
within that bound.

Change-Id: I09521b8f335475d8aee6125826da8ba2e545060d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:19:07 -07:00
Shawn O. Pearce 74e0835012 Honor pack.threads and perform delta search in parallel
If we have multiple CPUs available, packing usually goes faster
when each CPU is assigned a slice of the available search space.
The number of threads to use is guessed from the runtime if it
wasn't set by the caller, or wasn't set in the configuration.

Change-Id: If554fd8973db77632a52a0f45377dd6ec13fc220
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:17:30 -07:00
Shawn O. Pearce a960d1429e Cache small deltas during packing
PackWriter now caches small deltas, or deltas that are very tiny
compared to their source inputs, so that the writing phase goes
faster by reusing those cached deltas.

The cached data is stored compressed, which usually translates to
a bigger footprint due to deltas being very hard to compress, but
saves time during writing by avoiding the deflate step.  They are
held under SoftReferences so that the JVM GC can clear out deltas
if memory gets very tight.  We would rather continue working and
spend a bit more CPU time during writing than crash due to OOME.

To avoid OutOfMemoryErrors during the caching phase we also trap
OOME and just abort out of the caching.

Because deflateBound() always produces something larger than what
we need to actually store the deflated data, we copy it over into
a new buffer if the actual length doesn't match the buffer length.
When packing jgit.git this saves over 111 KiB in the cache, and is
thus a worthwhile hit on CPU time.

To further save memory we store the inflated size of the delta
(which we need for the object header) in the same field as the
pathHash, as the pathHash is no longer necessary by this phase
of the packing algorithm.

Change-Id: I0da0c600d845e8ec962289751f24e65b5afa56d7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:15:54 -07:00
Shawn O. Pearce dfad23bf3d Implement delta generation during packing
PackWriter now produces new deltas if there is not a suitable delta
available for reuse from an existing pack file.  This permits JGit to
send less data on the wire by sending a delta relative to an object
the other side already has, instead of sending the whole object.

The delta searching algorithm is similar in style to what C Git
uses, but apparently has some differences (see below for more on).
Briefly, objects that should be considered for delta compression are
pushed onto a list.  This list is then sorted by a rough similarity
score, which is derived from the path name the object was discovered
at in the repository during object counting.  The list is then
walked in order.

At each position in the list, up to $WINDOW objects prior to it
are attempted as delta bases.  Each object in the window is tried,
and the shortest delta instruction sequence selects the base object.
Some rough rules are used to prevent pathological behavior during
this matching phase, like skipping pairings of objects that are
not similar enough in size.

PackWriter intentionally excludes commits and annotated tags from
this new delta search phase.  In the JGit repository only 28 out
of 2600+ commits can be delta compressed by C Git.  As the commit
count tends to be a fair percentage of the total number of objects
in the repository, and they generally do not delta compress well,
skipping over them can improve performance with little increase in
the output pack size.

Because this implementation was rebuilt from scratch based on my own
memory of how the packing algorithm has evolved over the years in
C Git, PackWriter, DeltaWindow, and DeltaEncoder don't use exactly
the same rules everywhere, and that leads JGit to produce different
(but logically equivalent) pack files.

  Repository | Pack Size (bytes)                | Packing Time
             | JGit     - CGit     = Difference | JGit / CGit
  -----------+----------------------------------+-----------------
   git       | 25094348 - 24322890 = +771458    | 59.434s / 59.133s
   jgit      |  5669515 -  5709046 = - 39531    |  6.654s /  6.806s
   linux-2.6 |     389M -     386M = +3M        |  20m02s / 18m01s

For the above tests pack.threads was set to 1, window size=10,
delta depth=50, and delta and object reuse was disabled for both
implementations.  Both implementations were reading from an already
fully packed repository on local disk.  The running time reported
is after 1 warm-up run of the tested implementation.

PackWriter is writing 771 KiB more data on git.git, 3M more on
linux-2.6, but is actually 39.5 KiB smaller on jgit.git.  Being
larger by less than 0.7% on linux-2.6 isn't bad, nor is taking an
extra 2 minutes to pack.  On the running time side, JGit is at a
major disadvantage because linux-2.6 doesn't fit into the default
WindowCache of 20M, while C Git is able to mmap the entire pack and
have it available instantly in physical memory (assuming hot cache).

CGit also has a feature where it caches deltas that were created
during the compression phase, and uses those cached deltas during
the writing phase.  PackWriter does not implement this (yet),
and therefore must create every delta twice.  This could easily
account for the increased running time we are seeing.

Change-Id: I6292edc66c2e95fbe45b519b65fdb3918068889c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:14:18 -07:00
Shawn O. Pearce 074055d747 debug-show-packdelta: Dump a pack delta to the console
This is a horribly crude application, it doesn't even verify that
the object its dumping is delta encoded.  Its method of getting the
delta is pretty abusive to the public PackWriter API, because right
now we don't want to expose the real internal low-level methods
actually required to do this.

Change-Id: I437a17ceb98708b5603a2061126eb251e82f4ed4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:12:32 -07:00
Shawn O. Pearce 8612c0ace1 Initial pack format delta generator
DeltaIndex is a simple pack style delta generator.  The function works
by creating a compact index of a source buffer's blocks, and then
walking a sliding window along a desired result buffer, searching for
the window in the index.  When a match is found, the window is
stretched to the longest possible length that is common with the
source buffer, and a copy instruction is created.

Rabin's polynomial hash function is used to compute the hash for a
block, permitting efficient sliding of the window in single byte
increments.  The update function to slide one byte originated from
David Mazieres' work in LBFS, and our implementation of the update
step was certainly inspired by the initial work Geert Bosch proposed
for C Git in http://marc.info/?l=git&m=114565424620771&w=2.

To ensure the encoder runs in linear time with respect to the size of
the two input buffers (source and result), the maximum number of
blocks that can share the same position in the index's hashtable is
capped at a constant number.  This prevents bad inputs from causing
the encoder to run in quadratic time, but comes with a penalty of
creating a longer delta due to fewer considered copy positions.

Strange hackery is used to cap the amount of memory used by the index
to be no more than 12 bytes for every 16 bytes of source buffer, no
matter what the JVM per-object overhead is.  This permits an index to
always be no larger than 1.75x the source buffer length, which is an
important feature to support large windows of candidates to match
against while packing.  Here the strange hackery is nothing more than
a manually managed chained hashtable, where pointers are array indexes
into storage arrays rather than object references.

Computation of the hash function for a single fixed sized block is
done through an unrolled loop, where the first 4 iterations have been
manually reduced down to eliminate unnecessary instructions.  The
pattern is derived from ObjectId.equals(byte[], int, byte[], int),
where we have unrolled the loop required to compare two 20 byte
arrays.  Hours of testing with the Sun 1.6 JRE concluded that the
non-obvious "foo[idx + 1]" style of reference is faster than
"foo[idx++]", and so that is what we use here during hashing.

Change-Id: If9fb2a1524361bc701405920560d8ae752221768
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:10:55 -07:00
Shawn O. Pearce b38426ae8c Add debugging toString() method to ObjectToPack
Its useful to know what the flags are or what the base that was
selected is.  Dump these out as part of the object's toString.

Change-Id: I8810067fb8337b08b4fcafd5f9ea3e1e31ca6726
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:09:19 -07:00
Shawn O. Pearce 699e4aa7c5 Make ObjectToPack clearReuseAsIs signal available to subclasses
A subclass may want to use this method to release handles that are
caching reuse information.  Make it protected so they can override
it and update themselves.

Change-Id: I2277a56ad28560d2d2d97961cbc74bc7405a70d4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:07:45 -07:00
Shawn O. Pearce 4569d77e13 Correctly classify the compressing objects phase
Searching for reuse candidates should be fast compared to actually
doing delta compression.  So pull the progress monitor out of this
phase and rename it back to identify the compressing objects state.

Change-Id: I5eb80919f21c1251e0e3420ff7774126f1f79b27
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:06:10 -07:00
Shawn O. Pearce 85b7a53d52 Refactor ObjectToPack's delta depth setting
Long ago when PackWriter is first written we thought that the delta
depth could be updated automatically.  But its never used.  Instead
make this a simple standard setter so the caller can more directly
set the delta depth of this object.  This permits us to configure a
depth that takes into account more than just the depth of another
object in this same pack.

Change-Id: I1d71b74f2edd7029b8743a2c13b591098ce8cc8f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:04:35 -07:00
Shawn O. Pearce 6730f9e3c8 Configure core.bigFileThreshold into PackWriter
C Git's fast-import uses this to determine the maximum file size
that it tries to delta compress, anything equal to or above this
setting is stored with as a whole object with simple deflate.

Define the configuration so we can use it later.

Change-Id: Iea46e787d019a1b6c51135cc73d7688a02e207f5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:02:54 -07:00
Shawn O. Pearce 823e9a9721 Add doNotDelta flag to ObjectToPack
This flag will later control whether or not PackWriter search for a
delta base for this object.  Edge objects will never get searched,
as the writer won't be outputting them, so they should always have
this flag set on.  Sometime in the future this flag should also be
set for file blobs on file paths that have the "-delta" gitattribute
set in the repository's attributes file.

Change-Id: I6e518e1a6996c8ce00b523727f1b605e400e82c6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:00:49 -07:00
Shawn O. Pearce 616bc74cf7 Add more configuration options to PackWriter
We now at least import other pack settings like pack.window, which
means we can later use these to control how we search for deltas.

The compression level was fixed to use pack.compression rather than
the loose object core.compression setting.

Change-Id: I72ff6d481c936153ceb6a9e485fa731faf075a9a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 19:00:46 -07:00
Robin Rosenberg a1492f1922 Say that commit is allowed during bisect
C Git allows this and it is quite handy.

Change-Id: I1d0238b43fca931ad2079649fb7b431e2815c351
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-07-10 02:32:46 +02:00
Shawn O. Pearce 2f93a09dd1 Save object path hash codes during packing
We need to remember these so we can later cluster objects that
have similar file paths near each other as we search for deltas
between them.

Change-Id: I52cb1e4ca15c9c267a2dbf51dd0d795f885f4cf8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 15:17:26 -07:00
Shawn O. Pearce c20daa7314 Add path hash code to ObjectWalk
PackWriter wants to categorize objects that are similar in path name,
so blobs that are probably from the same file (or same sort of file)
can be delta compressed against each other.  Avoid converting into
a string by performing the hashing directly against the path buffer
in the tree iterator.

We only hash the last 16 bytes of the path, and we try avoid any
spaces, as we want the suffix of a file such as ".java" to be more
important than the directory it is in, like "src".

Change-Id: I31770ee711526306769a6f534afb19f937e0ba85
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 10:37:47 -07:00
Shawn O. Pearce b584cb8754 Add getObjectSize to ObjectReader
This is an informational function used by PackWriter to help it
better organize objects for delta compression.  Storage systems
can implement it to provide up more detailed size information,
or they can simply rely on the default behavior that uses the
ObjectLoader obtained from open.

For local file storage, we can obtain this information faster
through specialized routines that parse a pack object header.

Change-Id: I13a09b4effb71ea5151b51547f7d091564531e58
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 10:37:47 -07:00
Shawn O. Pearce 97311cd3e0 Allow TemporaryBuffer.Heap to allocate smaller than 8 KiB
If the heap limit was set to something smaller than 8 KiB, we were
still allocating the full 8 KiB block size, and accepting up to
the amount we allocated by.  Instead actually put a hard cap on
the limit.

Change-Id: Id1da26fde2102e76510b1da4ede8493928a981cc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-09 10:37:47 -07:00
Matthias Sohn b8f2bb7d2a Add support for updateNeeded flag in DirCacheEntry
Change-Id: If06ff41d9ccd422afbc79ecbc3cfdf8bb2508dcd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-09 14:12:06 +02:00
Jeff Schumacher a8b29afd82 Create FileHeader from DiffEntry
Added support for converting DiffEntrys to FileHeaders. FileHeaders
are DiffEntrys with a buffer containing the diff output as well as
a list of HunkHeaders. The HunkHeaders contain EditLists. The
createFileHeader(DiffEntry) method in DiffFormatter performs a Myers
Diff on the files refered to by the DiffEntry, then puts the returned
EditList into a single HunkHeader, which is then put into the
FileHeader to be returned. It also generates the appropriate diff
header an puts it into the FileHeader's buffer. The rest of the diff
output, which would normally be parsed to generate the HunkHeaders,
is not generated. In fact, the purpose of this method is to avoid
the costly diff output generation and parsing normally required to
create a FileHeader.

Change-Id: I7d8b18c0f6c85e3d02ad58995d3d231e69af5887
2010-07-08 16:58:55 -07:00
Stefan Lay 354b90131a Fix javadoc typos in JGit API
There were some small errors which made it
difficult to read the JavaDoc.

Change-Id: Ib3b34353465162adebaca3514d596d0edf5aea51
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-08 10:42:29 +02:00
Shawn O. Pearce 711bd3e3d0 Define a constant for 127 in DeltaEncoder
The special value 127 here means how many bytes we can put into
a single insert command.  Rather than use the magical value 127,
lets name it to better document the code.

Change-Id: I5a326f4380f6ac87987fa833e9477700e984a88e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-07 09:52:09 -07:00
Shawn O. Pearce cd7dd8591e Cap delta copy instructions at 64k
Although all modern delta decoders can process copy instructions
with a count as large as 0xffffff (~15.9 MiB), pack version 2 streams
are only supposed to use delta copy instructions up to 64 KiB.

Rewrite our copy instruction encode loop to use the lower 64 KiB
limit, even though modern decoders would support longer copies.

To improve encoding performance we now try to encode up to four full
copy commands in our buffer before we flush it to the stream, but
we don't try to implement full buffering here.  We are just trying
to amortize the virtual method call to the destination stream when
we have to do a large copy.

Change-Id: I9410a16e6912faa83180a9788dc05f11e33fabae
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-07 09:52:09 -07:00
Shawn O. Pearce 384a19eee0 Deprecate all of the older Tree related code
We want to get rid of these APIs, because they don't perform as well
as DirCache/TreeWalk, or don't offer nearly as many features.

Bug: 319145
Change-Id: I2b28f9cddc36482e1ad42d53e86e9d6461ba3bfc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-07 09:15:02 -07:00