Commit Graph

981 Commits

Author SHA1 Message Date
Chris Aniszczyk 18aadc826d Merge "Reduce compares in Edit.getType" 2010-09-06 22:33:09 -04:00
Shawn O. Pearce ba984ba2e0 Fix checkReferencedIsReachable to use correct base list
When checkReferencedIsReachable is set in ReceivePack we are trying
to prove that the push client is permitted to access an object that
it did not send to us, but that the received objects link to either
via a link inside of an object (e.g. commit parent pointer or tree
member) or by a delta base reference.

To do this check we are making a list of every potential delta base,
and then ensuring that every delta base used appears on this list.
If a delta base does not appear on this list, we abort with an error,
letting the client know we are missing a particular object.

Preventing spurious errors about missing delta base objects requires
us to use the exact same list of potential delta bases as the remote
push client used.  This means we must use TOPO ordering, and we
need to enable BOUNDARY sorting so that ObjectWalk will correctly
include any trees found during the enumeration back to the common
merge base between the interesting and uninteresting heads.

To ensure JGit's own push client matches this same potential delta
base list, we need to undo 60aae90d4d ("Disable topological
sorting in PackWriter") and switch back to using the conventional
TOPO ordering for commits in a pack file.  This ensures that our
own push client will use the same potential base object list as
checkReferencedIsReachable uses on the receiving side.

Change-Id: I14d0a326deb62a43f987b375cfe519711031e172
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-06 12:12:43 -07:00
Shawn O. Pearce 6f385076e1 Discard object bodies when checking connectivity
Since we are only checking the links between objects we don't need
to hold onto commit messages after their headers have been parsed
by the walker.  Dropping them saves a bit of memory, which is always
good when accepting huge pack files.

Change-Id: I378920409b6acf04a35cdf24f81567b1ce030e36
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-06 11:36:08 -07:00
Shawn O. Pearce 741659fed4 DeltaStream: Fix data corruption when reading large copies
If the copy instruction was larger than the input buffer given to us,
we copied the wrong part of the base stream during the next read().

This occurred on really big binary files where a copy instruction
of 64k wasn't unreasonable, but the caller's buffer was only 8192
bytes long.  We copied the first 8192 bytes correctly, but then
reseeked the base stream back to the start of the copy region on
the second read of 8192 bytes.  Instead of a sequence like ABCD
being read into the caller, we read AAAA.

Change-Id: I240a3f722a3eda1ce8ef5db93b380e3bceb1e201
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-06 10:09:12 -07:00
Shawn O. Pearce 693f454e71 Use 8192 as default buffer size in ObjectLoader copyTo
As ObjectStreams are supposed to be buffered, most implementors will
be wrapping their underlying stream inside of a BufferedInputStream
in order to satisfy this requirement.  Because developers are by
nature lazy, they will use the default buffer size rather than
specify their own.

The OpenJDk JRE implementations use 8192 as the default buffer
size, and when the higher level reader uses the same buffer size
the buffers "stack" nicely by avoiding a copy to the internal
buffer array.  As OpenJDK is a popular virtual machine, we should
try to benefit from this nice stacking property during copyTo().

Change-Id: I69d53f273b870b841ced2be2e9debdfd987d98f4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-06 10:09:12 -07:00
Chris Aniszczyk 88adf21c47 Merge "Add helper methods to Edit" 2010-09-06 13:04:51 -04:00
Shawn O. Pearce f395b84390 Merge "log: Fix commit headers and -p flag" 2010-09-06 12:06:17 -04:00
Shawn O. Pearce 2b0c5c7207 Merge "Use 5 MiB for RevWalk default limit" 2010-09-06 11:37:29 -04:00
Robin Rosenberg 8145e40233 cleanup: Remove unnecessary @SuppressWarnings
Change-Id: I1b239b587e1cc811bbd6e1513b07dc93a891a842
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-09-05 00:00:57 +02:00
Shawn O. Pearce 6938f99ef3 Reduce compares in Edit.getType
We can slightly optimize this method by removing some compares
based on knowledge of how the orderings have to work.  This way
a getType() invocation requires at most 2 int compares for any
result, vs. the 6 required to find REPLACE before.

Change-Id: I62a04cc513a6d28c300d1c1496a8608d5df4efa6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-03 23:55:46 -07:00
Shawn O. Pearce fe8fe13349 Add helper methods to Edit
Exposing isEmpty, getLengthA, getLengthB make it easier to examine
the state of an edit and work with it from higher level code.

The before and after cut routines make it easy to split an edit
that contains another edit, such as to decompose a REPLACE that
contains a common sequence within it.

Change-Id: Id63d6476a7a6b23acb7ab237d414a0a1a7200290
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-03 23:55:46 -07:00
Shawn O. Pearce 596c3f668b log: Fix commit headers and -p flag
We weren't flushing the commit message before the diff output, which
meant the headers and message showed randomly interleaved with the
diff rather than immediately before.

Change-Id: I6cefab8d40e9d40c937e9deb12911188fec41b26
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-03 23:55:45 -07:00
Shawn O. Pearce 2aa4196f1f Fix QuotedString.GIT_PATH escaping rules
We shouldn't escape non-special ASCII characters such as '@' or '~'.
These are valid in a path name on POSIX systems, and may appear as
part of a path in a GNU or Git style patch script.  Escaping them
into octal just obfuscates the user's intent, with no gain.

When parsing an escaped octal sequence, we must parse no more
than 3 digits.  That is, "\1002" is actually "@2", not the Unicode
character \u0202.

Change-Id: I3a849a0d318e69b654f03fd559f5d7f99dd63e5c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-03 13:02:01 -07:00
Shawn O. Pearce 34a755f1df Remove costly quoting test in DiffFormatter
QuotedString.GIT_PATH returns the input reference exactly if
the string does not require quoting, otherwise it returns a
copy that contains the quotes on either end, plus escapes in
the middle where necessary to meet conventions.

Testing the return against '"' + name + '"' is always false,
because GIT_PATH will never return it that way.  The only way
we have quotes on either end is if there is an escape in the
middle, in which case the string isn't equal anyway.

Change-Id: I4d21d8e5c7da0d7df9792c01ce719548fa2df16b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-03 12:59:53 -07:00
Shawn O. Pearce 33837e44c3 Merge branch 'unpack-error'
* unpack-error:
  ReceivePack: Rethrow exceptions caught during indexing

Change-Id: I0d0239d69cb5cd1a622bdee879978f0299e0ca40
2010-09-03 11:09:52 -07:00
Shawn O. Pearce 9239c10385 ReceivePack: Rethrow exceptions caught during indexing
If we get an exception while indexing the incoming pack, its likely
a stream corruption.  We already report an error to the client, but
we eat the stack trace, which makes debugging issues related to a
bug inside of JGit nearly impossible.  Rethrow it under a new type
UnpackException, so embedding servers or applications can catch the
error and provide it to a human who might be able to forward such
traces onto a JGit developer for evaluation.

Change-Id: Icad41148bbc0c76f284c7033a195a6b51911beab
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-03 10:57:55 -07:00
Shawn O. Pearce b505e2a558 Use 5 MiB for RevWalk default limit
Instead of getting the limit from CoreConfig, use the larger of the
reader's limit or 5 MiB, under the assumption that any annotated tag
or commit of interest should be under 5 MiB.  But if a repository
was really insane and had bigger objects, the reader implementation
can set its streaming limit higher in order to allow RevWalk to
still process it.

Change-Id: If2c15235daa3e2d1f7167e781aa83fedb5af9a30
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:40 -07:00
Shawn O. Pearce e29cd27961 Move ObjectDirectory streaming limit to WindowCacheConfig
IDEs like Eclipse offer up the settings in WindowCacheConfig to the
user as a global set of options that are configured for the entire
JVM process, not per-repository, as the cache is shared across the
entire JVM.  The limit on how much we are willing to allocate for
an object buffer is similar to the limit on how much we can use for
data caches, allocating that much space impacts the entire JVM and
not just a single repository, so it should be a global limit.

Change-Id: I22eafb3e223bf8dea57ece82cd5df8bfe5badebc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:39 -07:00
Shawn O. Pearce fed508d55b diff: Default arguments to HEAD, working directory
Similar to C Git, default our difference when no trees are given
to us to something that makes a tiny bit of sense to the human.

We also now support the --cached flag, and have its meaning work the
same way as C Git.

Change-Id: I2f19dad4e018404e280ea3e95ebd448a4b667f59
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:39 -07:00
Shawn O. Pearce 59a262d5d2 Support creating the working directory difference
If the iterators passed into a diff formatter are working tree
iterators, we should enable ignoring files that are ignored, as
well as actually pull up the current content from the working tree
rather than getting it from the repository.

Because we abstract away the working directory access logic,
we can now actually support rename detection between the working
directory and the local repository when using a DiffFormatter.
This means its possible for an application to show an unstaged
delete-add pair as a rename if the add path is not ignored.
(Because the ignored file wouldn't show up in our difference output.)

Even more interesting is we can now do rename detection between any
two working trees, if both input iterators are WorkingTreeIterators.

Unfortunately we don't (yet) optimize for comparing the working
tree with the index involved so we can take advantage of cached
stat data to rule out non-dirty paths.

Change-Id: I4c0598afe48d8f99257266bf447a0ecd23ca7f5e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:39 -07:00
Shawn O. Pearce 6f00a3e651 Fix TreeWalk bug comparing DirCache and WorkingTree with ANY_DIFF
When comparing a DirCache and a WorkingTree using ANY_DIFF we
sometimes didn't recursive into a subtree of both sides gave us
zeroId() back for the identity of a subtree.  This happens when the
DirCache doesn't have a valid cache tree for the subtree, as then
it uses zeroId() for the ObjectId of the subtree, which then appears
to be equal to the zeroId() of the WorkingTreeIterator's subtree.

We work around this by adding a hasId() method that returns true
only if this iterator has a valid ObjectId.  The idEquals method
on TreeWalk than only performs a compare between two iterators if
both iterators have a valid id.

Change-Id: I695f7fafbeb452e8c0703a05c02921fae0822d3f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:39 -07:00
Shawn O. Pearce 244b1580b5 log, diff: Add --src-prefix, --dst-prefix, --no-prefix
Change-Id: I0c7154a51143d56362f12ee4fa93133778d3a9eb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:39 -07:00
Shawn O. Pearce ec2fdbf2ba Move rename detection, path following into DiffFormatter
Applications just want a quick way to configure our diff
implementation, and then just want to use it without a lot of fuss.

Move all of the rename detection logic and path following logic
out of our pgm package and into DiffFormatter itself, making it
much easier for a GUI to take advantage of the features without
duplicating a lot of code.

Change-Id: I4b54e987bb6dc804fb270cbc495fe4cae26c7b0e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 11:38:39 -07:00
Chris Aniszczyk 0f5eae53d6 Merge "Fix RepositoryState.MERGING" 2010-09-02 12:10:10 -04:00
Jens Baumgart f714988c61 Fix RepositoryState.MERGING
canResetHead now returns true.
Resetting mixed / hard works in EGit in merging state.

Change-Id: I1512145bbd831bb9734528ce8b71b1701e3e6aa9
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-09-02 18:01:47 +02:00
Chris Aniszczyk a2f57dd491 Merge "Add reset() to AbstractTreeIterator API" 2010-09-02 11:30:31 -04:00
Chris Aniszczyk 0fea04adfd Merge "Improve DiffFormatter text file access" 2010-09-02 11:29:58 -04:00
Chris Aniszczyk 097406ba5e Merge "Correct diff header formatting" 2010-09-02 11:28:33 -04:00
Chris Aniszczyk df0c9309c5 Merge "Remove duplicated code in DiffFormatter" 2010-09-02 11:20:43 -04:00
Christian Halstrick 2d71808ae0 Merge "Adding sorting to LongList" 2010-09-02 07:53:15 -04:00
Christian Halstrick f7f7c55bca Merge "Use int[] rather than IntList for RawText hashes" 2010-09-02 07:46:50 -04:00
Shawn O. Pearce 6b65211505 Adding sorting to LongList
Sorting the array can be useful when its being used as a map of pairs
that are appended into the array and then later merge-joined against
another array of similar semantics.

Change-Id: I2e346ef5c99ed1347ec0345b44cda0bc29d03e90
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-02 00:28:15 -07:00
Chris Aniszczyk 7a504b8d7c Merge "Add toString and improve Javadoc of NotIgnoredFilter" 2010-09-01 20:39:41 -04:00
Shawn O. Pearce 3fa7d3a2d2 Use int[] rather than IntList for RawText hashes
We know exactly how many lines we need by the time we compute our
per-line hashes, as we have already built the lines IntList to give
us the starting position of each line in the buffer.  Using that
we can properly size the array, and don't need the dynamic growing
feature of IntList.  So drop the indirection and just use a fixed
size array.

Change-Id: I5c8c592514692a8abff51e5928aedcf71e100365
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 16:54:20 -07:00
Chris Aniszczyk 38327a54a8 Refactor Git API exceptions to a new package
Create a new 'org.eclipse.jgit.api.errors' package to contain
exceptions related to using the Git porcelain API.

Change-Id: Iac1781bd74fbd520dffac9d347616c3334994470
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-09-01 15:27:43 -07:00
Shawn O. Pearce 028a613ced Add toString and improve Javadoc of NotIgnoredFilter
Today while debugging some TreeWalk related code I noticed this
filter did not have a toString(), making it harder to see what the
filter graph was at a glance in the debugger.  Add a toString()
for debugging to match other TreeFilters, and clean up the Javadoc
slightly so its a bit more clear about the purpose of the filter.

While we are mucking about with some of this code, simplify
the logic of include so its shorter and thus faster to read.
The pattern now more closely matches that of SkipWorkTreeFilter.

Change-Id: Iad433a1fa6b395dc1acb455aca268b9ce2f1d41b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 15:24:32 -07:00
Marc Strapetz ea4ff61ad3 IndexDiff honors Index entries' "skipWorkTree" flag.
Change-Id: I428d11412130b64fc46d7052011f5dff3d653802
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 15:19:22 -07:00
Shawn Pearce 97695c4ca3 Merge "Avoid double quotes in Git Config" 2010-09-01 18:02:44 -04:00
Shawn Pearce c7e1199b56 Merge "Add FS.detect() for detection of file system abstraction." 2010-09-01 17:59:55 -04:00
Shawn O. Pearce 408d4b5375 Add reset() to AbstractTreeIterator API
This allows callers to force the iterator back to its starting point,
so it can be traversed again.  The default way to do this is to use
back(1) until first() is true, but this isn't very efficient for any
iterator.  All current implementations have better ways to implement
reset without needing to seek backwards.

Change-Id: Ia26e6c852fdac8a0e9c80ac72c8cca9d897463f4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce 9f323462be Improve DiffFormatter text file access
When we are asked to create a difference between two files the caller
really wants to see that output.  Instead of punting because a file
is too big to process, consider it to be binary.  This reduces the
accuracy of our output display, but makes it a lot more likely that
the formatter can still generate something semi-useful.

We set our default binary threshold to 50 MiB, which is the same
threshold that PackWriter uses before punting and deciding a file
is too big to delta compress.  Anything under this size we try to
load and process, anything over that size (or that won't allocate
in the heap) gets tagged as binary.

Change-Id: I69553c9ef96db7f2058c6210657f1181ce882335
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce df8adefe86 Correct diff header formatting
When adding or deleting a file, we shouldn't ever prefix /dev/null
with the a/ or b/ prefixes.  Doing so is a mistake and confuses a
patch parser which handles /dev/null magically, while a/dev/null is
a file called null in the dev directory of the project.

Also when adding or deleting the "diff --git" line has the "real"
path on both sides, so we should see the following when adding the
file called foo:

  diff --git a/foo b/foo
  --- /dev/null
  +++ b/foo

The --- and +++ lines do not appear in a pure rename or copy delta,
C Git diff seems to omit these, so we now omit them as well.  We also
omit the index line when the ObjectIds are exactly equal.

Change-Id: Ic46892dea935ee8bdee29088aab96307d7ec6d3d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce 797d5c4d40 Remove duplicated code in DiffFormatter
Instead of trying to stream out the header, we can drop a redundant
code path by formatting the header into a temporary buffer and then
streaming out the actual line differences later.

Its a small amount of unnecessary work to buffer the file header,
but these are typically very tiny so the cost to format and reparse
is relatively low.

Change-Id: Id14a527a74ee0bd7e07f46fdec760c22b02d5bdf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce 1ea356b346 Move DiffFormatter default initialization to fields
Other fields in this class are initialized in their declaration, make
the code consistent with itself and use only one style.

Change-Id: I49a007e97ba52faa6b89f7e4b1eec85dccac0fa4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:42 -07:00
Shawn O. Pearce 9df493a318 Correct Javadoc of DiffFormatter class
This class does a lot more than just reflow a patch script, it now is
the primary means of creating a diff output.

Change-Id: I74467c9a53dc270ef8c84e7c75f388414ec8ba8f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:42 -07:00
Shawn O. Pearce efde2e3ea7 diff: Fix bad metaVar reference in --abbrev option
Change-Id: If92545b731ff80bff071aee9bbd852bbd187c7c5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:42 -07:00
Marc Strapetz a46abab244 Add FS.detect() for detection of file system abstraction.
To give the user more control on which file system abstraction
should be used on Windows, FS.detect() may be configured
to assume a Cygwin installation or nor.
2010-09-01 17:14:16 +02:00
Mathias Kinzler 2941d23e7e Avoid double quotes in Git Config
Currently, if a branch is created that has special chars ('#' in the bug),
Config will surround the subsection name with double quotes during
it's toText method which will result in an invalid file after saving the
Config.

Bug: 318249
Change-Id: I0a642f52def42d936869e4aaaeb6999567901001
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-09-01 09:13:19 +02:00
Marc Strapetz 6a05904e53 Extend DirCache test case to check "intent to add" flag. 2010-08-31 21:57:10 +02:00
Marc Strapetz 253b36d27a Partial support for index file format "3".
Extended flags are processed and available via DirCacheEntry's
new isSkipWorkTree() and isIntentToAdd() methods.  "resolve-undo"
information is completely ignored since its an optional extension.

Change-Id: Ie6e9c6784c9f265ca3c013c6dc0e6bd29d3b7233
2010-08-31 12:08:09 -07:00