Commit Graph

588 Commits

Author SHA1 Message Date
Chris Aniszczyk 38327a54a8 Refactor Git API exceptions to a new package
Create a new 'org.eclipse.jgit.api.errors' package to contain
exceptions related to using the Git porcelain API.

Change-Id: Iac1781bd74fbd520dffac9d347616c3334994470
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-09-01 15:27:43 -07:00
Shawn O. Pearce 028a613ced Add toString and improve Javadoc of NotIgnoredFilter
Today while debugging some TreeWalk related code I noticed this
filter did not have a toString(), making it harder to see what the
filter graph was at a glance in the debugger.  Add a toString()
for debugging to match other TreeFilters, and clean up the Javadoc
slightly so its a bit more clear about the purpose of the filter.

While we are mucking about with some of this code, simplify
the logic of include so its shorter and thus faster to read.
The pattern now more closely matches that of SkipWorkTreeFilter.

Change-Id: Iad433a1fa6b395dc1acb455aca268b9ce2f1d41b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 15:24:32 -07:00
Marc Strapetz ea4ff61ad3 IndexDiff honors Index entries' "skipWorkTree" flag.
Change-Id: I428d11412130b64fc46d7052011f5dff3d653802
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 15:19:22 -07:00
Shawn Pearce 97695c4ca3 Merge "Avoid double quotes in Git Config" 2010-09-01 18:02:44 -04:00
Shawn Pearce c7e1199b56 Merge "Add FS.detect() for detection of file system abstraction." 2010-09-01 17:59:55 -04:00
Shawn O. Pearce 408d4b5375 Add reset() to AbstractTreeIterator API
This allows callers to force the iterator back to its starting point,
so it can be traversed again.  The default way to do this is to use
back(1) until first() is true, but this isn't very efficient for any
iterator.  All current implementations have better ways to implement
reset without needing to seek backwards.

Change-Id: Ia26e6c852fdac8a0e9c80ac72c8cca9d897463f4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce 9f323462be Improve DiffFormatter text file access
When we are asked to create a difference between two files the caller
really wants to see that output.  Instead of punting because a file
is too big to process, consider it to be binary.  This reduces the
accuracy of our output display, but makes it a lot more likely that
the formatter can still generate something semi-useful.

We set our default binary threshold to 50 MiB, which is the same
threshold that PackWriter uses before punting and deciding a file
is too big to delta compress.  Anything under this size we try to
load and process, anything over that size (or that won't allocate
in the heap) gets tagged as binary.

Change-Id: I69553c9ef96db7f2058c6210657f1181ce882335
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce df8adefe86 Correct diff header formatting
When adding or deleting a file, we shouldn't ever prefix /dev/null
with the a/ or b/ prefixes.  Doing so is a mistake and confuses a
patch parser which handles /dev/null magically, while a/dev/null is
a file called null in the dev directory of the project.

Also when adding or deleting the "diff --git" line has the "real"
path on both sides, so we should see the following when adding the
file called foo:

  diff --git a/foo b/foo
  --- /dev/null
  +++ b/foo

The --- and +++ lines do not appear in a pure rename or copy delta,
C Git diff seems to omit these, so we now omit them as well.  We also
omit the index line when the ObjectIds are exactly equal.

Change-Id: Ic46892dea935ee8bdee29088aab96307d7ec6d3d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce 797d5c4d40 Remove duplicated code in DiffFormatter
Instead of trying to stream out the header, we can drop a redundant
code path by formatting the header into a temporary buffer and then
streaming out the actual line differences later.

Its a small amount of unnecessary work to buffer the file header,
but these are typically very tiny so the cost to format and reparse
is relatively low.

Change-Id: Id14a527a74ee0bd7e07f46fdec760c22b02d5bdf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:43 -07:00
Shawn O. Pearce 1ea356b346 Move DiffFormatter default initialization to fields
Other fields in this class are initialized in their declaration, make
the code consistent with itself and use only one style.

Change-Id: I49a007e97ba52faa6b89f7e4b1eec85dccac0fa4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:42 -07:00
Shawn O. Pearce 9df493a318 Correct Javadoc of DiffFormatter class
This class does a lot more than just reflow a patch script, it now is
the primary means of creating a diff output.

Change-Id: I74467c9a53dc270ef8c84e7c75f388414ec8ba8f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-09-01 10:19:42 -07:00
Marc Strapetz a46abab244 Add FS.detect() for detection of file system abstraction.
To give the user more control on which file system abstraction
should be used on Windows, FS.detect() may be configured
to assume a Cygwin installation or nor.
2010-09-01 17:14:16 +02:00
Mathias Kinzler 2941d23e7e Avoid double quotes in Git Config
Currently, if a branch is created that has special chars ('#' in the bug),
Config will surround the subsection name with double quotes during
it's toText method which will result in an invalid file after saving the
Config.

Bug: 318249
Change-Id: I0a642f52def42d936869e4aaaeb6999567901001
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-09-01 09:13:19 +02:00
Marc Strapetz 253b36d27a Partial support for index file format "3".
Extended flags are processed and available via DirCacheEntry's
new isSkipWorkTree() and isIntentToAdd() methods.  "resolve-undo"
information is completely ignored since its an optional extension.

Change-Id: Ie6e9c6784c9f265ca3c013c6dc0e6bd29d3b7233
2010-08-31 12:08:09 -07:00
Shawn Pearce b3aa5802b9 Merge "DirCacheEntry: UPDATE_NEEDED should be in-core flag." 2010-08-31 14:29:03 -04:00
Marc Strapetz 80f4947e8b Fix RawParseUtils.formatBase10 to work with negative values
Change-Id: Iffa220de76c5e180796fa46c4d67f52a1b3b2e35
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-31 11:20:54 -07:00
Chris Aniszczyk b7465b8fe5 Remove deprecated PersonIdent constructor
Change-Id: I3831de1b6df25a52df30d367f0216573e6ee6b53
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-31 11:55:27 -05:00
Christian Halstrick 0c017188b4 Improve MergeAlgorithm to produce smaller conflicts
The merge algorithm was reporting conflicts which where to big.

Example: The common base was "ABC", the "ours" version contained
"AB1C" (the addition of "1" after pos 2) and the "theirs" version also
contained "AB1C". We have two potentially conflicting edits in the
same region which happen to bring in exactly the same content. This
should not be a conflict - but was previously reported as
"AB<<<1===1>>>C".

This is fixed by checking every conflicting chunk whether the
conflicting regions have a common prefix or suffix and by removing
this regions from the conflict.

Change-Id: I4dc169b8ef7a66ec6b307e9a956feef906c9e15e
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-08-31 17:14:07 +02:00
Marc Strapetz 2eb5426aa9 DirCacheEntry: UPDATE_NEEDED should be in-core flag.
In correspondance to CGit, UPDATE_NEEDED flag should not be
written to disk. Furthermore, it currently intersects CGit's
CE_EXTENDED flag.
2010-08-31 11:25:16 +02:00
Christian Halstrick 47f4171315 Let Resolve be the default Merge strategy
the merge command should use by default the "resolve" merge strategy.

Change-Id: I6c6973a3397cca12bd8a6bd950d04b1766a08b4c
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-08-31 01:29:49 +02:00
Christian Halstrick 45e79a526c Added merge strategy RESOLVE
This adds the first merge strategy to JGit which does real
content-merges if necessary. The new merge strategy "resolve" takes as
input three commits: a common base, ours and theirs. It will simply takeover
changes on files which are only touched in ours or theirs. For files
touched in ours and theirs it will try to merge the two contents
knowing taking into account the specified common base.

Rename detection has not been introduced for now.

Change-Id: I49a5ebcdcf4f540f606092c0f1dc66c965dc66ba
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-08-31 01:21:54 +02:00
Shawn O. Pearce e6bd689d2c Improve LargeObjectException reporting
Use 3 different types of LargeObjectException for the 3 major ways
that we can fail to load an object.  For each of these use a unique
string translation which describes the root cause better than just
the ObjectId.name() does.

Change-Id: I810c98d5691b74af9fc6cbd46fc9879e35a7bdca
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-30 11:53:25 -07:00
Shawn O. Pearce a3945d1bc8 IndexPack: Use byte limited form of getCachedBytes
Currently our algorithm requires that we have the delta base as
a contiguous byte array... but getCachedBytes() might not work
if the object is considered to be large by its underlying loader.
Use the limited form to obtain the object as a byte array instead.

Change-Id: I33f12a8811cb6a4a67396174733f209db8119b42
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-30 11:40:01 -07:00
Shawn O. Pearce 1709800f27 Undo translation of protocol string 'unpack error'
This string is part of the network protocol, and isn't meant to
be translated into another language.  Clients actually scan for
the string "unpack error " off the wire and react magically to
this information.  If it were translated, they would instead have
a protocol exception, which isn't very useful when there is already
an error occurring.

Change-Id: Ia5dc8d36ba65ad2552f683bb637e80b77a7d92f0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-30 10:58:25 -07:00
Shawn O. Pearce bc0359c42f Merge "Buffer very large delta streams to reduce explosion of CPU work" 2010-08-28 20:31:24 -04:00
Robin Rosenberg 236899204a Revert "Hide Maven target directories from Eclipse"
This reverts commit db4c516f67 since
it breaks compatibility with Eclipse 3.5 which can no longer import
the projects

Bug: 323390
Change-Id: I3cc91364a6747cfcb4c611a9be5258f81562f726
2010-08-28 09:50:50 +02:00
Shawn O. Pearce b24f907e3e Buffer very large delta streams to reduce explosion of CPU work
Large delta streams are unpacked incrementally, but because a delta
can seek to a random position in the base to perform a copy we may
need to inflate the base repeatedly just to complete one delta.
So work around it by copying the base to a temporary file, and then
we can read from that temporary file using random seeks instead.
Its far more efficient because we now only need to inflate the
base once.

This is still really ugly because we have to dump to a temporary
file, but at least the code can successfully process a large
file without throwing OutOfMemoryError.  If speed is an
issue, the user will need to increase the JVM heap and ensure
core.streamFileThreshold is set to a higher value, so we don't use
this code path as often.

Unfortunately we lose the "optimization" of skipping over portions
of a delta base that we don't actually need in the final result.
This is going to cause us to inflate and write to disk useless
regions that were deleted and do not appear in the final result.
We could later improve on our code by trying to flatten delta
instruction streams before we touch the bottom base object, and
then only store the portions of the base we really need for the
final result and that appear out-of-order.  Since that is some
pretty complex code I'm punting on it for now and just doing this
simple whole-object buffering.

Because the process umask might be permitting other users to read
files we create, we put the temporary buffers into $GIT_DIR/objects.
We can reasonably assume that if a reader can read our temporary
buffer file in that directory, they can also read the base pack
file we are pulling it from and therefore its not a security breach
to expose the inflated content in a file.  This requires a reader
to have write access to the repository, but only if the file is
really big.  I'd rather err on the side of caution here and refuse
to read a very big file into /tmp than to possibly expose a secured
content because the Java 5 JVM won't let us create a protected
temporary file that only the current user can access.

Change-Id: I66fb80b08cbcaf0f65f2db0462c546a495a160dd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-27 13:28:35 -07:00
Chris Aniszczyk f54e883566 Add TagCommand
A tag command is added to the Git porcelain API. Tests were
also added to stress test the tag command.

Change-Id: Iab282a918eb51b0e9c55f628a3396ff01c9eb9eb
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-27 21:11:31 +02:00
Christian Halstrick 2059ed205e Implement a Dircache checkout (needed for merge)
Implementation of a checkout (or 'git read-tree') operation which
works together with DirCache. This implementation does similar things
as WorkDirCheckout which main problem is that it works with deprecated
GitIndex. Since GitIndex doesn't support multiple stages of a file
which is required in merge situations this new implementation is
required to enable merge support.

Change-Id: I13f0f23ad60d98e5168118a7e7e7308e066ecf9c
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-27 16:06:49 +02:00
Christian Halstrick 0e7a38b60f Add getBaseCommit() to Merger
The Merger was was only exposing the merge base as an
AbstractTreeIterator. Since we need the merge base as
RevCommit to generate the merge result I expose it here.

Change-Id: Ibe846370a35ac9bdb0c97ce2e36b2287577fbcad
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-26 18:52:26 -07:00
Shawn Pearce 7d9bfa390f Merge "Fix parsing of multiple authors in PersonIdent." 2010-08-26 15:00:58 -04:00
Chris Aniszczyk d1edd00f56 Run formatter on edited lines via save action
Updates the project level settings to run the formatter
on save on only on the edited lines.

Change-Id: I26dd69d0c95e6d73f9fdf7031f3c1dbf3becbb79
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-26 12:33:09 -05:00
Marc Strapetz 80c622c49c Fix parsing of multiple authors in PersonIdent.
PersonIdent should be parsable for an invalid commit which
contains multiple authors, like "A <a@a.org>, B <b@b.org>".
PersonIdent(String) constructor now delegates to
RawParseUtils.parsePersonIdent().

Change-Id: Ie9798d36d9ecfcc0094ca795f5a44b003136eaf7
2010-08-26 12:58:03 +02:00
Shawn O. Pearce cb0c05b5b4 Increase the default streaming threshold to 15 MiB
Applying deltas in the large streaming mode is horrifically slow.
Trying to pack icu4c is impossible because a single 11 MiB file
sits on top of a 15 MiB file though a 10 deep delta chain, which
results in this very slow inflate process.

Upping the default limit to 15 MiB lets us process this large in a
reasonable time, but its still sufficiently low enough to prevent
exploding the heap of a very large process like Eclipse or Gerrit
Code Review.

We have to revisit the streaming delta application process and do
something much smarter, like flatten the delta chain before we apply
it to the base.  But even that is ugly, I've seen a 155 MiB delta
sitting on top of a 450 MiB file to produce a 300 MiB result object.
If the chain is deep, we may have trouble flatting it down.

Change-Id: If5a0dcbf9d14ea683d75546f104b09bb8cd8fdbb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 7a9edb3662 Fix reuse from pack file for REF_DELTA types
We miscomputed the CRC32 checksum for a REF_DELTA type of object, by
not including the full 20 byte ObjectId of the delta base in the CRC
code we use when the delta is too large to go through our two faster
small reuse code paths.  This resulted in a corruption error during
packing, where the PackFile erroneously suspected the data was wrong
on the local filesystem and aborted writing, because the CRC didn't
match what we had read from the index.

Change-Id: I7d12cdaeaf2c83ddc11223ce0108d9bd6886e025
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 3a972f8664 Cleanup and correct resolve Javadoc
We didn't fully cover what we support and what we don't.  It was
also a bit hard to follow the syntaxes supported.  Clean that up
by documenting it.

Change-Id: I7b96fa6cbefcc2364a51f336712ad361ae42df2d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce dbd2d7c83b Support parsing commit:path style blob references
We can now resolve expressions that reference a path within a
commit, designating a specific revision of a specific tree or
file in the project.

Change-Id: Ie6a8be629d264d72209db894bd680c5900035cc0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 8da17c5046 Support parsing git describe style output
We now match on the -gABBREV style output created by git describe
when its describing a non-tagged commit, and resolve that back to
the full ObjectId using the abbreviation resolution feature that
we already support.

Change-Id: Ib3033f9483d9e1c66c8bb721ff48d4485bcdaef1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 401d3b2cc1 Throw AmbiguousObjectException during resolve if its ambiguous
Its wrong to return null if we are resolving an abbreviation and we
have proven it matches more than one object.  We know how to resolve
it if we had more nybbles, as there are two or more objects with the
same prefix.  Declare that to the caller quite clearly by giving them
an AmbiguousObjectException.

Change-Id: I01bb48e587e6d001b93da8575c2c81af3eda5a32
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:12 -07:00
Shawn O. Pearce c44495fa2f Complete an abbreviation when formatting a patch
If we are given a DiffEntry header that already has abbreviated
ObjectIds on it, we may still be able to resolve those locally and
output the difference.  Try to do that through the new resolve API
on ObjectReader.

Change-Id: I0766aa5444b7b8fff73620290f8c9f54adc0be96
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:12 -07:00
Shawn O. Pearce 127a5f95e1 Use limited getCachedBytes in RevWalk
Parsing is rewritten to use the size limited form of getCachedBytes,
thus freeing the revwalk infrastructure from needing to care about
a large object vs. a small object when it gets an ObjectLoader.

Right now we hardcode our upper bound for a commit or annotated
tag to be 15 MiB.  I don't know of any that is more than 1 MiB in
the wild, so going 15x that should give us some reasonable headroom.

Change-Id: If296c211d8b257d76e44908504e71dd9ba70ffa8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:12 -07:00
Shawn O. Pearce c11711f98e Use limited getCachedBytes code to reduce duplication
Rather than duplicating this block everywhere, reuse the limited size
form of getCachedBytes to acquire the content of an object.

Change-Id: I2e26a823e6fd0964d8f8dbfaa0fc2e8834c179c1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-25 19:05:53 -05:00
Shawn O. Pearce 2292655e9e Add brute force byte array loading to ObjectLoader
Some algorithms are coded in a way that requires us to provide them
the entire object contents as a contiguous byte array.  The parsers
in RevCommit and RevTag, or our RawText objects are really good
examples of these.

Instead of duplicating this logic everywhere, lets put it into the
base ObjectLoader type.  That way the caller only needs to give us
their upper size bound, and we'll do the rest of the heavy work to
figure out if the object still fits within that bound, and get them
an array that has the complete contents.

Change-Id: Id95a7f79d2b97e39f6949370ccca2f2c9cfb1a0f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-25 19:03:47 -05:00
Chris Aniszczyk e69dcc703d Merge "Add ObjectId to the LargeObjectException" 2010-08-25 19:54:32 -04:00
Shawn O. Pearce 1f4b48a37c Add ObjectId to the LargeObjectException
A chunk of code that throws LargeObjectException may or may not have
the specific ObjectId on hand when its thrown.  If it does, we want
to cache it in the exception, and put that in the message.  If it is
missing we want to be able to set it later from a higher level stack
frame that does have the object handy.

Change-Id: Ife25546158868bdfa886037e4493ef8235ebe4b9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-25 18:54:07 -05:00
Chris Aniszczyk f74d474f3c Merge "Don't copy more than the object size" 2010-08-25 19:52:36 -04:00
Chris Aniszczyk 595a20a064 Merge "Use the ObjectStream size during copyTo" 2010-08-25 19:50:43 -04:00
Benjamin Muskalla 700b8b4514 Fixed typo in DirCache documentation
Change-Id: Ifc2e9047a45d57829fce59c66618e5de9120a5bb
Signed-off-by: Benjamin Muskalla <bmuskalla@eclipsesource.com>
2010-08-25 15:52:18 +02:00
Shawn O. Pearce 7cfe2f12ff Don't copy more than the object size
If the loader's stream is broken and returns to us more content
than it originally declared as the size of the object, don't
copy that onto the output stream.  Instead throw EOFException
and abort fast.  This way we don't follow an infinite stream,
but instead will at least stop when the size was reached.

Change-Id: I7ec0c470c875f03b1f12a74a9b4d2f6e73b659bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-24 17:37:07 -07:00
Shawn O. Pearce b474de1da3 Use the ObjectStream size during copyTo
If the stream is a delta decompression stream, getting the size
can be expensive.  Its cheaper to get it from the stream itself
rather than from the object loader.

Change-Id: Ia7f0af98681f6d56ea419a48c6fa8eea09274b28
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-24 17:37:07 -07:00
Shawn O. Pearce 1c3f3fdbd2 Fix ObjectDirectory abbreviation resolution to notice new packs
If we can't resolve an abbreviation, it might be because there is
a new pack file we haven't picked up yet.  Try scanning the packs
again and recheck each pack if there were differences from the last
scan we did.

Because of this, we don't have to open a pack during the test where
we generate a pack on the fly.  We'll miss on the first loop during
which the PackList is the NO_PACKS magic initialization constant,
and pick up the newly created index during this retry logic.

Change-Id: I7b97efb29a695ee60c90818be380f7ea23ad13a3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-24 17:37:07 -07:00
Shawn O. Pearce a5c18fcfc7 Fully implement SHA-1 abbreviations
ObjectReader implementations are now responsible for creating the
unique abbreviation of an ObjectId, or for resolving an abbreviation
back to its full form.  In this latter case the reader can offer up
multiple candidates to the caller, who may be able to disambiguate
them based on context.

Repository.resolve() doesn't take multiple candidates into account
right now, but it could in the future by looking for a remaining
^0 or ^{commit} suffix and take an expansion if there is only one
commit that matches the input abbreviation.  It could also use
the distance from an annotated tag to resolve "tag-NNN-gcommit"
style strings that are often output by `git describe`.

Change-Id: Icd3250adc8177ae05278b858933afdca0cbbdb56
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 15:53:11 -07:00
Shawn O. Pearce 32466c33ba Delete deprecated ObjectWriter
ObjectWriter is a deprecated API that people shouldn't be using.
So get rid of it in favor of the ObjectInserter API.

Change-Id: I6218bcb26b6b9ffb64e3e470dba5dca2e0a62fd4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 10:59:30 -07:00
Shawn O. Pearce 9d5b926ed1 Add openEntryStream to WorkingTreeIterator
This makes it easier for abstract tools like AddCommand to open the
file from the working tree, without knowing internal details about
how the tree is managed.

Change-Id: Ie64a552f07895d67506fbffb3ecf1c1be8a7b407
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 10:30:58 -07:00
Shawn O. Pearce edd8029558 Add setLength(long) to DirCacheEntry
Applications should favor the long style interface, especially when
their source input is a long type, e.g. coming from java.io.File.
This way when the index format is later changed to support a
larger file size than 2 GiB we can handle it by just changing the
entry code, and not need to fix a lot of applications.

Change-Id: I332563caeb110014e2d544dc33050ce67ae9e897
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 10:29:50 -07:00
Shawn O. Pearce 6df5d3397c Move commit and tag formatting to CommitBuilder, TagBuilder
These objects should be responsible for their own formatting,
rather than delegating it to some obtuse type called ObjectInserter.

While we are at it, simplify the way we insert these into a database.
Passing in the type and calling format in application code turned
out to be a huge mistake in terms of ease-of-use of the insert API.

Change-Id: Id5bb95ee56aa2a002243e9b7853b84ec8df1d7bf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 10:13:29 -07:00
Shawn O. Pearce 22b285695a Rename Commit, Tag to CommitBuilder, TagBuilder
Since these types no longer support reading, calling them a Builder
is a better description of what they do.  They help the caller to
build a commit or a tag object.

Change-Id: I53cae5a800a66ea1721b0fe5e702599df31da05d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 09:46:14 -07:00
Shawn O. Pearce 6a51d97948 Add documentation explaining how to read Commit and Tag
Since we stopped supporting these types for reading, but their
name is a natural candidate for someone to try and use in code,
explain where they should be looking instead.

Change-Id: I091a1b0ef71b842016020f938ba3161431aab9c9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-23 09:40:41 -07:00
Christian Halstrick 5fc990130b Improved creation of JGitInternalException
There where 3 cases where a JGitInternalExcption was created
without specifying the root cause. This has been fixed.

Change-Id: I2ee08d04732371cd9e30874b1437b61217770b6a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-08-23 10:20:43 +02:00
Marc Strapetz e2e38792b5 Perform automatic CRLF to LF conversion during WorkingTreeIterator
WorkingTreeIterator now optionally performs CRLF to LF conversion for
text files.  A basic framework is left in place to support enabling
(or disabling) this feature based on gitattributes, and also to
support the more generic smudge/clean filter system.  As there is
no gitattribute support yet in JGit this is left unimplemented,
but the mightNeedCleaning(), isBinary() and filterClean() methods
will provide reasonable places to plug that into in the future.

[sp: All bugs inside of WorkingTreeIterator are my fault, I wrote
     most of it while cherry-picking this patch and building it on
     top of Marc's original work.]

CQ: 4419
Bug: 301775
Change-Id: I0ca35cfbfe3f503729cbfc1d5034ad4abcd1097e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 18:03:07 -07:00
Shawn O. Pearce 2b23aac1c0 Expose pack fetch/push connections for subclassing
These classes need to be visible if an application wants to define
its own native pack based protocol embedded within another layer,
much like we already support for smart HTTP.

Change-Id: I7e2ac3ad01d15b94d340128a395fe0b2f560ff35
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:59:36 -07:00
Shawn O. Pearce 28ba4747bc Allow ObjectReuseAsIs to have more control over write ordering
The reuse system used by an object database may be able to benefit
from knowing what objects are coming next, and even improve data
throughput by delaying (or moving up) objects that are stored near
each other in the source database.

Pushing the iteration down into the reuse code makes it possible
for a smarter implementation to aggregate reuse.  But for the
standard pack file format on disk we don't bother, its quite
efficient already.

Change-Id: I64f0048ca7071a8b44950d6c2a5dfbca3be6bba6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:59:36 -07:00
Shawn O. Pearce fe18e52195 Allow ObjectToPack subclasses to use up to 4 bits of flags
Some instances may benefit from having access to memory efficient
storage for some small values, like single flag bits.  Give up a
portion of our delta depth field to make 4 bits available to any
subclass that wants it.

This still gives us room for delta chains of 1,048,576 objects,
and that is just insane.  Unpacking 1 million objects to get to
something is longer than most users are willing to wait for data
from Git.

Change-Id: If17ea598dc0ddbde63d69a6fcec0668106569125
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce f048af3fd1 Implement async/batch lookup of object data
An ObjectReader implementation may be very slow for a single object,
but yet support bulk queries efficiently by batching multiple small
requests into a single larger request.  This easily happens when the
reader is built on top of a database that is stored on another host,
as the network round-trip time starts to dominate the operation cost.

RevWalk, ObjectWalk, UploadPack and PackWriter are the first major
users of this new bulk interface, with the goal being to support an
efficient way to pack a repository for a fetch/clone client when the
source repository is stored in a high-latency storage system.

Processing the want/have lists is now done in bulk, to remove
the high costs associated with common ancestor negotiation.

PackWriter already performs object reuse selection in bulk, but it
now can also do the object size lookup and object counting phases
with higher efficiency.  Actual object reuse, deltification, and
final output are still doing sequential lookups, making them a bit
more expensive to perform.

Change-Id: I4c966f84917482598012074c370b9831451404ee
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce 11a5bef8b1 Offer ObjectReaders advice about a RevWalk
By giving the reader information about the roots of a revision
traversal, some readers may be able to prefetch information from
their backing store using background threads in order to reduce
data access latency.  However this isn't typically necessary so
the default reader implementation doesn't react to the advice.

Change-Id: I72c6cbd05cff7d8506826015f50d9f57d5cda77e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce b85af06324 Allow object reuse selection to occur in parallel
ObjectReader implementations may wish to use multiple threads in
order to evaluate object reuse faster.  Let the reader make that
decision by passing the iteration down into the reader.

Because the work is pushed into the reader, it may need to locate a
given ObjectToPack given its ObjectId.  This can easily occur if the
reader has sent a list of ObjectIds to the object database and gets
back information keyed only by ObjectId, without the ObjectToPack
handle.  Expose lookup using the PackWriter's own internal map,
so the reader doesn't need to build a redundant copy to track the
assocation of ObjectId back to ObjectToPack.

Change-Id: I0c536405a55034881fb5db92a2d2a99534faed34
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce cc6210619b Flush the pack header as soon as its ready
When the output stream is deeply buffered (e.g. 1 MiB or more in
an HTTP servlet on some containers) trying to kick out the header
earlier will prevent the client from stalling hard while the first
1 MiB is received and it can process the pack header.  Forcing a
flush here lets the client see the header and start its progress
monitor for "Receiving objects: (1/N)" so the user knows there
is still activity occurring, even though the buffering may cause
there to be some lag as the buffer fills up on the sending side.

Change-Id: I3edf39e8f703fe87a738dc236d426b194db85e3a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce de78cf3367 Export the ObjectId on MissingObjectException
Callers catching a MissingObjectException may need programmatic
access to the ObjectId that wasn't available in the repository.

Change-Id: I2be0380251ebe7e4921fa74e246724e48ad88b0e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce 69f8fa31be Expose OBJ_ANY in ObjectReader
Storage implementations or application code using an ObjectReader
may want to access this constant without being inside of a subclass
of the reader.

Change-Id: I6c871a03d5846b9bb899de4d14a265e8b204d8e0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce 109c695936 Expose getType in ObjectToPack
Storage implementations may find this useful when implementing the
ObjectReuseAsIs interface on their ObjectReader.  Expose it so we
don't force them to create a redundant copy of the information.

Change-Id: I802ec8113c00884fccde5d0e92b9849716316f62
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce d1ebc4aa00 Add copyTo(ByteBuffer) to AnyObjectId
Change-Id: I3572f6113db883002f9c3a5ecc1bcc8370105c98
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce 8878d301ac Add copyTo(byte[], int) to AnyObjectId
This permits formatting in hex into an existing byte array
supplied by the caller, and mirrors our copyRawTo method
with the same parameter signature.

Change-Id: Ia078d83e338b09b903bfd2d04284e5283f885a19
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:41:27 -07:00
Shawn O. Pearce 540df6c9fe Add a public RevTag.parse() method
Callers might have a canonical tag encoding on hand that they
wish to convert into a clean structure for presentation purposes,
and the object may not be available in a repository.  (E.g. maybe
its a "draft" tag being written in an editor.)

Change-Id: I387a462afb70754aa7ee20891e6c0262438fdf32
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:38:53 -07:00
Shawn O. Pearce b205597b91 Add a public RevCommit.parse() method
Callers might have a canonical commit encoding on hand that they
wish to convert into a clean structure for presentation purposes,
and the object may not be available in a repository.  (E.g. maybe
its a "draft" commit being written in an editor.)

Change-Id: I21759cff337cbbb34dbdde91aec5aa4448a1ef37
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:38:53 -07:00
Shawn O. Pearce 707912b35d Make Tag class only for writing
The Tag class now only supports the creation of an annotated tag
object.  To read an annotated tag, applictions should use RevTag.
This permits us to have exactly one implementation, and RevTag's
is faster and more bug-free.

Change-Id: Ib573f7e15f36855112815269385c21dea532e2cf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:38:53 -07:00
Shawn O. Pearce b46b635c03 Make Commit class only for writing
The Commit class now only supports the creation of a commit object.
To read a commit, applictions should use RevCommit.  This permits
us to have exactly one implementation, and RevCommit's is faster
and more bug-free.

Change-Id: Ib573f7e15f36855112815269385c21dea532e2cf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:38:52 -07:00
Shawn O. Pearce cf9537c8ce Correct PersonIdent hashCode() and equals() to ignore milliseconds
Git doesn't store millisecond accuracy in person identity lines,
so a line that we create in Java and round-trip through a Git object
wouldn't compare as being equal.  Truncate to seconds when comparing
values to ensure the same identity is equal.

Change-Id: Ie4ebde64061f52c612714e89ad34de8ac2694b07
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:38:52 -07:00
Shawn O. Pearce 746ebda381 Try really hard to load a commit or tag
When we need the canonical form of a commit or a tag in order to
parse it into our RevCommit or RevTag fields, we really need it as a
single contiguous byte array.  However the ObjectDatabase may choose
to give us a large loader.  In general commits or tags are always
under the several MiB limit, so even if the loader calls it "large"
we should still be able to afford the JVM heap memory required to
get a single byte array.  Coerce even large loaders into a single
byte array anyway.

Change-Id: I04efbaa7b31c5f4b0a68fc074821930b1132cfcf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-20 17:38:52 -07:00
Shawn O. Pearce 3820b0281a Fix formatting of serialization code in ObjectId
Change-Id: I5b3e99e9e658fe272a9e171db04b0f20e48ed8d3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-19 11:53:22 -07:00
Shawn O. Pearce 1c2290c8d6 Make ObjectId.compareTo final
Since equals() is now final and does not permit being overridden,
we should do the same thing with compareTo() to prevent different
subclasses from having different ordering behaviors.  This could
lead to the same mess that we had with different equals() behaviors.

Change-Id: I35a849b6efccee5fe74cc5788a3566a1516004b7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-19 11:50:20 -07:00
Shawn O. Pearce 5adcd708e4 Make ObjectId.hashCode final too
Since equals() is now final and does not permit being overridden,
we should do the same thing with hashCode() to prevent different
subclasses from having different hashing behaviors.  This could
lead to the same mess that we had with different equals() behaviors.

Change-Id: I35a849b6efccee5fe74cc5788a3566a1516004b7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-19 11:46:22 -07:00
Shawn O. Pearce d0043e5d31 Remove unnecessary ObjectId.copy() calls
When RevObject overrode equals() to provide only reference equality
we used to need to convert a RevObject into an ObjectId by copy()
just to use standard Java tools like JUnit assertEquals(), or to
use contains() or get() on standard java.util collection types.

Now that we have removed this override and made ObjectId's equals()
final (preventing any of this mess in the future), some copy()
calls are unnecessary.  Anytime the value is being used as an input
to a lookup routine, or to an equals, we can avoid the copy().

However we still want to use copy() anytime we are given an ObjectId
that may exist long-term, where we don't want the high cost of the
additional storage from a RevCommit extension.  So we can't remove
all uses of copy(), just some of them.

Change-Id: Ief275dace435c0ddfa362ac8e5d93558bc7e9fc3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-19 11:43:39 -07:00
Mathias Kinzler b7388637d8 Fix missing Configuration Change eventing
Configuration change events were not being triggered, now they are
forwarded from the FileConfig up to the Repository's listeners.

Change-Id: Ida94a59f5a2b7fa8ae0126e33c13343275483ee5
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-19 11:36:56 -07:00
Christian Halstrick 75c9b24385 Enhance MergeResult to report conflicts, etc
The MergeResult class is enhanced to report more data about a
three-way merge. Information about conflicts and the base, ours,
theirs commits can be retrived.

Change-Id: Iaaf41a1f4002b8fe3ddfa62dc73c787f363460c2
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-19 12:16:39 -05:00
Chris Aniszczyk 94ba9574cd Allow for optional tagger and message in Tag
We should be more lenient when tagging without an
tagger or message. Currently, we will throw an NPE
which is incorrect behavior.

Change-Id: I04e30ce25a9432e4ca56c3f29658ecb24fb18d24
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-18 19:22:42 -07:00
Chris Aniszczyk 6c9d82b4ce Remove getter and setter for author in Tag
There was a duplicated getter and setter for tagger in Tag.
There's no needed to have two getters and setters that represent
the same things. The appropriate tests were updated also.

Change-Id: If46dc00c4c0f31ea4234c6d3bda3c03e6ebbafac
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-18 20:58:25 -05:00
Christian Halstrick e02b68a8b7 added resetIndex() to RepositoryTestCase
Added a utility method to set the reset an index to match exactly
some content in the filesystem. This can be used by tests to prepare
commits in the working-tree and set the index in one shot.

[sp: Cleaned up formatting, added getEntryFile(), released inserter.]

Change-Id: If38b1f7cacaaf769f51b14541c5da0c1e24568a5
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-18 16:45:56 -07:00
Robin Rosenberg a85c08e1c8 Do not trigger RefsChangedEvent on the first attempt to read a ref
Such events make no sense, it has never been visible to this
process so no client can have a stale value of the ref.

Change-Id: Iea3a5035b0a1410b80b09cf53387b22b78b18018
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-08-18 16:32:45 -07:00
Ketan Padegaonkar 376acfb6db Add FileRepository(String) convenience constructor
Add a convenience API in FileRepository to pass in a String that
points to the GIT_DIR location.  This is converted to a File and
sent through the usual constructor.

Change-Id: I588388f37e89b8c690020f110a1bc59f46170c40
2010-08-18 16:30:20 -07:00
Matthias Sohn 2d3a806271 Backout RevObject's object-identity based equals implementation
This restores the transitivity and symmetry properties of the equals
methods on the AnyObjectId type hierarchy as defined in [1]. 
Following [2] we declare these equals methods final to ensure that 
semantics of equals are consistent across AnyObjectId's type hierarchy.

[1] http://download-llnw.oracle.com/javase/6/docs/api/java/lang/Object.html#equals(java.lang.Object)
[2] http://www.angelikalanger.com/Articles/JavaSolutions/SecretsOfEquals/Equals.html

Bug: 321502
Change-Id: Ibace21fa268c4aa15da6c65d42eb705ab1aa24b3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-15 23:12:58 +02:00
Shawn Pearce 8d761febc3 Merge "Fix RevCommitList to work with subclasses of RevWalk" 2010-08-12 19:48:44 -04:00
Matthias Sohn 35b01dac4c Fix RevCommitList to work with subclasses of RevWalk
Bug: 321502
Change-Id: Ic4bc49a0da90234271aea7c0a4e344a1c3620cfc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-13 01:47:17 +02:00
Jens Baumgart cd1141cd45 Improve IndexDiff performance
Exclude ignored files from IndexDiff tree walk.
This makes EGit commit much faster.

Change-Id: I398499510c22c37667b7612db32eac3b31d325f0
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-12 11:43:02 -05:00
Chris Aniszczyk cfe88d32a3 Merge "Hide Maven target directories from Eclipse" 2010-08-11 11:17:18 -04:00
Mathias Kinzler fe76b41038 TransportHttp does not honor timeout setting
This can result in an infinitely hanging IDE.

Change-Id: I669bc8d220a07011a42edf79de31825305ff3763
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-08-10 15:58:21 +02:00
Jens Baumgart 9a6a433576 Fix NPE on commit in empty Repository
NPE occured when committing in an empty repository.

Bug: 321858
Change-Id: Ibddb056c32c14c1444785501c43b95fdf64884b1
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-08-09 11:14:38 +02:00
Robin Rosenberg db4c516f67 Hide Maven target directories from Eclipse
Change-Id: I64f12a35423a90ced9c9bc83f6869d8ed766dd35
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-08-08 13:16:53 +02:00
Shawn O. Pearce 09130b8731 Merge branch 'rename-bug'
* rename-bug:
  Fix ArrayIndexOutOfBounds on non-square exact rename matrix

Conflicts:
	org.eclipse.jgit/src/org/eclipse/jgit/diff/RenameDetector.java

Change-Id: Ie0b8dd3e1ec174f79ba39dc4706bb0694cc8be29
2010-08-06 09:48:50 -07:00
Shawn O. Pearce e2f5716c94 Fix ArrayIndexOutOfBounds on non-square exact rename matrix
If the exact rename matrix for a particular ObjectId isn't square we
crashed with an ArrayIndexOutOfBoundsException because the matrix
entries were encoded backwards.  The encode function accepts the
source (aka deleted) index first, not second.  Add a unit test to
cover this non-square case to ensure we don't have this regression
in the future.

Change-Id: I5b005e5093e1f00de2e3ec104e27ab6820203566
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-06 09:39:10 -07:00
Shawn O. Pearce 8e9cc826e9 Merge changes I39bfefee,I47795987,I70d120fb,I58cc5e01,I96bee7b9
* changes:
  Enable configuration of non-standard pack settings
  Pass PackConfig down to PackWriter when packing
  Simplify UploadPack use of options during writing
  Move PackWriter configuration to PackConfig
  Allow PackWriter callers to manage the thread pool
2010-08-05 21:11:31 -04:00
Shawn O. Pearce 97e93ca1ea Merge "Remove static progress task names from PackWriter" 2010-08-05 21:11:30 -04:00
Chris Aniszczyk b69900a415 Merge "Add "all" parameter to the commit Command" 2010-08-05 15:02:59 -04:00
Chris Aniszczyk ad4274abcc Merge "Add the parameter "update" to the Add command" 2010-08-05 15:01:53 -04:00
Stefan Lay 4b464ed458 Allow to replace existing Change-Id
It is useful to be able to replace an existing Change-Id
in the message, for example if the user decides not to
amend the previous commit.

Bug: 321188
Change-Id: I594e7f9efd0c57d794d2bd26d55ec45f4e6a47fd
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-05 12:23:38 -05:00
Shawn O. Pearce 60c5939b23 Rename getOldName,getNewName to getOldPath,getNewPath
TreeWalk calls this value "path", while "name" is the stuff after the
last slash.  FileHeader should do the same thing to be consistent.
Rename getOldName to getOldPath and getNewName to getNewPath.

Bug: 318526
Change-Id: Ib2e372ad4426402d37939b48d8f233154cc637da
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-04 11:00:07 -07:00
Shawn O. Pearce 7514a6dbdc Merge branch 'js/diff'
* js/diff:
  Fixed bug in scoring mechanism for rename detection
2010-08-04 10:59:35 -07:00
Jeff Schumacher e64cb03065 Fixed bug in scoring mechanism for rename detection
A bug in rename detection would cause file scores to be wrong. The
bug was due to the way rename detection would judge the similarity
between files. If file A has three lines containing 'foo', and file
B has 5 lines containing 'foo', the rename detection phase should
record that A and B have three lines in common (the minimum of the
number of times that line appears in both files). Instead, it would
choose the the number of times the line appeared in the destination
file, in this case file B. I fixed the bug by having the
SimilarityIndex instead choose the minimum number, as it should. I
also added a test case to verify that the bug had been fixed.

Change-Id: Ic75272a2d6e512a361f88eec91e1b8a7c2298d6b
2010-08-04 10:56:19 -07:00
Jens Baumgart 3ba1c7c068 Add gitignore support to IndexDiff and use TreeWalk
IndexDiff was re-implemented and now uses TreeWalk instead
of GitIndex. Additionally, gitignore support and retrieval of
untracked files was added.

Change-Id: Ie6a8e04833c61d44c668c906b161202b200bb509
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-04 10:03:20 -05:00
Stefan Lay ab57af08e8 Add "all" parameter to the commit Command
When the add parameter is set all modified and deleted files
are staged prior to commit.

Change-Id: Id23bc25730fcdd151386cd495a7cdc0935cbc00b
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-08-04 13:53:08 +02:00
Stefan Lay fa7d9ac5b8 Add the parameter "update" to the Add command
This change is mainly done for a subsequent commit
which will introduce the "all" parameter to the Commit
command.

Bug: 318439
Change-Id: I85a8a76097d0197ef689a289288ba82addb92fc9
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-08-04 13:36:45 +02:00
Christian Halstrick 94207f0a43 Make use of Repository.writeMerge...()
The CommitCommand should not use java.io to delete MERGE_HEAD and MERGE_MSG
files since Repository already has utility methods for that.

Change-Id: If66a419349b95510e5b5c2237a91f06c1d5ba0d4
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-07-29 15:12:14 +02:00
Christian Halstrick fba2437111 Merge "Fix tag sorting in PlotWalk" 2010-07-28 17:13:27 -04:00
Shawn O. Pearce 5f5da8b1d4 Enable configuration of non-standard pack settings
For daemons we might want to disable delta compression entirely, or
in some strange case an administrator might need to turn of delta
reuse.  Expose these normally internal pack settings through the pack
configuration section.

Change-Id: I39bfefee8384c864cc04ffac724f197240c8a11a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Shawn O. Pearce 9fbce904e6 Pass PackConfig down to PackWriter when packing
When we are creating a pack the higher level application should be able
to override the PackConfig used, allowing it to control the number of
threads used or how much memory is allocated per writer.

Change-Id: I47795987bb0d161d3642082acc2f617d7cb28d8c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Shawn O. Pearce bb99ec0aa0 Simplify UploadPack use of options during writing
We only use these variables once, so just put them at the proper
use site and avoid assigning the local variable.  The code is a
bit shorter and the intent is a little bit more clear.

Change-Id: I70d120fb149b612ac93055ea39bc053b8d90a5db
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Shawn O. Pearce 1a06179ea7 Move PackWriter configuration to PackConfig
This refactoring permits applications to configure global per-process
settings for all packing and easily pass it through to per-request
PackWriters, ensuring that the process configuration overrides the
repository specific settings.

For example this might help in a daemon environment where the server
wants to cap the resources used to serve a dynamic upload pack
request, even though the repository's own pack.* settings might be
configured to be more aggressive.  This allows fast but less bandwidth
efficient serving of clients, while still retaining good compression
through a cron managed `git gc`.

Change-Id: I58cc5e01b48924b1a99f79aa96c8150cdfc50846
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 12:13:48 -07:00
Mathias Kinzler 6e59e6dab9 Meaningful error message when trying to check-out submodules
Currently, a NullPointerException occurs in this case. We should
instead throw a more meaningful Exception with a proper message.
This is a very "stupid" implementation which simply checks for
the existence of a ".gitmodules" file.

Bug: 300731
Bug: 306765
Bug: 308452
Bug: 314853
Change-Id: I155aa340a85cbc5d7d60da31dba199fc30689b67
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-07-28 11:59:07 -07:00
Christian Halstrick 08c0c5d938 Fix unit tests under windows
the following tests fail under windows because certain inputstreams
are not closed and files cannot be deleted because of that.  The
main problem I found is UnpackedObject.InflaterInputStream.close().
This method may throw exceptions found by checkValidEndOfStream()
but doesn't call super.close() before leaving. It is not clear to me
which resources a close() method should release before it throws an
exception. But those reseources which are not published to the
outside and which therefore cannot be closed by other means have to
be closed in all cases.
I changed the close() method to call super.close() under all
circumstances.

failing tests:
  testStandardFormat_LargeObject_TruncatedZLibStream(org.eclipse.jgit.storage.file.UnpackedObjectTest)
  testStandardFormat_LargeObject_TrailingGarbage(org.eclipse.jgit.storage.file.UnpackedObjectTest)
  testPackFormat_SmallObject(org.eclipse.jgit.storage.file.UnpackedObjectTest)

Change-Id: Id2e609a29e725aad953ff9bd88af6381df38399d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-07-28 11:55:11 -07:00
Shawn O. Pearce d0f8d1e819 Fix tag sorting in PlotWalk
By deferring tag sorting until the commit is produced by the walker
we can avoid an infinite loop that was triggered by trying to sort
tags while allocating a commit.  This also avoids needing to look
at commits which aren't going to be produced in the result.

Bug: 321103
Change-Id: I25acc739db2ec0221a50b72c2d2aa618a9a75f37
Reviewed-by: Mathias Kinzler <mathias.kinzler@sap.com>
Reviewed-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 11:51:17 -07:00
Shawn O. Pearce 21f76c2a69 Remove static progress task names from PackWriter
These need to be dynamic based on the current thread's environment
at time of execution in order to be properly localized for the end
user that will be seeing these messages.

Change-Id: I4976f462cfe606edd2761c0e36b2f6b20f63d53c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 10:50:28 -07:00
Shawn O. Pearce 1b783d0370 Allow PackWriter callers to manage the thread pool
By permitting the caller of PackWriter to select the Executor it
uses for task execution, we give the caller the ability to manage
the lifecycle of the thread pool, including reusing it across
concurrent pack generators.

This is the first step to supporting application thread pools
within Daemon or another managed service like Gerrit Code Review.

Change-Id: I96bee7b9c30ff9885f2bd261d0b6daaac713b5a4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 10:50:28 -07:00
Christian Halstrick 74d279fbf0 Teach NameConflictTreeWalk to report DF conflicts
Add a method isDirectoryFileConflict() to NameConflictTreeWalk which
tells whether the current path is part of a directory/file conflict.

Change-Id: Iffcc7090aaec743dd6f3fd1a333cac96c587ae5d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-28 17:26:31 +02:00
Mathias Kinzler 51c6f513b0 Stack Overflow in EGit History View
This is caused by a recursion in PlotWalk.getTags().
As a hotfix, the sort was simply removed. The sort
must be re-implemented so that parseAny() is not called
again (currently, this happens in the PlotRefComparator).

Change-Id: I060d26fda8a75ac803acaf89cfb7d3b4317328f3
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-07-28 11:46:05 +02:00
Jeff Schumacher 396fe6da45 Break dissimilar file pairs during diff
File pairs that are very dissimilar during a diff were not being
broken apart into their constituent ADD/DELETE pairs. The leads to
sub-optimal rename detection. Take, for example, this situation:

A file exists at src/a.txt containing "foo". A user renames src/a.txt
to src/b.txt, then adds a new src/a.txt containing "bar".

Even though the old a.txt and the new b.txt are identical, the
rename detection algorithm would not detect it as a rename since
it was already paired in a MODIFY. I added code to split all
MODIFYs below a certain score into their constituent ADD/DELETE
pairs. This allows situations like the one I described above to be
more correctly handled.

Change-Id: I22c04b70581f206bbc68c4cd1ee87a1f663b418e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 18:13:32 -07:00
Christian Halstrick f56a459966 Add methods which write MERGE_HEAD and MERGE_MSG
Add methods to the Repository class which write into MERGE_HEAD
and MERGE_MSG files. Since we have the read methods in the same
class this seems to be the right place.

Change-Id: I5dd65306ceb06e008fcc71b37ca3a649632ba462
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 11:48:23 -07:00
Jens Baumgart db82b8d7eb Fix concurrent read / write issue in LockFile on Windows
LockFile.commit fails if another thread concurrently reads
the base file. The problem is fixed by retrying the rename
operation if it fails.

Change-Id: I6bb76ea7f2e6e90e3ddc45f9dd4d69bd1b6fa1eb
Bug: 308506
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-07-27 10:00:47 -07:00
Robin Stocker a00377a7e2 Fix Javadoc warnings
There were some broken links, incorrect uses of @value, an invalid
tag and an outdated comment.

Change-Id: I22886bcc869a4b62bd606ebed40669f7b4723664
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 09:40:01 -07:00
Shawn O. Pearce 80fe789690 Make forPath(ObjectReader) variant in TreeWalk
This simplifies the logic for those who already have an ObjectReader
on hand want to reuse it to lookup a single path.

Change-Id: Ief17d6b2a0674ddb34bbc9f43121b756eae960fb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-27 08:36:24 -07:00
Shawn O. Pearce 7ff18f3ec9 Make StoredConfig an abstraction above FileBasedConfig
This exposes a load and save method, allowing a Repository to denote
that it has a persistent configuration of some kind which can be
accessed by the application, without needing to know exact details
of how its stored .

Change-Id: I7c414bc0f975b80f083084ea875eca25c75a07b2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-26 16:50:11 -07:00
Shawn O. Pearce fa9b225e06 Merge branch 'delta'
* delta: (103 commits)
  Discard the uncompressed delta as soon as its compressed
  Honor pack.windowlimit to cap memory usage during packing
  Honor pack.threads and perform delta search in parallel
  Cache small deltas during packing
  Implement delta generation during packing
  debug-show-packdelta:  Dump a pack delta to the console
  Initial pack format delta generator
  Add debugging toString() method to ObjectToPack
  Make ObjectToPack clearReuseAsIs signal available to subclasses
  Correctly classify the compressing objects phase
  Refactor ObjectToPack's delta depth setting
  Configure core.bigFileThreshold into PackWriter
  Add doNotDelta flag to ObjectToPack
  Add more configuration options to PackWriter
  Save object path hash codes during packing
  Add path hash code to ObjectWalk
  Add getObjectSize to ObjectReader
  Allow TemporaryBuffer.Heap to allocate smaller than 8 KiB
  Define a constant for 127 in DeltaEncoder
  Cap delta copy instructions at 64k
  ...

Conflicts:
	org.eclipse.jgit.pgm/src/org/eclipse/jgit/pgm/Diff.java
	org.eclipse.jgit/resources/org/eclipse/jgit/JGitText.properties
	org.eclipse.jgit/src/org/eclipse/jgit/JGitText.java
	org.eclipse.jgit/src/org/eclipse/jgit/revwalk/RewriteTreeFilter.java

Change-Id: I7c7a05e443a48d32c836173a409ee7d340c70796
2010-07-22 14:56:34 -07:00
Stefan Lay ab062caa22 Allow client of Add command to set a WorkingTreeIterator
This is e.g. useful when a client of the AddCommand has
additional rules to ignore files. In Eclipse a resource can
be set to derived or be excluded by preferences.

Change-Id: I6c47e54a1ce26315faf5ed0723298ad2c2db197c
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-22 14:57:00 +02:00
Stefan Lay 88957f6c5a Allow for filepattern "." in AddCommand
Enable adding on repository root level.

Change-Id: I415b10dc74cc9435578424d9f106c972fd703055
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-22 14:27:35 +02:00
Stefan Lay aa86cfc339 Do not add ignored files in Add command
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-22 11:26:04 +02:00
Shawn O. Pearce 09910ffa32 Move ignore node handling into WorkingTreeIterator
The working tree iterator has perfect knowledge of the path structure
as well as immediate information about whether or not an ignore file
even exists at this level.  We can exploit that to simplify the
logic and running time for testing ignored file status by pushing
all of the checks down into the iterator itself.

Change-Id: I22ff534853e8c5672cc5c2d9444aeb14e294070e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Charley Wang <chwang@redhat.com>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Stefan Lay <stefan.lay@sap.com>
CC: Matthias Sohn <matthias.sohn@sap.com>
2010-07-21 10:34:08 -07:00
Shawn Pearce 0ec0e21fdf Merge "Fix concurrent read / write issue in GitIndex on Windows" 2010-07-21 13:08:01 -04:00
Jens Baumgart e99c48a61a Fix concurrent read / write issue in GitIndex on Windows
GitIndex.write fails if another thread concurrently reads
the index file. The problem is fixed by retrying the rename
operation if it fails.

Bug: 311051
Change-Id: Ib243d2a90adae312712d02521de4834d06804944
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2010-07-21 09:35:15 +02:00
Christian Halstrick 5c94321b47 Check for racy git in WorkingTreeIterator
The WorkingTreeIterator has a method to check whether
the current file differs from the corresponding index
entry. This commit improves this check to also handle
racy git situations.

See http://git.kernel.org/?p=git/git.git;a=blob;f=Documentation/technical/racy-git.txt;hb=HEAD

Change-Id: I3ad0897211dcbb2eac9eebcb19d095a5052fb06b
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 21:55:18 +02:00
Christian Halstrick c98d97731b Smudge racily clean index entries by truncating length (like git.git)
To mark an entry racily clean we set its length to 0 (like native git
does). Entries which are not racily clean and have zero length can be
distinguished from racily clean entries by checking P_OBJECTID
against the SHA1 of empty content. When length is 0 and P_OBJECTID is
different from SHA1 of empty content we know the entry is marked
racily clean.

See http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg00488.html

Change-Id: I689552931441ab51964b430b303160c9126b66af
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 21:54:36 +02:00
Shawn O. Pearce 938943d674 Use proper constants for .gitignore and .git directory
We have a constant for .gitignore, so use it.  While we are in
the same method, correct the reference of ".git" to be the actual
GIT_DIR given.  This might not be within the work tree if the
GIT_DIR and GIT_WORK_TREE environment variables were used.

Change-Id: I38e1cec13405109b9c347858b38dd9fb2f1f2560
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Charley Wang <chwang@redhat.com>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Stefan Lay <stefan.lay@sap.com>
CC: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 09:11:39 -07:00
Shawn O. Pearce c59db09bc5 Remove gitIgnoreTimestamp from abstract iterator API
This never should have been exposed on the top of the
AbstractTreeIterator type hierarchy.  There is no concept of a
timestamp in a canonical tree read from the object database, and
the time in the DirCache isn't what we want here either.

Actually all that we need is to find the files whose names are
".gitignore" and are below the root directory.  We can accomplish
that with a suffix filter, and process them immediately.

Change-Id: Ib09cbf81a9e038452ce491385c65498312e2916b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Charley Wang <chwang@redhat.com>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Stefan Lay <stefan.lay@sap.com>
CC: Matthias Sohn <matthias.sohn@sap.com>
2010-07-20 09:09:01 -07:00
Shawn O. Pearce 395d236058 Fix NPE in RenameDetector
If we have two adds of the same object but no deletes the detector
threw an NPE because the entry that came back from the deleted map
was null (no matching objects).  In this case we need to put the
adds all back onto the list of left over additions since they did
not match a delete.

Change-Id: Ie68fbe7426b4dc0cb571a08911c7adbffff755d5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Jeffrey Schumacher" <jeffschu@google.com>
2010-07-20 07:52:35 -07:00
Shawn O. Pearce b518189b5c IndexPack: Fix spurious pack file corruption errors
We didn't correctly handle the zlib trailer for an object.  If the
trailer bytes were outside of the current buffer window but we had
fully inflated the object itself, we broke out of the loop (as we had
our target size) but inflate wasn't finished (as it did not yet get
the trailer) so we failed the test and threw a corruption exception.

Use an infinite loop and only break out when the inflater is done.

Change-Id: I7c9bbbeb577a990d9bc56a50ebd485935460f6c8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-20 07:40:48 -07:00
Shawn O. Pearce 12fe0f2d1e Discard the uncompressed delta as soon as its compressed
The DeltaCache will most likely need to copy the compressed delta
into a new buffer in order to compact away the wasted space at the
end caused by over allocation.  Since we don't need the uncompressed
format anymore, null out our only reference to it so the GC can
reclaim this memory if it needs to perform a collection in order
to satisfy the cache's allocation attempt.

Change-Id: I50403cfd2e3001b093f93a503cccf7adab43cc9d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-16 10:41:09 -07:00
Shawn O. Pearce 6e155d5f41 Merge branch 'js/rename'
* js/rename:
  Implemented file path based tie breaking to exact rename detection
  Added more test cases for RenameDetector
  Added very small optimization to exact rename detection
  Fixed Misleading Javadoc
  Added file path similarity to scoring metric in rename detection
  Fixed potential div by zero bug
  Added file size based rename detection optimization
  Create FileHeader from DiffEntry
  log: Implement --follow
  Cache the diff configuration section
  log: Add whitespace ignore options
  Format submodule links during differences
  Redo DiffFormatter API to be easier to use
  log, diff: Add rename detection support
  Implement similarity based rename detection
  Added a preliminary version of rename detection
  Refactored code out of FileHeader to facilitate rename detection
2010-07-16 10:22:15 -07:00
Shawn O. Pearce 0b46e70155 Fix infinite loop in IndexPack
A programming error using the Inflater API led to an infinite
loop within IndexPack, caused by the Inflater returning 0 from
the inflate() method, but it didn't want more input.  This happens
when it has reached the end of the stream, or has reached a spot
asking for an external dictionary.  Such a case is a failure for us,
and we should abort out.

Thanks to Alex for pointing out that we had 3 implementations of
the inflate rountine, which should be consolidated into one and
use a switch to determine where to load data from.

Bug: 317416
Change-Id: I34120482375b687ea36ed9154002d77047e94b1f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-07-16 10:12:04 -07:00
Jeff Schumacher 31311cacfd Implemented file path based tie breaking to exact rename detection
During the exact rename detection phase in RenameDetector, ties were
resolved on a first-found basis. I added support for file path based
tie breaking during that phase. Basically, there are four situations
that have to be handled:

One add matching one delete:
	In this simple case, we pair them as a rename.

One add matching many deletes:
	Find the delete whos path matches the add the closest, and
	pair them as a rename.

Many adds matching one delete:
	Similar to the above case, we find the add that matches the
	delete the closest, and pair them as a rename. The other adds
	are marked as copies of the delete.

Many adds matching many deletes:
	Build a scoring matrix similar to the one used for content-
	based matching, scoring instead by file path. Some of the
	utility functions in SimilarityRenameDetector are used in
	this case, as we use the same encoding scheme. Once the
	matrix is built, scan it for the best matches, marking them
	as renames. The rest are marked as copies.

I don't particularly like the idea of using utility functions right
out of SimilarityRenameDetector, but it works for the moment. A later
commit will likely refactor this into a common utility class, as well
as bringing exact rename detection out of RenameDetector and into a
separate class, much like SimilarityRenameDetector.

Change-Id: I1fb08390aebdcbf20d049aecf402a36506e55611
2010-07-16 09:56:42 -07:00
Christian Halstrick b840ed0121 Added dirty-detection to WorkingTreeIterator
Added possibility to compare the current entry of a WorkingTreeIterator
to a given DirCacheEntry. This is done to detect whether an entry
in the index is dirty or not. 'Dirty' means that the file in the working tree
is different from what's in the index. Merge algorithms will make use of
this to detect conflicts.

Change-Id: I3ff847f4bf392553dcbd6ee236c6ca32a13eedeb
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-07-16 10:08:52 +02:00
Shawn Pearce 19473b1dbc Merge "Handle the tilde notation (~user) of git url" 2010-07-15 17:29:21 -04:00
Robin Rosenberg 845714158a Handle the tilde notation (~user) of git url
When the path is prefixed with ~ the URI parser thought about this
as /~. Strip the / if the next character is the tilde.

Bug: 307017
Change-Id: I58203e5617956b46d83e8987d1f8042beddffac3
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-07-15 01:16:09 +02:00
Stefan Lay 233e0130b5 Git Porcelain API: Add Command
The new Add command adds files to the Git Index. 
It  uses the DirCache to access the git index. It 
works also in case of an existing conflict. 

Fileglobs (e.g. *.c) are not yet supported. 

The new Add command does add ignored files because
there is no gitignore support in jgit yet.

Bug: 318440
Change-Id: If16fdd4443e46b27361c2a18ed8f51668af5d9ff
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-07-14 11:24:58 +00:00