Commit Graph

633 Commits

Author SHA1 Message Date
Marc Strapetz 6a05904e53 Extend DirCache test case to check "intent to add" flag. 2010-08-31 21:57:10 +02:00
Marc Strapetz 253b36d27a Partial support for index file format "3".
Extended flags are processed and available via DirCacheEntry's
new isSkipWorkTree() and isIntentToAdd() methods.  "resolve-undo"
information is completely ignored since its an optional extension.

Change-Id: Ie6e9c6784c9f265ca3c013c6dc0e6bd29d3b7233
2010-08-31 12:08:09 -07:00
Shawn Pearce b3aa5802b9 Merge "DirCacheEntry: UPDATE_NEEDED should be in-core flag." 2010-08-31 14:29:03 -04:00
Shawn O. Pearce 7d7365548a Add test for RawParseUtils.formatBase10
Change-Id: I3ad3533d03990c9e84186e53b9d755784b2a3758
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-31 11:26:26 -07:00
Marc Strapetz 80f4947e8b Fix RawParseUtils.formatBase10 to work with negative values
Change-Id: Iffa220de76c5e180796fa46c4d67f52a1b3b2e35
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-31 11:20:54 -07:00
Chris Aniszczyk b7465b8fe5 Remove deprecated PersonIdent constructor
Change-Id: I3831de1b6df25a52df30d367f0216573e6ee6b53
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-31 11:55:27 -05:00
Christian Halstrick 0c017188b4 Improve MergeAlgorithm to produce smaller conflicts
The merge algorithm was reporting conflicts which where to big.

Example: The common base was "ABC", the "ours" version contained
"AB1C" (the addition of "1" after pos 2) and the "theirs" version also
contained "AB1C". We have two potentially conflicting edits in the
same region which happen to bring in exactly the same content. This
should not be a conflict - but was previously reported as
"AB<<<1===1>>>C".

This is fixed by checking every conflicting chunk whether the
conflicting regions have a common prefix or suffix and by removing
this regions from the conflict.

Change-Id: I4dc169b8ef7a66ec6b307e9a956feef906c9e15e
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-08-31 17:14:07 +02:00
Marc Strapetz 2eb5426aa9 DirCacheEntry: UPDATE_NEEDED should be in-core flag.
In correspondance to CGit, UPDATE_NEEDED flag should not be
written to disk. Furthermore, it currently intersects CGit's
CE_EXTENDED flag.
2010-08-31 11:25:16 +02:00
Matthias Sohn 51f6fbda1f Fix build of JGit source bundle and feature
Add local changes I missed to push with [1] which broke the JGit
build.

[1] http://egit.eclipse.org/r/#change,1442

Change-Id: I300bfa84c5d8b5128026869b694ef5da7b0d3a4a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-31 02:50:03 +02:00
Christian Halstrick 47f4171315 Let Resolve be the default Merge strategy
the merge command should use by default the "resolve" merge strategy.

Change-Id: I6c6973a3397cca12bd8a6bd950d04b1766a08b4c
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-08-31 01:29:49 +02:00
Christian Halstrick 45e79a526c Added merge strategy RESOLVE
This adds the first merge strategy to JGit which does real
content-merges if necessary. The new merge strategy "resolve" takes as
input three commits: a common base, ours and theirs. It will simply takeover
changes on files which are only touched in ours or theirs. For files
touched in ours and theirs it will try to merge the two contents
knowing taking into account the specified common base.

Rename detection has not been introduced for now.

Change-Id: I49a5ebcdcf4f540f606092c0f1dc66c965dc66ba
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-08-31 01:21:54 +02:00
Chris Aniszczyk 77f79659f5 Merge "Add one more test to ReadTreeTest" 2010-08-30 16:59:15 -04:00
Matthias Sohn fb1c7b136f Wait for JIT optimization before measuring diff performance
On Mac OS X MyerDiffPerformanceTest was failing since during the
first few tests the JIT compiler is running in parallel slowing down
the tests. When setting the JVM option -Xbatch forcing the JIT to do
its work prior to running the code this effect can be avoided. Instead
we chose to run some tests without recording prior to the recorded
tests since relying on -X JVM parameters isn't portable across JVMs.

Use 10k * powers of 2 as sample size instead of odd numbers used
before and also improve formatting of performance readings.

Bug: 323766
Change-Id: I9a46d73f81a785f399d3cf5a90c8c0516526e048
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-30 15:56:52 -05:00
Matthias Sohn e9c5ec5545 Generate source plugin and source feature for jgit
Change-Id: Ibc5a302078bfc01d9ee45a4c0ab0b79b2abd185a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-30 15:20:27 -05:00
Shawn O. Pearce e6bd689d2c Improve LargeObjectException reporting
Use 3 different types of LargeObjectException for the 3 major ways
that we can fail to load an object.  For each of these use a unique
string translation which describes the root cause better than just
the ObjectId.name() does.

Change-Id: I810c98d5691b74af9fc6cbd46fc9879e35a7bdca
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-30 11:53:25 -07:00
Shawn O. Pearce a3945d1bc8 IndexPack: Use byte limited form of getCachedBytes
Currently our algorithm requires that we have the delta base as
a contiguous byte array... but getCachedBytes() might not work
if the object is considered to be large by its underlying loader.
Use the limited form to obtain the object as a byte array instead.

Change-Id: I33f12a8811cb6a4a67396174733f209db8119b42
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-30 11:40:01 -07:00
Shawn O. Pearce 1709800f27 Undo translation of protocol string 'unpack error'
This string is part of the network protocol, and isn't meant to
be translated into another language.  Clients actually scan for
the string "unpack error " off the wire and react magically to
this information.  If it were translated, they would instead have
a protocol exception, which isn't very useful when there is already
an error occurring.

Change-Id: Ia5dc8d36ba65ad2552f683bb637e80b77a7d92f0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-30 10:58:25 -07:00
Christian Halstrick 0bdf73db7f Add one more test to ReadTreeTest
Add an explicit test case to check that we don't
overwrite dirty files in case Head & Index are
equal.

Change-Id: I6266d0a449e55369d2d0a048694dca5565c5fcf3
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-08-30 09:34:17 +02:00
Shawn O. Pearce bc0359c42f Merge "Buffer very large delta streams to reduce explosion of CPU work" 2010-08-28 20:31:24 -04:00
Robin Rosenberg 236899204a Revert "Hide Maven target directories from Eclipse"
This reverts commit db4c516f67 since
it breaks compatibility with Eclipse 3.5 which can no longer import
the projects

Bug: 323390
Change-Id: I3cc91364a6747cfcb4c611a9be5258f81562f726
2010-08-28 09:50:50 +02:00
Shawn O. Pearce b24f907e3e Buffer very large delta streams to reduce explosion of CPU work
Large delta streams are unpacked incrementally, but because a delta
can seek to a random position in the base to perform a copy we may
need to inflate the base repeatedly just to complete one delta.
So work around it by copying the base to a temporary file, and then
we can read from that temporary file using random seeks instead.
Its far more efficient because we now only need to inflate the
base once.

This is still really ugly because we have to dump to a temporary
file, but at least the code can successfully process a large
file without throwing OutOfMemoryError.  If speed is an
issue, the user will need to increase the JVM heap and ensure
core.streamFileThreshold is set to a higher value, so we don't use
this code path as often.

Unfortunately we lose the "optimization" of skipping over portions
of a delta base that we don't actually need in the final result.
This is going to cause us to inflate and write to disk useless
regions that were deleted and do not appear in the final result.
We could later improve on our code by trying to flatten delta
instruction streams before we touch the bottom base object, and
then only store the portions of the base we really need for the
final result and that appear out-of-order.  Since that is some
pretty complex code I'm punting on it for now and just doing this
simple whole-object buffering.

Because the process umask might be permitting other users to read
files we create, we put the temporary buffers into $GIT_DIR/objects.
We can reasonably assume that if a reader can read our temporary
buffer file in that directory, they can also read the base pack
file we are pulling it from and therefore its not a security breach
to expose the inflated content in a file.  This requires a reader
to have write access to the repository, but only if the file is
really big.  I'd rather err on the side of caution here and refuse
to read a very big file into /tmp than to possibly expose a secured
content because the Java 5 JVM won't let us create a protected
temporary file that only the current user can access.

Change-Id: I66fb80b08cbcaf0f65f2db0462c546a495a160dd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-27 13:28:35 -07:00
Chris Aniszczyk f54e883566 Add TagCommand
A tag command is added to the Git porcelain API. Tests were
also added to stress test the tag command.

Change-Id: Iab282a918eb51b0e9c55f628a3396ff01c9eb9eb
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-27 21:11:31 +02:00
Christian Halstrick 2059ed205e Implement a Dircache checkout (needed for merge)
Implementation of a checkout (or 'git read-tree') operation which
works together with DirCache. This implementation does similar things
as WorkDirCheckout which main problem is that it works with deprecated
GitIndex. Since GitIndex doesn't support multiple stages of a file
which is required in merge situations this new implementation is
required to enable merge support.

Change-Id: I13f0f23ad60d98e5168118a7e7e7308e066ecf9c
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-27 16:06:49 +02:00
Christian Halstrick 0e7a38b60f Add getBaseCommit() to Merger
The Merger was was only exposing the merge base as an
AbstractTreeIterator. Since we need the merge base as
RevCommit to generate the merge result I expose it here.

Change-Id: Ibe846370a35ac9bdb0c97ce2e36b2287577fbcad
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-26 18:52:26 -07:00
Matthias Sohn c869f187b7 Increase heap size for jgit tests
Otherwise PackFileTest.testDelta_LargeObjectChain() reproducibly
fails with OutOfMemoryError on Mac OS X 10.6.4.

Change-Id: I6a55ff9ba181102606a0d99ffd52392a1615a422
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-27 00:29:31 +02:00
Matthias Sohn 5817744cf7 Remove unused import
Change-Id: I22f5751720576475e5e1e04110268f6f7fb376b1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-08-26 23:53:41 +02:00
Shawn Pearce 7d9bfa390f Merge "Fix parsing of multiple authors in PersonIdent." 2010-08-26 15:00:58 -04:00
Shawn Pearce 1edbefc5fa Merge "Use JUnit4 for tests" 2010-08-26 14:50:05 -04:00
Chris Aniszczyk d1edd00f56 Run formatter on edited lines via save action
Updates the project level settings to run the formatter
on save on only on the edited lines.

Change-Id: I26dd69d0c95e6d73f9fdf7031f3c1dbf3becbb79
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-26 12:33:09 -05:00
Chris Aniszczyk a005986ce7 Use JUnit4 for tests
We should use JUnit4 for tests. This patch updates
the MANIFEST.MF and respective launch configurations.

Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-26 12:26:38 -05:00
Marc Strapetz 80c622c49c Fix parsing of multiple authors in PersonIdent.
PersonIdent should be parsable for an invalid commit which
contains multiple authors, like "A <a@a.org>, B <b@b.org>".
PersonIdent(String) constructor now delegates to
RawParseUtils.parsePersonIdent().

Change-Id: Ie9798d36d9ecfcc0094ca795f5a44b003136eaf7
2010-08-26 12:58:03 +02:00
Shawn O. Pearce 6517a7c923 Increase temporary buffer for unit test
Because we are using the large stream size, we have to be
above the STREAM_THRESHOLD constant, which I just increased.

Change-Id: I6f10ec8558d9f751d4b547fcae05af94f1c8866b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:53:00 -07:00
Shawn O. Pearce cb0c05b5b4 Increase the default streaming threshold to 15 MiB
Applying deltas in the large streaming mode is horrifically slow.
Trying to pack icu4c is impossible because a single 11 MiB file
sits on top of a 15 MiB file though a 10 deep delta chain, which
results in this very slow inflate process.

Upping the default limit to 15 MiB lets us process this large in a
reasonable time, but its still sufficiently low enough to prevent
exploding the heap of a very large process like Eclipse or Gerrit
Code Review.

We have to revisit the streaming delta application process and do
something much smarter, like flatten the delta chain before we apply
it to the base.  But even that is ugly, I've seen a 155 MiB delta
sitting on top of a 450 MiB file to produce a 300 MiB result object.
If the chain is deep, we may have trouble flatting it down.

Change-Id: If5a0dcbf9d14ea683d75546f104b09bb8cd8fdbb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 7a9edb3662 Fix reuse from pack file for REF_DELTA types
We miscomputed the CRC32 checksum for a REF_DELTA type of object, by
not including the full 20 byte ObjectId of the delta base in the CRC
code we use when the delta is too large to go through our two faster
small reuse code paths.  This resulted in a corruption error during
packing, where the PackFile erroneously suspected the data was wrong
on the local filesystem and aborted writing, because the CRC didn't
match what we had read from the index.

Change-Id: I7d12cdaeaf2c83ddc11223ce0108d9bd6886e025
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 3a972f8664 Cleanup and correct resolve Javadoc
We didn't fully cover what we support and what we don't.  It was
also a bit hard to follow the syntaxes supported.  Clean that up
by documenting it.

Change-Id: I7b96fa6cbefcc2364a51f336712ad361ae42df2d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce dbd2d7c83b Support parsing commit:path style blob references
We can now resolve expressions that reference a path within a
commit, designating a specific revision of a specific tree or
file in the project.

Change-Id: Ie6a8be629d264d72209db894bd680c5900035cc0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 8da17c5046 Support parsing git describe style output
We now match on the -gABBREV style output created by git describe
when its describing a non-tagged commit, and resolve that back to
the full ObjectId using the abbreviation resolution feature that
we already support.

Change-Id: Ib3033f9483d9e1c66c8bb721ff48d4485bcdaef1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce c59e3a493b Rename T0008_testparserev to RepositoryResolveTest
Calling it by the old numerical numbering system makes it really
hard to find the test that tests Repository.resolve(String).

Change-Id: I92d0ecbc8d66ce21bfed08888eeedf1300ffa594
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:13 -07:00
Shawn O. Pearce 401d3b2cc1 Throw AmbiguousObjectException during resolve if its ambiguous
Its wrong to return null if we are resolving an abbreviation and we
have proven it matches more than one object.  We know how to resolve
it if we had more nybbles, as there are two or more objects with the
same prefix.  Declare that to the caller quite clearly by giving them
an AmbiguousObjectException.

Change-Id: I01bb48e587e6d001b93da8575c2c81af3eda5a32
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:12 -07:00
Shawn O. Pearce c44495fa2f Complete an abbreviation when formatting a patch
If we are given a DiffEntry header that already has abbreviated
ObjectIds on it, we may still be able to resolve those locally and
output the difference.  Try to do that through the new resolve API
on ObjectReader.

Change-Id: I0766aa5444b7b8fff73620290f8c9f54adc0be96
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:12 -07:00
Shawn O. Pearce 127a5f95e1 Use limited getCachedBytes in RevWalk
Parsing is rewritten to use the size limited form of getCachedBytes,
thus freeing the revwalk infrastructure from needing to care about
a large object vs. a small object when it gets an ObjectLoader.

Right now we hardcode our upper bound for a commit or annotated
tag to be 15 MiB.  I don't know of any that is more than 1 MiB in
the wild, so going 15x that should give us some reasonable headroom.

Change-Id: If296c211d8b257d76e44908504e71dd9ba70ffa8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-25 17:07:12 -07:00
Shawn O. Pearce c11711f98e Use limited getCachedBytes code to reduce duplication
Rather than duplicating this block everywhere, reuse the limited size
form of getCachedBytes to acquire the content of an object.

Change-Id: I2e26a823e6fd0964d8f8dbfaa0fc2e8834c179c1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-25 19:05:53 -05:00
Shawn O. Pearce 2292655e9e Add brute force byte array loading to ObjectLoader
Some algorithms are coded in a way that requires us to provide them
the entire object contents as a contiguous byte array.  The parsers
in RevCommit and RevTag, or our RawText objects are really good
examples of these.

Instead of duplicating this logic everywhere, lets put it into the
base ObjectLoader type.  That way the caller only needs to give us
their upper size bound, and we'll do the rest of the heavy work to
figure out if the object still fits within that bound, and get them
an array that has the complete contents.

Change-Id: Id95a7f79d2b97e39f6949370ccca2f2c9cfb1a0f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-25 19:03:47 -05:00
Chris Aniszczyk e69dcc703d Merge "Add ObjectId to the LargeObjectException" 2010-08-25 19:54:32 -04:00
Shawn O. Pearce 1f4b48a37c Add ObjectId to the LargeObjectException
A chunk of code that throws LargeObjectException may or may not have
the specific ObjectId on hand when its thrown.  If it does, we want
to cache it in the exception, and put that in the message.  If it is
missing we want to be able to set it later from a higher level stack
frame that does have the object handy.

Change-Id: Ife25546158868bdfa886037e4493ef8235ebe4b9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-08-25 18:54:07 -05:00
Chris Aniszczyk f74d474f3c Merge "Don't copy more than the object size" 2010-08-25 19:52:36 -04:00
Chris Aniszczyk 595a20a064 Merge "Use the ObjectStream size during copyTo" 2010-08-25 19:50:43 -04:00
Benjamin Muskalla 700b8b4514 Fixed typo in DirCache documentation
Change-Id: Ifc2e9047a45d57829fce59c66618e5de9120a5bb
Signed-off-by: Benjamin Muskalla <bmuskalla@eclipsesource.com>
2010-08-25 15:52:18 +02:00
Shawn O. Pearce 7cfe2f12ff Don't copy more than the object size
If the loader's stream is broken and returns to us more content
than it originally declared as the size of the object, don't
copy that onto the output stream.  Instead throw EOFException
and abort fast.  This way we don't follow an infinite stream,
but instead will at least stop when the size was reached.

Change-Id: I7ec0c470c875f03b1f12a74a9b4d2f6e73b659bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-24 17:37:07 -07:00
Shawn O. Pearce b474de1da3 Use the ObjectStream size during copyTo
If the stream is a delta decompression stream, getting the size
can be expensive.  Its cheaper to get it from the stream itself
rather than from the object loader.

Change-Id: Ia7f0af98681f6d56ea419a48c6fa8eea09274b28
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-24 17:37:07 -07:00