Commit Graph

1107 Commits

Author SHA1 Message Date
Matthias Sohn f2c8eec57b Qualify post 0.11 builds
Change-Id: Ibcef4fc4c986c2cda01e943d16aa1c53eff99f25
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-02-12 03:30:05 +01:00
Matthias Sohn 857d151198 JGit 0.11.1
Change-Id: I9ac2fdfb4326536502964ba614d37d0bd103f524
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-02-11 23:25:34 +01:00
Chris Aniszczyk b46e06bc74 Merge "Fix NPE on reading global config on MAC" into stable-0.11 2011-02-09 12:27:36 -05:00
Jens Baumgart b82e4bf771 Fix NPE on reading global config on MAC
Bug: 336610

Change-Id: Iefcb85e791723801faa315b3ee45fb19e3ca52fb
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2011-02-09 15:12:31 +01:00
Jens Baumgart c9e4a78555 Add isOutdated method to DirCache
isOutdated returns true iff the memory state differs from the index
file.

Change-Id: If35db06743f5f588ab19d360fd2a18a07c918edb
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2011-02-09 15:02:22 +01:00
Mathias Kinzler 724af77c65 PullCommand: use default remote instead of throwing Exception
When pulling into a local branch that has no upstream configuration,
pull should try to use the default remote ("origin") instead of
throwing an Exception.

Bug: 336504
Change-Id: Ife75858e89ea79c0d6d88ba73877fe8400448e34
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-02-08 08:56:19 +01:00
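The fallback described in 724af77c65 can be sketched in plain Java. This is a minimal illustration under stated assumptions, not PullCommand's actual code: the class `RemoteResolver`, the method `resolveRemote`, and the flat `Map` standing in for the repository configuration are all hypothetical. Only the `branch.<name>.remote` key pattern and the "origin" default follow Git conventions.

```java
import java.util.Map;

final class RemoteResolver {
    // Default remote name named in the commit message.
    static final String DEFAULT_REMOTE = "origin";

    // Returns the configured upstream remote for the branch, or the
    // default remote when no upstream configuration exists.
    static String resolveRemote(Map<String, String> config, String branch) {
        String remote = config.get("branch." + branch + ".remote");
        return (remote == null || remote.isEmpty()) ? DEFAULT_REMOTE : remote;
    }
}
```

With an empty configuration the lookup falls back to "origin" rather than failing, which is the behavior change the commit describes.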
Shawn O. Pearce 0180946bc8 Remove quoting of command over SSH
If the command contains spaces, it needs to be evaluated by the remote
shell.  Quoting the command breaks this, making it impossible to run a
remote command that needs additional options.

Bug: 336301
Change-Id: Ib5d88f0b2151df2d1d2b4e08d51ee979f6da67b5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-06 14:04:39 -08:00
Shawn O. Pearce 0fe7eeba04 UploadPack: Tag non-commits SATISIFIED earlier
This gets non-commits out of the wantSatisfied() main loop by making
use of the cached SATISIFIED flag and its existing bypass.  Anything
that isn't a commit cannot be discovered by the have negotiation, so
it's always assumed to be SATISIFIED by the server.

Bug: 301639
Change-Id: I1ef354fbf2e2ed44c9020a4069d7179f2159f19f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-06 01:08:41 -08:00
Shawn O. Pearce b5da75bb87 UploadPack: Don't discard COMMON, SATISIFIED flags
When the walker resets, it's going to scrub the COMMON and SATISIFIED
flags off a commit if the commit is contained within another commit
the client wants.  This is common if the client asks for both a
'maint' and 'master' branch, and 'maint' is also fully merged into
'master'.

COMMON shouldn't be scrubbed during reset because it's used to control
membership of the commonBase collection, which is a List.  commonBase
should technically be a set, but membership is cheaper with a RevFlag.
COMMON appears on a commit reachable from a WANT when there is also a
PEER_HAS flag present, as this is a merge base.  Scrubbing this off
when another branch is tested isn't useful.

SATISIFIED is a cache to tell us if wantSatisfied() has already
completed for this particular WANT.  If it has, there isn't a need to
recompute on that branch.  Scrubbing it off 'maint' when we test
'master' just means we would later need to re-test 'maint', wasting
CPU time on the server.

Bug: 301639
Change-Id: I3bb67d68212e4f579e8c5dfb138f007b406d775f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-06 01:08:41 -08:00
Shawn O. Pearce a35c793b2d UploadPack: Fix want-is-satisfied test
okToGiveUpImp() has been missing a ! for a long time.  This loop over
wantAll() is looking for an object where wantSatisfied() returns
false, because there is no common merge base present.  Unfortunately
it was missing a !, causing the loop to break and return false after
at least one want was satisfied.

Bug: 301639
Change-Id: Ifdbe0b22c9cd0a9181546d090b4990d792d70c82
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-06 01:08:41 -08:00
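The effect of the missing `!` in a35c793b2d can be shown with a small stand-alone sketch. It is not the UploadPack source: the class `GiveUpCheck`, both method names, and the use of a `Predicate` in place of wantSatisfied() are assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Predicate;

final class GiveUpCheck {
    // Buggy form: bails out as soon as one want IS satisfied.
    static <T> boolean okToGiveUpBuggy(List<T> wants, Predicate<T> wantSatisfied) {
        for (T want : wants) {
            if (wantSatisfied.test(want)) {
                return false;
            }
        }
        return true;
    }

    // Fixed form: bails out only when a want is NOT satisfied,
    // i.e. no common merge base has been found for it yet.
    static <T> boolean okToGiveUpFixed(List<T> wants, Predicate<T> wantSatisfied) {
        for (T want : wants) {
            if (!wantSatisfied.test(want)) {
                return false;
            }
        }
        return true;
    }
}
```

When every want is satisfied, the fixed form returns true while the buggy form returns false, so the buggy server would keep negotiating even though it already had everything it needed.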
Shawn O. Pearce c6423932bf Fix JGit --upload-pack, --receive-pack options
JGit did not use sh -c to run the receive-pack or upload-pack programs
locally, which caused errors if these strings contained spaces and
needed the local shell to evaluate them.

Win32 support using cmd.exe /c is completely untested, but seems like
it should work based on the limited information I could get through
Google search results.

Bug: 336301
Change-Id: I22e5e3492fdebbae092d1ce6b47ad411e57cc1ba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-05 17:40:54 -08:00
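The idea in c6423932bf, handing the whole command string to a shell so that spaces and options are evaluated there, can be sketched as follows. The class `ShellCommand` and the method `wrap` are hypothetical names, and this is not the code JGit uses; only `sh -c` and `cmd.exe /c` come from the commit message.

```java
import java.util.Arrays;
import java.util.List;

final class ShellCommand {
    // Builds an argument vector that lets the local shell parse the
    // command string, instead of treating it as a single program name.
    static List<String> wrap(String command, boolean windows) {
        return windows
                ? Arrays.asList("cmd.exe", "/c", command)
                : Arrays.asList("sh", "-c", command);
    }
}
```

The resulting list can be passed to `new ProcessBuilder(List<String>)`. Because the command arrives at the shell as one string, a value containing a program path plus extra options is split by the shell rather than looked up as a single file name.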
Shawn O. Pearce 2096c749c3 UploadPack: Avoid parsing want list on clone
If a client wants to perform a clone of the repository, it sends
wants, but no haves.  There is no point in parsing the want list
within UploadPack, as there won't be a common merge base search.
Instead just defer the parsing to PackWriter, which will do its
own parsing and object enumeration.

If the client does have a "have" set, defer parsing of the want list
until the have list is also parsed, and parse them together in a
single batch queue.  This lets the underlying storage system use a
larger lookup batch if there is significant latency involved when
resolving an ObjectId to a RevObject.

Change-Id: I9c30d34f8e344da05c8a2c041a6dc181d8e8bc19
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-04 09:10:23 -08:00
Shawn O. Pearce a3620cbbe1 Reuse cached SHA-1 when computing from WorkingTreeIterator
Change-Id: I2b2170c29017993d8cb7a1d3c8cd94fb16c7dd02
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2011-02-03 17:06:23 -06:00
Shawn O. Pearce 461b012e95 PackWriter: Support reuse of entire packs
The most expensive part of packing a repository for transport to
another system is enumerating all of the objects in the repository.
Once this gets to the size of the linux-2.6 repository (1.8 million
objects), enumeration can take several CPU minutes and costs a lot
of temporary working set memory.

Teach PackWriter to efficiently reuse an existing "cached pack"
by answering a clone request with a thin pack followed by a larger
cached pack appended to the end.  This requires the repository
owner to first construct the cached pack by hand, and record the
tip commits inside of $GIT_DIR/objects/info/cached-packs:

  cd $GIT_DIR
  root=$(git rev-parse master)
  tmp=objects/.tmp-$$
  names=$(echo $root | git pack-objects --keep-true-parents --revs $tmp)
  for n in $names; do
    chmod a-w $tmp-$n.pack $tmp-$n.idx
    touch objects/pack/pack-$n.keep
    mv $tmp-$n.pack objects/pack/pack-$n.pack
    mv $tmp-$n.idx objects/pack/pack-$n.idx
  done

  (echo "+ $root";
   for n in $names; do echo "P $n"; done;
   echo) >>objects/info/cached-packs

  git repack -a -d

When a clone request needs to include $root, the corresponding
cached pack will be copied as-is, rather than enumerating all of
the objects that are reachable from $root.

For a linux-2.6 kernel repository that should be about 376 MiB,
the above process creates two packs of 368 MiB and 38 MiB[1].
This is a local disk usage increase of ~26 MiB, due to reduced
delta compression between the large cached pack and the smaller
recent activity pack.  The overhead is similar to 1 full copy of
the compressed project sources.

With this cached pack in hand, JGit daemon completes a clone request
about 1m16s faster, at the cost of a slightly larger data transfer
(+2.39 MiB):

  Before:
    remote: Counting objects: 1861830, done
    remote: Finding sources: 100% (1861830/1861830)
    remote: Getting sizes: 100% (88243/88243)
    remote: Compressing objects: 100% (88184/88184)
    Receiving objects: 100% (1861830/1861830), 376.01 MiB | 19.01 MiB/s, done.
    remote: Total 1861830 (delta 4706), reused 1851053 (delta 1553844)
    Resolving deltas: 100% (1564621/1564621), done.

    real  3m19.005s

  After:
    remote: Counting objects: 1601, done
    remote: Counting objects: 1828460, done
    remote: Finding sources: 100% (50475/50475)
    remote: Getting sizes: 100% (18843/18843)
    remote: Compressing objects: 100% (7585/7585)
    remote: Total 1861830 (delta 2407), reused 1856197 (delta 37510)
    Receiving objects: 100% (1861830/1861830), 378.40 MiB | 31.31 MiB/s, done.
    Resolving deltas: 100% (1559477/1559477), done.

    real 2m2.938s

Repository owners can periodically refresh their cached packs by
repacking their repository, folding all newer objects into a larger
cached pack.  Since repacking is already considered to be a normal
Git maintenance activity, this isn't a very big burden.

[1] In this test $root was set back about two weeks.

Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-03 13:20:22 -08:00
Shawn O. Pearce 71f168fcd7 PackWriter: Display totals after sending objects
CGit pack-objects displays a totals line after the pack data
was fully written.  This can be useful to understand some of
the decisions made by the packer, and has been a great tool
for helping to debug some of that code.

Track some of the basic values, and send them to the client when
packing is done:

  remote: Counting objects: 1826776, done
  remote: Finding sources: 100% (55121/55121)
  remote: Getting sizes: 100% (25654/25654)
  remote: Compressing objects: 100% (11434/11434)
  remote: Total 1861830 (delta 3926), reused 1854705 (delta 38306)
  Receiving objects: 100% (1861830/1861830), 386.03 MiB | 30.32 MiB/s, done.

Change-Id: If3b039017a984ed5d5ae80940ce32bda93652df5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-02 17:17:57 -08:00
Shawn O. Pearce 04759f3274 RefAdvertiser: Avoid object parsing
It isn't strictly necessary to validate every reference's target
object is reachable in the repository before advertising it to a
client. This is an expensive operation when there are thousands of
references, and it's very unlikely that a reference uses a missing
object, because garbage collection proceeds from the references and
walks down through the graph. So trying to hide a dangling reference
from clients is relatively pointless.

Even if we are trying to avoid giving a client a corrupt repository,
this simple check isn't sufficient.  It is possible for a reference to
point to a valid commit, but that commit to have a missing blob in its
root tree.  This can be caused by staging a file into the index,
waiting several weeks, then committing that file while also racing
against a prune.  The prune may delete the blob, since its
modification time is more than 2 weeks ago, but retain the commit,
since its modification time is right now.

Such graph corruption is already caught during PackWriter as it
enumerates the graph from the client's want list and digs back
to the roots or common base.  Leave the reference validation also
for that same phase, where we know we have to parse the object to
support the enumeration.

Change-Id: Iee70ead0d3ed2d2fcc980417d09d7a69b05f5c2f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-02-02 17:16:32 -08:00
Chris Aniszczyk f265a80d2e Merge "Expose some constants needed for reading the Pull configuration" 2011-02-02 10:22:23 -05:00
Mathias Kinzler 13a406287e Expose some constants needed for reading the Pull configuration
Change-Id: I72cb1cc718800c09366306ab2eebd43cd82023ff
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-02-02 14:45:37 +01:00
Jens Baumgart 29ed09a44f PushCommand: do not set a null credentials provider
PushCommand now does not set a null credentials provider on
Transport because in this case the default provider is replaced with
null and the default mechanism for providing credentials is not
working.

Bug: 336023
Change-Id: I7a7a9221afcfebe2e1595a5e59641e6c1ae4a207
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
2011-02-02 13:13:28 +01:00
Robin Stocker b0245b548b Don't print "into HEAD" when merging refs/heads/master
When MergeMessageFormatter was given a symbolic ref HEAD which points to
refs/heads/master (which is the case when merging a branch in EGit), it
would result in a merge message like the following:

  Merge branch 'a' into HEAD

But it should print the following (as C Git does):

  Merge branch 'a'

The solution is to use the leaf ref when checking for refs/heads/master.

Change-Id: I28ae5713b7e8123a0176fc6d7356e469900e7e97
2011-02-01 22:27:33 +01:00
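The rule from b0245b548b can be sketched as a pure function. This is an illustrative stand-in, not MergeMessageFormatter itself: the class `MergeMessageSketch` and its methods are hypothetical, and the caller is assumed to pass the leaf ref name that a symbolic ref such as HEAD resolves to.

```java
final class MergeMessageSketch {
    // Formats a merge message; the "into ..." suffix is omitted when the
    // target (already resolved to its leaf ref) is refs/heads/master.
    static String format(String mergedBranch, String targetLeafRefName) {
        StringBuilder sb = new StringBuilder();
        sb.append("Merge branch '").append(mergedBranch).append("'");
        if (!"refs/heads/master".equals(targetLeafRefName)) {
            sb.append(" into ").append(shorten(targetLeafRefName));
        }
        return sb.toString();
    }

    // Strips the refs/heads/ prefix for display, if present.
    static String shorten(String ref) {
        String prefix = "refs/heads/";
        return ref.startsWith(prefix) ? ref.substring(prefix.length()) : ref;
    }
}
```

Passing the unresolved name "HEAD" reproduces the unwanted "into HEAD" suffix, which is why the check has to run against the leaf ref.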
Shawn O. Pearce 13bcf05a9e PackWriter: Make thin packs more efficient
There is no point in pushing all of the files within the edge
commits into the delta search when making a thin pack.  This floods
the delta search window with objects that are unlikely to be useful
bases for the objects that will be written out, resulting in lower
data compression and higher transfer sizes.

Instead observe the path of a tree or blob that is being pushed
into the outgoing set, and use that path to locate up to WINDOW
ancestor versions from the edge commits.  Push only those objects
into the edgeObjects set, reducing the number of objects seen by the
search window.  This allows PackWriter to only look at ancestors
for the modified files, rather than all files in the project.
Limiting the search to WINDOW size makes sense, because more than
WINDOW edge objects will just skip through the window search as
none of them need to be delta compressed.

To further improve compression, sort edge objects into the front
of the window list, rather than randomly throughout.  This puts
non-edges later in the window and gives them a better chance at
finding their base, since they search backwards through the window.

These changes make a significant difference in the thin-pack:

  Before:
    remote: Counting objects: 144190, done
    remote: Finding sources: 100% (50275/50275)
    remote: Getting sizes: 100% (101405/101405)
    remote: Compressing objects: 100% (7587/7587)
    Receiving objects: 100% (50275/50275), 24.67 MiB | 9.90 MiB/s, done.
    Resolving deltas: 100% (40339/40339), completed with 2218 local objects.

    real    0m30.267s

  After:
    remote: Counting objects: 61549, done
    remote: Finding sources: 100% (50275/50275)
    remote: Getting sizes: 100% (18862/18862)
    remote: Compressing objects: 100% (7588/7588)
    Receiving objects: 100% (50275/50275), 11.04 MiB | 3.51 MiB/s, done.
    Resolving deltas: 100% (43160/43160), completed with 5014 local objects.

    real    0m22.170s

The resulting pack is 13.63 MiB smaller, even though it contains the
same exact objects.  82,543 fewer objects had to have their sizes
looked up, which saved about 8s of server CPU time.  2,796 more
objects from the client were used as part of the base object set,
which contributed to the smaller transfer size.

Change-Id: Id01271950432c6960897495b09deab70e33993a9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-02-01 09:12:06 -06:00
Shawn O. Pearce 2fbcba41e3 PackWriter: Cleanup findObjectToPack method
Some of this code predates making ObjectId.equals() final
and fixing RevObject.equals() to match ObjectId.equals().
It was therefore more complex than it needs to be, because
it tried to work around RevObject's broken equals() rules
by converting to ObjectId in a different collection.

Also combine the setUpWalker() and findObjectsToPack() methods;
these can be one method and the code is actually cleaner.

Change-Id: I0f4cf9997cd66d8b6e7f80873979ef1439e507fe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-02-01 09:03:24 -06:00
Shawn O. Pearce 8f63dface2 PackWriter: Correct 'Compressing objects' progress message
The first 'Compressing objects' progress message is wrong; it's
actually PackWriter looking up the sizes of each object in the
ObjectDatabase, so objects can be sorted correctly in the later
type-size sort that tries to take advantage of "Linus' Law" to
improve delta compression.

Rename the progress to say 'Getting sizes', which is an accurate
description of what it is doing.

Change-Id: Ida0a052ad2f6e994996189ca12959caab9e556a3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-02-01 09:01:58 -06:00
Chris Aniszczyk eb5658e629 Merge "Add git-clone to the Git API" 2011-02-01 09:56:46 -05:00
Shawn O. Pearce 37a10e3006 PackWriter: Don't include edges in progress meter
When compressing objects, don't include the edges in the progress
meter.  These cost almost no CPU time as they are simply pushed into
and popped out of the delta search window.

Change-Id: I7ea19f0263e463c65da34a7e92718c6db1d4a131
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-02-01 08:55:43 -06:00
Chris Aniszczyk cc5295c4b4 Merge "Show resolving deltas progress to push clients" 2011-02-01 09:40:57 -05:00
Chris Aniszczyk c1de63262e Merge "ObjectWalk: Fix reset for non-commit objects" 2011-02-01 09:38:30 -05:00
Chris Aniszczyk 4112884ede Add git-clone to the Git API
Enhance the Git API to support cloning repositories.

Bug: 334763
Change-Id: Ibe1191498dceb9cbd1325aed85b4c403db19f41e
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-31 16:56:56 -06:00
Shawn O. Pearce 168114fd39 Show resolving deltas progress to push clients
CGit push clients 1.6.6 and later support progress messages on the
side-band-64k channel during push, as this was introduced to handle
server side hook errors reported over smart HTTP.

Since JGit's delta resolution isn't always as fast as CGit's is,
a user may think the server has crashed and failed to report
status if the user pushed a lot of content and sees no feedback.
Exposing the progress monitor during the resolving deltas phase
will let the user know the server is still making forward progress.

This also helps BasePackPushConnection, which has a bounded timeout
on how long it will wait before assuming the remote server is dead.
Progress messages pushed down the side-band channel will reset the
read timer, helping the connection to stay alive and avoid timing
out before the remote side's work is complete.

Change-Id: I429c825e5a724d2f21c66f95526d9c49edcc6ca9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-31 12:31:52 -08:00
Shawn O. Pearce c2ab3421a2 ObjectWalk: Fix reset for non-commit objects
Non-commits are added to a pending queue, but duplicates are
removed by checking a flag.  During a reset that flag must be
stripped off the old roots, otherwise the caller cannot reuse
the old roots after the reset.

RevWalk already does this correctly for commits, but ObjectWalk
failed to handle the non-commit case itself.

Change-Id: I99e1832bf204eac5a424fdb04f327792e8cded4a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-31 12:31:52 -08:00
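The reset problem fixed in c2ab3421a2 can be modeled with a few lines of plain Java. The sketch assumes a boolean field in place of a RevFlag; the names `PendingWalk`, `Obj`, `inPending`, and `markStart` are hypothetical and do not mirror ObjectWalk's real structure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class PendingWalk {
    static final class Obj {
        final String name;
        boolean inPending; // stands in for a flag on the object

        Obj(String name) {
            this.name = name;
        }
    }

    private final List<Obj> roots = new ArrayList<>();
    private final Deque<Obj> pending = new ArrayDeque<>();

    // Queues a non-commit root; duplicates are dropped by checking the flag.
    void markStart(Obj o) {
        if (o.inPending) {
            return;
        }
        o.inPending = true;
        roots.add(o);
        pending.add(o);
    }

    // Returns the next queued object, or null when the queue is empty.
    Obj next() {
        return pending.poll();
    }

    // Clears the flag from the old roots so they can be queued again.
    void reset() {
        for (Obj o : roots) {
            o.inPending = false;
        }
        roots.clear();
        pending.clear();
    }
}
```

If `reset()` skipped the loop that clears `inPending`, a later `markStart()` on an old root would be treated as a duplicate and silently ignored, which matches the failure the commit message describes.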
Mathias Kinzler b15b9d5df2 Proper handling of rebase during pull
After consulting with Christian Halstrick, it turned out that the
handling of rebase during pull was implemented incorrectly.

Change-Id: I40f03409e080cdfeceb21460150f5e02a016e7f4
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-31 12:12:48 +01:00
Robin Rosenberg 9ffcf2a8b3 Merge changes I3a74cc84,I219f864f
* changes:
  [findbugs] Do not ignore exceptional return value of createNewFile()
  Do not create files to be updated before checkout of DirCache entry
2011-01-29 17:52:12 -05:00
Tomasz Zarna 9fbda22392 Add setCredentialsProvider to PullCommand
Bug: 335703
Change-Id: Id9713a4849c772e030fca23dd64b993264f28366
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-28 14:06:47 -06:00
Chris Aniszczyk a880233d7f Merge "ObjectIdSubclassMap: Support duplicate additions" 2011-01-28 12:45:39 -05:00
Shawn O. Pearce 17dc6bdafd ObjectIdSubclassMap: Support duplicate additions
The new addIfAbsent() method combines get() with add(), but does
it in a single step so that the common case of get() returning null
for a new object can immediately insert the object into the map.

Change-Id: Ib599ab4de13ad67665ccfccf3ece52ba3222bcba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-28 08:17:20 -08:00
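The get-then-add pattern that 17dc6bdafd folds into one step can be illustrated with the standard library. This is a sketch only: `IdMap` is a hypothetical class keyed by strings, not ObjectIdSubclassMap, and the choice to return the existing value when one is present is an assumption of the example.

```java
import java.util.HashMap;
import java.util.Map;

final class IdMap<V> {
    private final Map<String, V> byId = new HashMap<>();

    // Inserts newValue when the id is unknown and returns it; otherwise
    // leaves the map untouched and returns the value already stored.
    V addIfAbsent(String id, V newValue) {
        V existing = byId.putIfAbsent(id, newValue);
        return existing != null ? existing : newValue;
    }

    int size() {
        return byId.size();
    }
}
```

A caller can then add an object and learn whether it was already known with a single lookup, instead of calling get() followed by add().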
Chris Aniszczyk bf69401fee Merge "Make PullCommand work with Rebase" 2011-01-28 10:52:38 -05:00
Chris Aniszczyk 0b2ac1e929 Merge "RebaseCommand: detect and handle fast-forward properly" 2011-01-28 10:38:27 -05:00
Shawn O. Pearce 065a0a8122 Revert "Teach PackWriter how to reuse an existing object list"
This reverts commit f5fe2dca3c.

I regret adding this feature to the public API.  Caches aren't always
the best idea, as they require work to maintain.  Here the cache is
redundant information that must be computed, and when it grows stale
must be removed.  The redundant information takes up more disk space,
about the same size as the pack-*.idx files are.  For the linux-2.6
repository, that's more than 40 MB for a 400 MB repository.  So the
cache is a 10% increase in disk usage.

The entire point of this cache is to improve PackWriter performance,
and only PackWriter performance, and only when sending an initial
clone to a new client.  There may be better ways to optimize this, and
until we have a solid solution, we shouldn't be using a separate cache
in JGit.
2011-01-28 07:20:26 -08:00
Mathias Kinzler 14ca80bc90 Make PullCommand work with Rebase
Rebase must honor the upstream configuration

branch.<branchname>.rebase

Change-Id: Ic94f263d3f47b630ad75bd5412cb4741bb1109ca
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-28 15:04:52 +01:00
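How a pull might decide between merge and rebase from the `branch.<branchname>.rebase` key can be sketched as follows. The class `PullMode`, the method `useRebase`, and the `Map`-based configuration are hypothetical, and the sketch only recognizes the literal value "true", which is a simplification.

```java
import java.util.Map;

final class PullMode {
    // Returns true when branch.<branch>.rebase is set to "true"
    // (case-insensitive); a missing key means "merge".
    static boolean useRebase(Map<String, String> config, String branch) {
        return Boolean.parseBoolean(config.get("branch." + branch + ".rebase"));
    }
}
```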
Mathias Kinzler e8a1328d05 RebaseCommand: detect and handle fast-forward properly
This bug was hidden by an incomplete test: the current Rebase
implementation using the "git rebase -i" pattern does not work
correctly if fast-forwarding is involved. The reason for this is that
the log command does not return any commits in this case.
In addition, a check for already merged commits was introduced to
avoid spurious conflicts.

Change-Id: Ib9898fe0f982fa08e41f1dca9452c43de715fdb6
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-28 15:03:02 +01:00
Mathias Kinzler c544e96a4c TransportHttp wrongly uses JDK 6 constructor of IOException
IOException constructor taking Exception as parameter is
new for JDK 6.

Change-Id: Iec349fc7be9e9fbaeb53841894883c47a98a7b29
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-28 09:24:20 +01:00
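One way to attach a cause without the constructors added in JDK 6 is `Throwable.initCause()`, which is available on older JDKs. The helper below is a sketch of that approach; the class `IoExceptions` and the method `wrap` are hypothetical names, and the commit itself may have solved the problem differently.

```java
import java.io.IOException;

final class IoExceptions {
    // Builds an IOException with a cause using only the
    // IOException(String) constructor plus initCause().
    static IOException wrap(String message, Throwable cause) {
        IOException e = new IOException(message);
        e.initCause(cause);
        return e;
    }
}
```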
Matthias Sohn 38eec8f4a2 [findbugs] Do not ignore exceptional return value of mkdir
java.io.File.mkdir() and mkdirs() report failure as an exceptional
return value false. Fix the code which silently ignored this
exceptional return value.

Change-Id: I41244f4b9d66176e68e2c07e2329cf08492f8619
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-28 01:11:12 +01:00
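A checked variant of `File.mkdirs()` along the lines of 38eec8f4a2 might look like the sketch below. The class `Dirs` and the method `mkdirsOrFail` are hypothetical, and the extra `isDirectory()` test is a design choice of this example: `mkdirs()` also returns false when the directory already exists, which is usually not an error.

```java
import java.io.File;
import java.io.IOException;

final class Dirs {
    // Creates the directory and any missing parents, and reports failure
    // as an exception instead of a silently ignored false return value.
    static void mkdirsOrFail(File dir) throws IOException {
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Could not create directory " + dir.getPath());
        }
    }
}
```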
Matthias Sohn 9ec97688b9 Do not create files to be updated before checkout of DirCache entry
DirCacheCheckout.checkoutEntry() prepares the new file content using a
temporary file and then renames it to the file to be written during
checkout. For files to be updated checkout() created each file before
calling checkoutEntry(). Hence renaming the temporary file always
failed which was corrected in exception handling by retrying to rename
the file after deleting the just newly created file.

Change-Id: I219f864f2ed8d68051d7b5955d0659964fa27274
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-28 01:11:12 +01:00
Shawn O. Pearce f5fe2dca3c Teach PackWriter how to reuse an existing object list
Counting the objects needed for packing is the most expensive part of
an UploadPack request that has no uninteresting objects (otherwise
known as an initial clone).  During this phase the PackWriter is
enumerating the entire set of objects in this repository, so they can
be sent to the client for their new clone.

Allow the ObjectReader (and therefore the underlying storage system)
to keep a cached list of all reachable objects from a small number of
points in the project's history.  If one of those points is reached
during enumeration of the commit graph, most objects are obtained from
the cached list instead of direct traversal.

PackWriter uses the list by discarding the current object lists and
restarting a traversal from all refs but marking the object list name
as uninteresting.  This allows PackWriter to enumerate all objects
that are more recent than the list creation, or that were on side
branches that the list does not include.

However, ObjectWalk tags all of the trees and commits within the list
commit as UNINTERESTING, which would normally cause PackWriter to
construct a thin pack that excludes these objects.  To avoid that,
addObject() was refactored to allow this list-based enumeration to
always include an object, even if it has been tagged UNINTERESTING by
the ObjectWalk.  This implies the list-based enumeration may only be
used for initial clones, where all objects are being sent.

The UNINTERESTING labeling occurs because StartGenerator always
enables the BoundaryGenerator if the walker is an ObjectWalk and a
commit was marked UNINTERESTING, even if RevSort.BOUNDARY was not
enabled.  This is the default reasonable behavior for an ObjectWalk,
but isn't desired here in PackWriter with the list-based enumeration.
Rather than trying to change all of this behavior, PackWriter works
around it.

Because the list name commit's immediate files and trees were all
enumerated before the list enumeration itself starts (and are also
within the list itself) PackWriter runs the risk of adding the same
objects to its ObjectIdSubclassMap twice.  Since this breaks the
internal map data structure (and also may cause the object to transmit
twice), PackWriter needs to use a new "added" RevFlag to track whether
or not an object has been put into the outgoing list yet.

Change-Id: Ie99ed4d969a6bb20cc2528ac6b8fb91043cee071
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-27 09:38:19 -08:00
Shawn O. Pearce a017fdf112 Allow ObjectReuseAsIs to resort objects during writing
It can be very handy for the implementation to resort the
object list based on data locality, improving prefetch in
the operating system's buffer cache.

Export the list to the implementation as a proper List,
and document that it's mutable and OK to be modified.  The
only caller in PackWriter is already OK with these rules.

Change-Id: I3f51cf4388898917b2be36670587a5aee902ff10
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-27 08:58:55 -08:00
Shawn O. Pearce c218a0760d PackWriter: Use TOPO order only for incremental packs
When performing an initial clone of a repository there are no
uninteresting commits, and the resulting pack will be completely
self-contained.  Therefore PackWriter does not need to honor C
Git standard TOPO ordering as described in JGit commit ba984ba2e0
("Fix checkReferencedIsReachable to use correct base list").

Switching to COMMIT_TIME_DESC when there are no uninteresting commits
allows the "Counting objects" phase to emit progress earlier, as the
RevWalk will not buffer the commit list.  When TOPO is set the RevWalk
enumerates all commits first, before outputting any for PackWriter to
mark progress updates from.

Change-Id: If2b6a9903b536c7fb3c45f85d0a67ff6c6e66f22
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-27 08:58:44 -08:00
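The ordering decision in c218a0760d reduces to one condition. The sketch below uses a local enum as a stand-in for the TOPO and COMMIT_TIME_DESC sort modes; the class `SortChoice`, the method `choose`, and the collection parameter are hypothetical and not PackWriter's API.

```java
import java.util.Collection;

final class SortChoice {
    // Local stand-in for the two sort modes named in the commit message.
    enum Sort { TOPO, COMMIT_TIME_DESC }

    // An initial clone has no uninteresting commits, so the cheaper,
    // streaming-friendly ordering can be used; incremental packs keep TOPO.
    static Sort choose(Collection<?> uninterestingCommits) {
        return uninterestingCommits.isEmpty() ? Sort.COMMIT_TIME_DESC : Sort.TOPO;
    }
}
```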
Shawn O. Pearce 559c4661c3 Remove getObjectsDirectory, openPack from base API
These two methods are specific to the FileRepository implementation
and should not be exposed as part of the base Repository API.  Now
that PackParser is generic and does not require these two methods
to import a pack stream into a repository, it is safe to remove
these and get them out of the public view.

Change-Id: I8990004d08074657f467849dabfdaa7e6674e69a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-27 08:56:35 -08:00
Shawn Pearce cc983454c0 Merge "Support for self signed certificate (HTTPS)" 2011-01-27 11:46:49 -05:00
Matthias Sohn 91af19de56 Hard reset should not report conflict on untracked file
This problem surfaced because the EGit Core ResetOperationTest has
been failing since change I26806d21. JGit detected a checkout conflict
for untracked files which were never tracked by the repository.

"git reset --hard" in C Git also doesn't remove such untracked files.

Change-Id: Icc8e1c548ecf6ed48bd2979c81eeb6f578d347bd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-27 17:20:04 +01:00
Roberto Tyley afa7c7ab07 Rename PlotWalk.getTags() to getRefs()
Change-Id: I170685e70d9ac09a010df69d26ec1c38bde60174
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-26 22:35:41 -06:00
Roberto Tyley 6ac8279ae7 Provide access to the Refs of a PlotCommit
This information is generally useful - have followed the
accessor pattern of 'children' and 'parents'

Change-Id: I79b3ddd6f390152aa49e6b7a4c72a4aca0d6bc72
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-26 22:03:41 -06:00
Robin Rosenberg 24e7f0f6fa Fix tests broken by fix for adding files in a network share
The change Ie0350e032a97e0d09626d6143c5c692873a5f6a2 was not
done properly. The renamed file was not write protected, and
this broke a test.

Bug: 335388
Change-Id: I41b2235b7677bc5fddc70dda2a56cdd2cb53ce5d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-01-26 13:52:58 -06:00
Mathias Kinzler a5b36ae1ea FetchCommand: allow to set "TagOpt"
This is needed for implementing Fetch in EGit using the API.

Change-Id: Ibdcc95906ef0f93e3798ae20d4de353fb394f2e2
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-26 20:03:02 +01:00
Christian Halstrick 0d7dd6625a Make sure not to overwrite untracked not-ignored files
When DirCacheCheckout was checking out it was silently
overwriting untracked files. This is only ok if the
files are also ignored. Untracked and not ignored files
should not be overwritten. This fix adds checks for
this situation.
Because this change in the behaviour also broke tests
which expected that a checkout will overwrite untracked
files (PullCommandTest) these tests have to be modified
also.

Bug: 333093
Change-Id: I26806d2108ceb64c51abaa877e11b584bf527fc9
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-26 11:41:44 -06:00
Robin Rosenberg c4c8d80fd3 Fix adding files in a network share
We cannot always rename read-only files on network shares,
so rename the temp file for a new loose object first, and
then set it as read-only.

Bug: 335388
Change-Id: Ie0350e032a97e0d09626d6143c5c692873a5f6a2
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-26 11:26:55 -06:00
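The ordering described in c4c8d80fd3, rename first and write-protect second, can be sketched with `java.io.File`. The class `LooseObjectMove` and the method `moveThenProtect` are hypothetical, and treating `setReadOnly()` as best effort is an assumption of this example rather than something the commit states.

```java
import java.io.File;
import java.io.IOException;

final class LooseObjectMove {
    // Renames the temporary file into place while it is still writable,
    // then marks the destination read-only.
    static void moveThenProtect(File tmp, File dst) throws IOException {
        if (!tmp.renameTo(dst)) {
            throw new IOException("Could not rename " + tmp + " to " + dst);
        }
        dst.setReadOnly(); // best effort; the result is ignored in this sketch
    }
}
```

Doing the steps in the opposite order is what fails on some network shares, because the file is already read-only when the rename is attempted.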
Chris Aniszczyk 509662653b Merge "Refactor and comment complicated if statements" 2011-01-26 12:23:26 -05:00
Chris Aniszczyk 9b8ac0151e Merge "MergeCommand should create missing branches" 2011-01-26 12:17:47 -05:00
Mathias Kinzler 414e0cd329 Make setCredentialsProvider more convenient to use
Change-Id: I984836ea7d6a67fd2d1d05f270afa7c29f30971c
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-26 18:03:22 +01:00
Christian Halstrick 4cba86bfea Refactor and comment complicated if statements
When debugging and enhancing DirCacheCheckout.processEntry() I found
that some of the if statements were hard to read/understand. This
change just splits some long if statements and adds more comments
explaining in which state we are. This change is only a preparation
for followup commits which introduce checks for untracked+ignored
files.

Change-Id: I670ff08310b72c858709b9e395f0aebb4b290a56
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2011-01-26 17:17:45 +01:00
Christian Halstrick 85f69c286b MergeCommand should create missing branches
If HEAD exists but points to a non-existent branch, the merge
command should silently create the missing branch and check
it out. This happens if you pull into a freshly initialized repo.
HEAD points to refs/heads/master but refs/heads/master doesn't
exist. If you now merge a commit X into HEAD, then the branch
master should be created (pointing to X) and the working tree
should be updated to reflect X. That is achieved by a checkout
with one tree only (HEAD is missing).

A test for this functionality will come in the next proposal
in PullCommandTest.

Change-Id: Id4a0d56d944e0acebd4b3157428bb50bd3fdd872
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2011-01-26 17:17:44 +01:00
Per Salomonsson d49530ad86 Support for self signed certificate (HTTPS)
Add the possibility to disable SSL verification, just as I can do with git
using: git config --global http.sslVerify false

To enable the feature, configure
Window->Preferences->Team->Git->Configuration
and add a new key/value: http.sslVerify=false

When handling repos over https, JGit will then check that flag to see
if security is loose and the ssl verification should be ignored.

Having it implemented as a key/value makes it not too obvious in the
GUI - so the user must know what he/she is doing when adding it. Being
aware of the risks etc.

Bug: 332487
Change-Id: I2a1b8098b5890bf512b8dbe07da41036c0fc9b72
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-26 01:17:01 +01:00
Shawn O. Pearce 36e396f8b9 Permit disabling birthday attack checks in PackParser
Reading a repository for millions of missing objects might be very
expensive to perform, especially if the repository is on a network
filesystem or some other costly RPC backend.  A repository owner
might choose to accept some risk in return for better performance,
so allow disabling collision checking when receiving a pack.

Currently there is no way for an end-user to disable this feature.
This is intentional, because it is generally *NOT* a good idea to
skip this check.  Instead this feature is supplied for storage
implementations to bypass the default checking logic, should they
have their own custom routines that are just as effective but can
be handled more efficiently.

Change-Id: I90c801bb40e86412209de0c43e294a28f6a767a5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-25 17:14:23 -06:00
Shawn O. Pearce d62350d907 Ensure all deltas were resolved in a pack
If a pack uses OFS_DELTA only (e.g. it's an initial push to a
repository) and PackParser's implementation is broken such that the
delta chain that hangs below a particular object offset is empty, the
entryCount won't match the expected objectCount. Fail fast rather
than claiming the stream was parsed correctly.

The current implementation is not broken as described above.  I broke
the code when I implemented my own new subclass of PackParser (which
incorrectly mucked with the object offset information), leading me to
discover this consistency check was missing.

Change-Id: I07540f0ae1144ef6f3bda48774dbdefb8876e1d3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-25 16:52:22 -06:00
Shawn O. Pearce 1bf0c3cdb1 Refactor IndexPack to not require local filesystem
By moving the logic that parses a pack stream from the network (or
a bundle) into a type that can be constructed by an ObjectInserter,
repository implementations have a chance to inject their own logic
for storing object data received into the destination repository.

The API isn't completely generic yet, there are still quite a few
assumptions that the PackParser subclass is storing the data onto
the local filesystem as a single file.  But it's about the simplest
split of IndexPack I can come up with without completely ripping
the code apart.

Change-Id: I5b167c9cc6d7a7c56d0197c62c0fd0036a83ec6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-25 16:43:06 -06:00
Jesse Greenwald 51dedfdc31 Parse RevCommit bodies before calling RevFilter.include()
RevFilter.include()'s documentation promises the RevCommit's
body is parsed before include is invoked.  This wasn't always
true if the commit was parsed once, had its body discarded,
the RevWalk was reset() and started a new traversal.

Change-Id: Ie5cafde09ae870712b165d8a97a2c9daf90b1dbd
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-25 16:39:00 -06:00
Mathias Kinzler 920ac08777 Allow to set a CredentialsProvider on relevant API commands
This is needed for commands that use Transport internally.

Change-Id: I9417c85255b160723968c647063b9c7e05995ea4
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-25 16:36:10 -06:00
Sasa Zivkov 832d3b8384 Exposed the constructor of Note class
Additionally, defined the NoteMap.getNote method which returns a Note
instance.  These changes were necessary to enable implementation of
the NoteMerger interface (the merge method needs to instantiate a
Note) and to enable direct use of NoteMerger which expects instances
of Note class as its parameters.  Implementing creation of code review
summary notes in Gerrit [1] will make use of both of these features.

[1] https://review.source.android.com/#change,20045

Change-Id: I627aefcedcd3434deecd63fa1d3e90e303b385ac
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-25 16:33:29 -06:00
Christian Halstrick c62882191f Introduce metaData compare between working tree and index entries
Instead of offering only a high-level isModified() method, a new
method compareMetadata() is introduced which compares a working tree entry
and an index entry by looking at metadata only. Some use-cases
(e.g. computing the content-id in idBuffer()) may use this new method
instead of isModified().

Change-Id: I4de7501d159889fbac5ae6951f4fef8340461b47
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-01-21 09:23:06 -06:00
Robin Rosenberg 5e2e3819a6 Add progress reporting to IndexDiff
Change-Id: I4f05bdb0c58b039bd379341a6093f06a2cdfec6e
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-21 01:28:54 +01:00
Robin Rosenberg e43887b69e Fix misc spelling errors in comments and method names
Change-Id: I24552443710075856540696717ac4068dfe6a7f2
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-01-17 22:04:14 +01:00
Matthias Sohn de1d057d72 Merge "File utility for creating a new empty file" 2011-01-16 12:22:46 -05:00
Matthias Sohn c45f2aec56 File utility for creating a new empty file
The java.io.File.createNewFile() method for creating new empty files
reports failure by returning false. To ease proper checking of return
values provide a utility method wrapping createNewFile() throwing
IOException on failure.

Change-Id: I42a3dc9d8ff70af62e84de396e6a740050afa896
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-14 17:28:14 +01:00
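A wrapper of the kind this commit describes could look roughly like the sketch below. The class name CreateNewFileSketch and the exception message are assumptions for illustration; the actual JGit utility may differ in naming and wording.

```java
import java.io.File;
import java.io.IOException;

class CreateNewFileSketch {
    /**
     * Creates a new empty file and turns the boolean failure result of
     * java.io.File.createNewFile() into an IOException.
     */
    static void createNewFile(File f) throws IOException {
        // createNewFile() returns false if the file already exists.
        if (!f.createNewFile())
            throw new IOException("Could not create new file " + f);
    }
}
```

Callers can then rely on ordinary exception handling instead of remembering to check a return value.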
Shawn Pearce 67e176e529 Merge "ConfigConstants: expose some constants for user name and email." 2011-01-12 10:47:55 -05:00
Shawn Pearce 600f624a35 Merge "CheckoutCommand: fix reflog message" 2011-01-12 10:46:23 -05:00
Shawn Pearce 11f2b849a3 Merge "Locate $HOME like C Git does on Windows" 2011-01-12 10:44:57 -05:00
Roberto Tyley 944fcdae66 Fix API ListBranchCommand for listmode 'all'
If remote branches are present they can not be added
to the RefMap from the local branches - the two RefMaps
have a different value of 'prefix' and consequently an
IllegalArgumentException is thrown.
2011-01-12 14:34:10 +00:00
Robin Rosenberg 0fd9676771 Locate $HOME like C Git does on Windows
Java's user.home is not the same as $HOME so EGit did not see the
same global configuration as C Git does.

Bug: 333269
Change-Id: Id54fc5292bf8c5a67177f9097ee692717a7df336
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-01-12 14:58:55 +01:00
Mathias Kinzler 7047d2fa8d CheckoutCommand: fix reflog message
There is a space missing between <from> and "to" in the reflog
message produced by the CheckoutCommand, which is of the form

moving from <from> to <to>

Change-Id: I3dc57ab0a6589292db77a17d9029ee9499dfc725
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-12 13:19:32 +01:00
Mathias Kinzler 5ebfdc8091 ConfigConstants: expose some constants for user name and email.
This is needed by an EGit change

http://egit.eclipse.org/r/#change,2232

Change-Id: I3d62f904b769fc2f1b7b8f0f24f7dd757fc9c379
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2011-01-12 08:50:12 +01:00
Matthias Sohn 838fdb342b Merge "Do not cherry-pick or revert commit more than once" 2011-01-10 09:00:30 -05:00
Robin Rosenberg 2058f9272b Do not cherry-pick or revert commit more than once
Instead just return success. In the case that no commit has been
cherry-picked or reverted, just return the old HEAD.
    
Bug: 333814
Change-Id: I67db2b77b52c43932436d22a8daa5a6556423484
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-01-10 08:47:14 +01:00
Shawn O. Pearce 05ca0c49f9 Merge "Use heap based stack for PackFile deltas" 2011-01-09 19:20:13 -05:00
Shawn O. Pearce 680869d779 Merge "Config: Preserve existing case of names in sections" 2011-01-09 19:19:08 -05:00
Sasa Zivkov 1993cf8a27 Merging Git notes
Merging Git notes branches has several differences from merging "normal"
branches. Although Git notes are initially stored as one flat tree the
tree may fanout when the number of notes becomes too large for efficient
access. In this case the first two hex digits of the note name will be
used as a subdirectory name and the remaining 38 hex digits as the file name
under that directory. Similarly, when the number of notes decreases a fanout
tree may collapse back into a flat tree. The Git notes merge algorithm
must take into account possibly different tree structures in different
note branches and must properly match them against each other.

Any conflict on a Git note is, by default, resolved by concatenating
the two conflicting versions of the note. A delete-edit conflict is, by
default, resolved by keeping the edit version.

The note merge logic is pluggable and the caller may provide a custom
note merger that performs a different merging strategy.

Additionally, it is possible to have non-note entries inside a notes
tree. The merge algorithm must also take this fact into account and
will try to merge such non-note entries. However, in case of any merge
conflicts the merge operation will fail. Git notes merge algorithm is
currently not trying to do content merge of non-note entries.

Thanks to Shawn Pearce for patiently answering my questions related to
this topic, giving hints and providing code snippets.

Change-Id: I3b2335c76c766fd7ea25752e54087f9b19d69c88
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-01-09 00:27:56 +01:00
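The two tree layouts mentioned in this commit message differ only in how a note's 40-digit hex name maps to a path. The sketch below shows that mapping for a single fanout level; the class and method names are hypothetical and this is not the NoteMap implementation.

```java
class NoteFanoutSketch {
    /** Flat layout: the full 40-digit hex name is the file name. */
    static String flatPath(String hexName) {
        check(hexName);
        return hexName;
    }

    /** Fanout layout: first 2 hex digits form a directory, the other 38 the file. */
    static String fanoutPath(String hexName) {
        check(hexName);
        return hexName.substring(0, 2) + "/" + hexName.substring(2);
    }

    private static void check(String hexName) {
        if (hexName.length() != 40)
            throw new IllegalArgumentException("expected 40 hex digits: " + hexName);
    }
}
```

A merge has to treat both paths as the same note, which is why the algorithm cannot simply compare tree entries by path.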
Marc Strapetz c87ae94c70 Fix IgnoreRule for directory-only patterns
Patterns containing only a trailing slash have to be treated
as "global" patterns. For example: "classes/" matches "classes"
as well as "dir/classes" directory.
2011-01-07 12:53:14 +01:00
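The "global" behavior in the example above can be sketched as a match against the last path segment of a directory. This is a deliberately narrow illustration: it assumes a pattern that is a single literal name followed by a slash (no wildcards, no inner slashes), and DirOnlyPatternSketch is a hypothetical helper, not the IgnoreRule class.

```java
class DirOnlyPatternSketch {
    /**
     * Returns true if a pattern such as "classes/" matches the given path.
     * A trailing-slash pattern applies to directories only and, having no
     * other slash, matches at any depth.
     */
    static boolean matches(String pattern, String path, boolean isDirectory) {
        if (!pattern.endsWith("/"))
            throw new IllegalArgumentException("not a directory-only pattern");
        String name = pattern.substring(0, pattern.length() - 1);
        if (name.isEmpty() || name.contains("/"))
            throw new IllegalArgumentException("sketch handles single names only");
        if (!isDirectory)
            return false;
        // Compare against the last segment; lastIndexOf yields -1 for top-level paths.
        String last = path.substring(path.lastIndexOf('/') + 1);
        return last.equals(name);
    }
}
```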
Shawn O. Pearce b2d528887c Config: Preserve existing case of names in sections
When an application asks for the names in a section, it may want to
see the existing case that was stored by the user.  For example,
Gerrit Code Review wants to store a configuration block like:

  [access "refs/heads/master"]
    label-Code-Review = group Developers

and although the name label-Code-Review is case-insensitive, it wants
to display the case as it appeared in the configuration file.

When enumerating section names or variable names (both of which are
case-insensitive), Config now keeps track of the string that first
appeared, and presents them in file order, permitting applications to
use this information.  To maintain case-insensitive behavior, the
contains() method of the returned Set<String> still performs a
case-insensitive compare.

This is a behavior change if the caller enumerates the returned
Set<String> and copies it to his own Set<String>, and then performs
contains() tests against that, as the strings are now the original
case from the configuration block.  But I don't think anyone actually
does this, as the returned sets are immutable and are cached.

Change-Id: Ie4e060ef7772958b2062679e462c34c506371740
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-06 11:13:45 -08:00
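The behavior described here (first-seen case kept, file order kept, lookups still case-insensitive) can be sketched with a LinkedHashMap keyed by the lower-cased name. CasePreservingNamesSketch is a hypothetical class for illustration and does not mirror Config's internal data structures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class CasePreservingNamesSketch {
    // Key: lower-cased name. Value: the spelling that appeared first.
    private final Map<String, String> firstSeen = new LinkedHashMap<String, String>();

    void add(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        if (!firstSeen.containsKey(key))
            firstSeen.put(key, name);
    }

    /** Case-insensitive membership test. */
    boolean contains(String name) {
        return firstSeen.containsKey(name.toLowerCase(Locale.ROOT));
    }

    /** Names in insertion order, each in its original case. */
    List<String> names() {
        return new ArrayList<String>(firstSeen.values());
    }
}
```

Copying names() into an ordinary HashSet loses the case-insensitive lookup, which is the behavior change the commit message warns about.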
Shawn O. Pearce 165358bc99 Use heap based stack for PackFile deltas
Instead of using the current thread's stack to recurse through the
delta chain, use a linked list that is stored in the heap.  This
permits any thread to load a deep delta chain without running out
of thread stack space.

Despite needing to allocate a stack entry object for each delta
visited along the chain being loaded, the object allocation count is
kept the same as in the prior version by removing the transient
ObjectLoaders from the intermediate objects accessed in the chain.
Instead the byte[] for the raw data is passed, and null is used as a
magic value to signal isLarge() and enter the large object code path.

Like the old version, this implementation minimizes the amount of
memory that must be live at once.  The current delta instruction
sequence, the base it applies onto, and the result are the only live
data arrays.  As each level is processed, the prior base is discarded
and replaced with the new result.

Each Delta frame on the stack is slightly larger than the standard
ObjectLoader.SmallObject type that was used before, however the Delta
instances should be smaller than the old method stack frames, so total
memory usage should actually be lower with this new implementation.

Change-Id: I6faca2a440020309658ca23fbec4c95aa637051c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-06 09:48:43 -08:00
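The idea of replacing method recursion with a heap-allocated linked list can be sketched with a toy model. In the sketch below a "delta" merely appends text to its base, which is an assumption made for brevity; real pack deltas are copy/insert instruction streams, and all names here are hypothetical rather than taken from PackFile.

```java
import java.util.Map;

class DeltaChainSketch {
    /** Toy stored object: whole if base < 0, otherwise a delta appending data to its base. */
    static final class Entry {
        final int base;
        final String data;

        Entry(int base, String data) {
            this.base = base;
            this.data = data;
        }
    }

    /** One heap-allocated frame per delta visited, linked to the frame pushed before it. */
    static final class Delta {
        final Delta next;
        final String instructions;

        Delta(Delta next, String instructions) {
            this.next = next;
            this.instructions = instructions;
        }
    }

    static String load(Map<Integer, Entry> pack, int id) {
        Delta stack = null;
        Entry e = pack.get(id);
        // Walk down to the whole object, pushing a frame for each delta on the way.
        while (e.base >= 0) {
            stack = new Delta(stack, e.data);
            e = pack.get(e.base);
        }
        // Unwind: the top frame is the delta closest to the base.
        StringBuilder result = new StringBuilder(e.data);
        for (Delta d = stack; d != null; d = d.next)
            result.append(d.instructions);
        return result.toString();
    }
}
```

Because the chain lives in Delta objects rather than method frames, its depth is bounded by heap size and not by the thread's stack size.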
Sasa Zivkov 7cd812940d NoteMap implements Iterable<Note>
We will need to iterate over all notes of a NoteMap, at least this will be
needed for testing purposes. This change also implied making the Note class
public.

Change-Id: I9b0639f9843f457ee9de43504b2499a673cd0e77
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
2011-01-05 08:24:13 +01:00
Robin Rosenberg b3e59bd9d6 Implement a revert command
This is almost a reversed cherry-pick, and the implementation is
almost identical. It orders the input to merge differently to get
the effect and produces a different commit message with the
default author, rather than the original author.

Change-Id: I39970091d9f7406ae7168b8efaab23a5e2c16bad
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-01-02 22:15:07 +01:00
Shawn Pearce 9a3ce780fc Merge "[findbugs] Make CheckoutResult constants final" 2010-12-31 17:08:23 -05:00
Robin Rosenberg d9e07a574a Convert all JGit unit tests to JUnit 4
Eclipse has some problem re-running single JUnit tests if
the tests are in JUnit 3 format, but the JUnit 4 launcher
is used. This was quite unnecessary and the move was not
completed. We still have no JUnit4 test.

This completes the extermination of JUnit3. Most of the
work was global search/replace using regular expression,
followed by numerous invocations of quick-fix and organize
imports and verification that we had the same number of
tests before and after.

- Annotations were introduced.
- All references to JUnit3 classes removed
- Half-good replacement for getting the test name. This was
  needed to make the TestRngs work. The initialization of
  TestRngs was also made lazy since we can no longer find
  out the test name at runtime in the @Before methods.
- Renamed test classes to end with Test, with the exception
  of TestTranslateBundle, which fails from Maven
- Moved JGitTestUtil to the junit support bundle

Change-Id: Iddcd3da6ca927a7be773a9c63ebf8bb2147e2d13
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-31 14:00:05 -08:00
Shawn Pearce 7cf8b8812f Merge "Add support for getting the system wide configuration" 2010-12-31 16:13:33 -05:00
Robin Rosenberg 797ebba307 Add support for getting the system wide configuration
These settings are stored in <prefix>/etc/gitconfig. The C Git
binary is installed in <prefix>/bin, so we look for the C Git
executable to find this location, first by looking at the PATH
environment variable and then by attempting to launch bash as
a login shell to find out.

Bug: 333216
Change-Id: I1bbee9fb123a81714a34a9cc242b92beacfbb4a8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-12-31 11:48:34 +01:00
Shawn Pearce 4da775eaff Merge "IndexPack: Use stack-based recursion for delta resolution" 2010-12-30 19:52:47 -05:00
roberto b5f0a7d7ff IndexPack: Use stack-based recursion for delta resolution
Replace 'method' with 'heap'-based recursion for resolving deltas.

Git packfile delta-chain depth can exceed 50 levels in certain files
(the packfile of the JGit project itself has >800 objects with
chain-length >50). Using method-based recursion on such packfiles will
quickly throw a StackOverflowError on VMs with constrained stack.

Benefits:

* packfile delta-resolution no longer limited by the maximum number
  of stack frames permitted on the current thread.

* slight performance improvement
  (3% speed increase on the packfile of the JGit project)

Change-Id: I1d9b3a8ba3c6d874d83cb93ebf171c6ab193e6cc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-30 16:49:24 -08:00
Matthias Sohn 65ccadeced [findbugs] Make CheckoutResult constants final
Change-Id: I9117f212e2ad7051fdc6e7417ebc7c2d15b357a8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-30 23:04:43 +01:00
Robin Rosenberg 240769e023 Refactor exec of a command and reading one line into utility
Change-Id: Ia9e5afe7f29c3e5e74b8d226441ed429fb229c82
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-12-30 12:41:22 -08:00
Robin Rosenberg 14b358a6fb Refactor search for a file within a PATH
Change-Id: I785ab6bf1823d174394b1d2b25c5bb202535e943
2010-12-30 12:38:18 -08:00
Shawn Pearce 7a1bd7adb1 Merge "Fix FileSnapShot" 2010-12-30 15:31:06 -05:00
Robin Rosenberg c3f52c62a8 Fix FileSnapShot
We cannot use SystemReader to get the time, unless we do that consistently,
which is harder to do and be sure we are really testing what we want.

Then we need to update our lastRead variable whenever we conclude that
our file is not racily clean according to lastRead. It may well be clean,
but we do not know that until we check the system clock again.

Finally add a test for this class.

Change-Id: I1894b032b9bd359d1b5325e5472d48e372599e4c
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-12-30 01:15:59 +01:00
Shawn Pearce 4170913b1b Merge "CheckoutResult: return paths instead of Files" 2010-12-29 14:29:49 -05:00
Shawn O. Pearce 6533994bc9 Fix ArrayIndexOutOfBoundsException in DirCacheIterator
If the 'TREE' extension contains an invalid subtree that has
been removed, DirCacheIterator still tried to access it due to
an invalid childCnt field within the parent DirCacheTree object.
This is easy for a user to do, they just need to move all files
out of a subdirectory.

For example, the input for the JUnit test case for this bug was
built using the following C Git sequence:

  mkdir -p a/b
  touch a/b/c q
  git add a/b/c q
  git write-tree
  git mv a/b/c a/a

After the last step, the subdirectory a/b is empty, as its only
file was moved into the parent directory.  Because of the earlier
`git write-tree` operation, there is a 'TREE' extension present, but
the a and a/b subdirectories have been marked invalid by the rename.

When JGit tried to iterate over the a tree, it tried to correct
childCnt to be zero as a/b no longer exists, but it failed to
update childCnt.

Change-Id: I7a0f78fc48a36b1a83252d354618f6807fca0426
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-22 14:11:22 -08:00
Shawn O. Pearce 0cd76ab65d Correct GIT_INDEX_FILE environment variable
This is GIT_INDEX_FILE, not GIT_INDEX.

Change-Id: Ib3af28ba196f74c8cb4d318b57ea346bb90f9a1e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-22 11:26:33 -08:00
Mathias Kinzler e272ca0f14 CheckoutResult: return paths instead of Files
As discussed in

http://egit.eclipse.org/r/#change,2127

we should use paths relative to the working directory instead of Files to
notify the caller about conflicts and nondeleted files.

Change-Id: I034c7bd846f0df78d97bc246f38d411f29713dde
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-21 10:06:19 +01:00
Chris Aniszczyk 8f419dc5e6 Merge "FileBasedConfig: Use FileSnapshot for isOutdated()" 2010-12-20 12:06:00 -05:00
Mathias Kinzler 89a4dcf71f Checkout: fix handling if name does not refer to a local branch
The CheckoutCommand does not handle names other than local branch
names properly; it must detach HEAD if such a name is encountered (for
example a commit ID or a remote tracking branch).

Change-Id: I5d55177f4029bcc34fc2649fd564b125a2929cc4
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-20 09:30:40 -06:00
Chris Aniszczyk a92bda5adf Merge "Extract pack directory last modified check code" 2010-12-20 10:27:33 -05:00
Mathias Kinzler 645d262de6 Checkout: expose a CheckoutResult
This is needed by callers to determine checkout conflicts and
possible files that were not deleted during the checkout so that they
can present the end user with a better Exception description and retry
to delete the undeleted files later, respectively.

Change-Id: I037930da7b1a4dfb24cfa3205afb51dc29e4a5b8
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-20 10:21:49 +01:00
Robin Rosenberg 94a2cbb407 Fix wrong javadoc comment in Repository
Change-Id: I9fc084b48418884ce1ccf16d56e800f1d3594885
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2010-12-19 11:03:23 +01:00
Robin Rosenberg 33c6eb848e Merge "Move TransferConfig to transport package" 2010-12-18 10:43:26 -05:00
Matthias Sohn 485917598e Qualify post 0.10 builds
Change-Id: Ifcb8fdea95286779c8aea6bf4d7647e8c1c98d63
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-17 15:49:30 +01:00
Matthias Sohn 67d8f3a338 Merge branch 'stable-0.10' 2010-12-17 15:41:27 +01:00
Matthias Sohn 51d1af9489 Qualify post 0.10.1 builds
Change-Id: I320f1f739f3689daf11d532a55ae1133785aec8e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-17 15:23:14 +01:00
Matthias Sohn 1fdc17bfe4 JGit 0.10.1
Change-Id: I4a46d35d354193e5d4f28ef7dfae75944be8ffcf
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-17 03:10:07 +01:00
Mathias Kinzler 73f36aa8f7 DirCacheCheckout: fix getToBeDeleted()
This wrongly returns the same as getConflicts()
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>

Change-Id: Id37c625458fc5a9b3987f05b684620e24fdfe852
2010-12-16 08:41:36 +01:00
Shawn O. Pearce 34454465c2 Move TransferConfig to transport package
This doesn't belong in the main lib package.

Change-Id: Idb20bf5849138b34a7277250fe0795c2a1f22447
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-15 17:04:03 -08:00
Shawn Pearce c19093bbad Merge "Do not rely on filemode differences in case of symbolic links" 2010-12-15 18:55:59 -05:00
Shawn O. Pearce 3922e026e0 FileBasedConfig: Use FileSnapshot for isOutdated()
Relying only on the last modified time for a file can be tricky.
The "racy git" problem may cause some modifications to be missed.

Use the new FileSnapshot code to track when a configuration file
has been modified, and needs to be reloaded in memory.

Change-Id: Ib6312fdd3b2403eee5af3f8ae711294b0e5f9035
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-15 15:14:05 -08:00
Shawn O. Pearce c8db22f355 Extract pack directory last modified check code
Pulling the last modified checking logic out of ObjectDirectory
makes it possible to reuse this code for other files, such as
the $GIT_DIR/config or $GIT_DIR/packed-refs files.

Change-Id: If2f27a89fc3b7adde7e65ff40bbca5d55b98b772
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-15 15:14:05 -08:00
Shawn O. Pearce 013cb8de38 Reduce calls to Repository.getConfig
Each time getConfig() is called on FileRepository, it checks the
last modified time of both ~/.gitconfig and $GIT_DIR/config.  If
$GIT_DIR/config appears to have been modified, it is read back in
from disk and the current config is wiped out.

When mutating a configuration file, this may cause in-memory edits
to disappear.  To avoid that, callers need to avoid calling getConfig
until after the configuration has been saved to disk.

Unfortunately the API is still horribly broken.  Configuration should
be modified only while a lock is held on the configuration file, very
similar to the way a ref is updated via its locking protocol.  But our
existing API is really broken for that so we'll have to defer cleaning
up the edit path for a future change.

Change-Id: I5888dd97bac20ddf60456c81ffc1eb8df04ef410
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-15 15:14:05 -08:00
Shawn O. Pearce 86847ee322 Support GIT_SSH=tortoiseplink
The tortoiseplink command does not understand -batch, even though
it smells like the putty plink command that does use it.  Don't add
-batch if GIT_SSH is tortoiseplink.

Change-Id: I638532a02faa2caf8c39d482094e7ff4f4ec7e78
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-15 10:18:03 -08:00
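The rule stated in this commit (plink gets -batch, tortoiseplink does not) can be sketched as below. Detecting the tool by a case-insensitive substring check on GIT_SSH is an assumption of this sketch, and GitSshArgsSketch is a hypothetical helper rather than JGit's transport code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class GitSshArgsSketch {
    /** Builds the leading arguments for the program named by GIT_SSH. */
    static List<String> baseArgs(String gitSsh) {
        List<String> args = new ArrayList<String>();
        args.add(gitSsh);
        String lower = gitSsh.toLowerCase(Locale.ROOT);
        // tortoiseplink looks like plink but does not understand -batch.
        if (lower.contains("plink") && !lower.contains("tortoiseplink"))
            args.add("-batch");
        return args;
    }
}
```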
Shawn O. Pearce 8efbd378e1 Correct plink -batch option
When GIT_SSH is set to use plink, the correct option name is "-batch"
and not "--batch".  This was a typo introduced when we added support
for plink via GIT_SSH.

Change-Id: I391660e38f5d208bba11e3f2a8f25922de2af878
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-15 10:17:01 -08:00
Philipp Thun bab053afdd Do not rely on filemode differences in case of symbolic links
When checking whether a file in the working tree has been modified -
WorkingTreeIterator.isModified() - we should not trust the filemode
in case of symbolic links, but check the timestamp and also the
content, if requested. Without this fix symlinks will always be shown
in EGit as modified files on Windows systems.

Change-Id: I367c807df5a7e85e828ddacff7fee7901441f187
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
2010-12-14 11:31:41 +01:00
Shawn O. Pearce 5ac5871d16 Simplify NoteParser use of prefix.length()
Sasa pointed out we only ever use the length here, so instead of
holding onto the AbbreviatedObjectId, let's just hold onto the length
as a primitive int.

Change-Id: I2444f59f9fe5ddcaea4a3537d3f1064736ae3215
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Sasa Zivkov <zivkov@gmail.com>
2010-12-13 16:01:39 -06:00
Shawn O. Pearce 2bc13104a8 Fix HTTP digest authentication
JGit's internal implementation of the HTTP digest authentication
method wasn't conforming to RFC 2617 (HTTP Authentication: Basic
and Digest Access Authentication), resulting in authentication
failures when connecting to a digest protected site.

The code now more accurately matches section 3.2.2 (The Authorization
Request Header) from the standards document.

Change-Id: If41b5c2cbdd59ddd6b2dea143f325e42cd58c395
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-13 10:11:03 -08:00
Matthias Sohn c6ca443b61 File utilities for creating directories
The java.io.File methods for creating directories report failure by
returning false. To ease proper checking of return values provide
utility methods wrapping mkdir() and mkdirs() which throw IOException
on failure.

Also fix the tests to store test data under a trash folder and cleanup
after test.

Change-Id: I09c7f9909caf7e25feabda9d31e21ce154e7fcd5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-13 08:47:17 -06:00
Shawn O. Pearce 45a020fe6a DiffFormatter: Use IndexDiffFilter to speed up working tree
If DiffFormatter is asked to compare the index to the working tree,
it can go faster by using the cached stat information to compare
the two entries rather than relying on SHA-1 computation alone.

Change-Id: Icb21c15b8279ee8cee382e5e179e0cf8903aee4d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-10 17:17:22 -08:00
Mathias Kinzler 9b039b42e0 Rebase: abort on unknown/unsupported command in git-rebase-todo
This is needed to ensure interoperability with the command line: if
the git-rebase-todo file was created manually (by git rebase -i in the
command line), and any commands other than pick are used (reword,
edit, fixup, squash) JGit must abort as it does not understand these
commands yet.
The same is true if an unknown command is found (e.g. due to a typo);
this is the same behavior as shown by the command line.

Change-Id: I2322014f69460361f7fc09da223e8a5c31f100dd
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-10 09:44:51 +01:00
Shawn Pearce 93a7b2b24d Merge "IndexPack: Remove blob-streaming size threshold" 2010-12-09 19:33:58 -05:00
roberto 941b3d8a81 IndexPack: Remove blob-streaming size threshold
Always use streaming (for SHA-checksum & collision detection)
when indexing whole blobs, regardless of their size.

Positives:
* benefits of bugfix #312868 will apply to all runtimes, without
  additional conf for mem-constrained JVMs (5MB huge for some)
* no byte array allocation
  (re-uses readBuffer instead of allocating new full-size array)
* mildly better overall performance
  (given the usual blob-does-not-need-collision-checking case)
* removes unnecessary code

Negative:
* doubles the disk IO for a blob comparison
  (comparatively rare occurrence)

I perf-tested a range of threshold sizes against a random selection
of packfiles I found on my harddrive, the results are here:

https://spreadsheets.google.com/ccc?key=tLCQElyyd2RKN9QevfvgwGQ&hl=en_GB#gid=1

My interpretation of the results is that the streaming size threshold
isn't beneficial (actually seems to be very slightly detrimental) - so
we should just get rid of it. This tallies with some of the comments
Shawn & I had for the default value of streamFileThreshold in the
review for I862afd4c:

http://egit.eclipse.org/r/#patch,sidebyside,2040,2,org.eclipse.jgit/src/org/eclipse/jgit/transport/IndexPack.java

The perf-test code is here: https://gist.github.com/735402
It's a bit scruffy but basically does 10 runs (in randomised order)
for each threshold size on various packfiles, waiting a second
between each pack-indexing to allow GC to catch up. I know it's not
perfect - proper perf testing is hard to do :-)
2010-12-09 23:46:47 +00:00
Chris Aniszczyk a3475fb664 Merge "Add option to skip deletion of non-existing files" 2010-12-09 18:31:48 -05:00
Chris Aniszczyk ec5116b09c Merge "Simplify logic in StrategySimpleTwoWayInCore" 2010-12-09 18:30:41 -05:00
Matthias Sohn cbd1ecff4d Add option to skip deletion of non-existing files
For convenience provide an option to skip deletion of non-existing
files. Also add some tests for deletion methods in FileUtils.

Change-Id: I33e355cfcdc19367d50208150ee49a4a06394890
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-10 00:21:23 +01:00
Shawn O. Pearce 33c670c1f0 Simplify logic in StrategySimpleTwoWayInCore
Sasa and I were reviewing this code today and Sasa pointed out we
can simplify the conflict logic, as the two cases (subtree and file)
are logically identical.

Change-Id: Ie0d40b2dd15605785eff453a846b1d20a2d021fc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Sasa Zivkov <zivkov@gmail.com>
2010-12-09 10:55:43 -08:00
Mathias Kinzler 2a7cd0086b Rebase: fix wrong update if original HEAD after Merge+Skip
Rebase would update the original HEAD to the wrong commit when
"skipping" the last commit after a merged commit.

Includes a test for the specific situation.

Change-Id: I087314b1834a3f11a4561f04ca5c21411d54d993
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-09 19:22:11 +01:00
Christian Halstrick 1783749e16 Add a performance optimized variant of the ANY_DIFF filter
If a treewalk also walks over the index and the working tree, then the
IndexDiffFilter filter can be used, which works much faster than
the semantically equivalent ANY_DIFF filter. This is because this
filter can better avoid computing SHA-1 ids over the content of
working-tree files which is very costly.

This fix will significantly improve the performance of e.g.
EGit's commit dialog.

Change-Id: I2a51816f4ed9df2900c6307a54cd09f50004266f
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
2010-12-09 18:51:33 +01:00
Mathias Kinzler 6bca46e168 Implement rebase --continue and --skip
For --continue, the Rebase command asserts that there are no unmerged
paths in the current repository. Then it checks if a commit is needed.
If yes, the author and commit message are taken from the author_script
and message files, respectively, and a commit is performed before the
next step is applied.
For --skip, the workspace is reset to the current HEAD before applying
the next step.

Includes some tests and a refactoring that extracts Strings in the
code into constants.


Change-Id: I72d9968535727046e737ec20e23239fe79976179
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-12-09 16:10:21 +01:00
Shawn O. Pearce 18abb8195a IndexDiff: Remove unnecessary changesExist flag
Instead of setting a boolean when a difference record is found, return
false from diff() only if all of the collections are empty.  When all
of them are empty, no difference was found.

Change-Id: I555fef37adb764ce253481751071c53ad12cf416
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce a66a7d90fd IndexDiff: Use isModified() when comparing index-worktree
isModified() is more efficient because it can skip over files that
are stat clean, without needing to scan them.

This is useful to efficiently work on paths that were already staged
and thus differ between HEAD and the index, but not between the index
and the working tree.

Change-Id: I4418202e612f0571974e0898050d987c6c280966
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce d4bbb2e449 IndexDiff: Clean up tree-index compare for staged files
When comparing the ObjectIds for two tree entries it's faster
to use the raw buffer compares over allocating ObjectIds and
then performing equals on their contents.

However, this also needs to consider the raw modes.  It is possible
for a path to change modes but not ObjectId (e.g. making a file
executable), and in this case it's still a staged change to report back
to the caller.

Change-Id: I1a267254c04b3273a97f63c71d1e6718cd9d2fa8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
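The raw compare described in the commit above can be sketched roughly as follows. This is a minimal illustration, not JGit's code: the class name, method name, and parameter layout are hypothetical, and it assumes 20-byte SHA-1 ids stored inside larger byte buffers.

```java
/** Hypothetical sketch; names and signature are illustrative, not JGit API. */
final class EntryCompareSketch {
    private static final int ID_LENGTH = 20; // raw SHA-1 length in bytes

    private EntryCompareSketch() {
    }

    /**
     * Returns true only if both the raw file mode and the raw id bytes match.
     * No id objects are allocated; the bytes are compared in place.
     */
    static boolean sameEntry(byte[] bufA, int posA, int modeA,
            byte[] bufB, int posB, int modeB) {
        if (modeA != modeB) {
            return false; // e.g. 0100644 -> 0100755 is still a staged change
        }
        for (int i = 0; i < ID_LENGTH; i++) {
            if (bufA[posA + i] != bufB[posB + i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would report a path as staged whenever sameEntry(...) returns false, which covers both a content change and a mode-only change.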
Shawn O. Pearce e6c3922764 IndexDiff: Fix getAssumeUnchanged()
If the caller really needs the list of files that are flagged as
assume-unchanged (aka assume-valid in the DirCache), we should give
them the complete list and not just those that we wrongly identified
as being modified during diff().

This change is necessary because diff() is slightly broken and is
discovering differences on files that it shouldn't have considered.

Change-Id: Ibe464c1a0e51c19dc287a4bc5348b7b07f4d840b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce 72f87adce6 IndexDiff: Correct Javadoc for getUntracked() method
Change-Id: I5f26c40dec5f0e4a47413af033dbedb0c252dd20
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce 48e80698cf IndexDiff: Remove always true not-subtree check
The TreeWalk is configured to be recursive, which means subtrees are
never presented to the application.  Therefore the working tree file
mode can never be a subtree/subdirectory at this point in the code.

Change-Id: Ie842ddc147957d09205c0d2ce87b25c566862fd9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce ca9baa0ee2 IndexDiff: Always use TreeWalk.getPathString()
Instead of asking the individual iterators for their path string, use
the TreeWalk's generic getPathString() method.  It's just as fast
because it uses the path of the current matching iterator.

Change-Id: I9b827fbbafce1c78f09d5527cdc64fbe9022a16e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce f4e9c8890c IndexDiff: Simplify allocation of filter list
We add either 3 or 4 filters.  If we are adding only 3 filters,
allocating the array for 4 isn't a huge waste of memory, but it
does simplify our code.

Change-Id: I7df29b414f6d5cfcf533edb1405083e6fcec32cf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:20 -08:00
Shawn O. Pearce 11fd0fe03a Clarify WorkingTreeOptions and filemode usage
To improve runtime performance, caching the WorkingTreeOptions inside
of the Config object using the Config.SectionParser API allows
the WorkingTreeOptions to be accessed more efficiently whenever a
FileTreeIterator is constructed for the Repository.

Instead of passing the filemode handling option into isModified(),
the WorkingTreeIterator should always honor whatever setting has
been configured in this repository, as defined by its own copy of
the WorkingTreeOptions.  This simplifies all of the callers as they
no longer need to lookup core.filemode on their own.

A few locations were changed from always using a hardcoded "true"
on the file mode to passing what is actually configured in the
repository.  This is a behavior change, but corrects what should be
considered to be bugs as the core.filemode variable wasn't always
being used.

Change-Id: Idb176736fa0dc97af372f1d652a94ecc72fb457c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-08 10:03:19 -08:00
Shawn O. Pearce c181e1ab8a IndexPack: Use streaming for large whole blobs
When indexing large blobs that are stored whole (non-delta form),
avoid allocating the entire blob in memory and instead stream it
through the SHA-1 checksum computation.  This reduces the size
of memory required by IndexPack when processing very big blobs,
such as a 500 MiB incompressible binary.

If the large blob already exists in the local repository, its
contents need to be compared byte-for-byte after the entire pack
has been indexed, to ensure there isn't an unexpected SHA-1 collision
which may result in later data corruption.  This compare is performed
as a streaming compare, again avoiding the large object allocation.

This change doesn't improve on memory utilization for large objects
stored as deltas.  The change also doesn't improve handling for
any large commits, trees or annotated tags.  There isn't much to
be done here for those objects, because they need to be passed down
to the ObjectChecker as a byte[].  Fortunately it isn't common for
these object types to be that large.

Bug: 312868
Change-Id: I862afd4cb78013ee033d4ec68c067b1774a05be8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Roberto Tyley <roberto.tyley@guardian.co.uk>
2010-12-08 11:30:11 -06:00
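To illustrate the streaming idea from the commit above, here is a minimal sketch that computes a Git blob id without holding the whole blob in memory. It uses only the JDK; the class and method names are hypothetical and this is not IndexPack's actual code. It relies on the fact that a blob id is the SHA-1 of the header "blob <length>\0" followed by the content.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch; not JGit's IndexPack implementation. */
final class StreamingBlobHash {
    private StreamingBlobHash() {
    }

    /** Streams {@code length} bytes through SHA-1 using a small fixed buffer. */
    static String blobId(InputStream in, long length)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        // Git hashes the object header before the content.
        md.update(("blob " + length + "\0").getBytes(StandardCharsets.US_ASCII));

        byte[] buf = new byte[8192];
        long remaining = length;
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) {
                throw new EOFException("stream ended before " + length + " bytes");
            }
            md.update(buf, 0, n);
            remaining -= n;
        }

        StringBuilder hex = new StringBuilder(40);
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

Memory use stays at the size of the buffer regardless of blob size, which is the property the commit is after for large whole blobs.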
Chris Aniszczyk bc1130c6aa Merge "Refactor IndexPack to use InputStream for inflation" 2010-12-08 11:19:51 -05:00
Christian Halstrick e3881de258 Removed unread parameters
Some method parameters in WorkingTreeIterator are never used. Remove
them. Especially the removal of the FS parameter in isModified()
simplifies upcoming performance optimizations.

Change-Id: I7c449589283a4a6b6e23f2586cd784febdca8bcd
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-08 10:15:48 -06:00
Shawn O. Pearce a02be9725c Remove empty iterator from TreeWalk
It's confusing that a new TreeWalk() needs to have reset() invoked
on it before addTree().  This is a historical accident caused by
how TreeWalk was abused within ObjectWalk.

Drop the initial empty tree from the TreeWalk and thus remove a
number of pointless reset() operations from unit tests and some of
the internal JGit code.

Existing application code which is still calling reset() will simply
be incurring a few unnecessary field assignments, but they should
consider cleaning up their code in the future.

Change-Id: I434e94ffa43491019e7dff52ca420a4d2245f48b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-07 16:49:51 -08:00
Shawn O. Pearce c94efa8286 Refactor IndexPack to use InputStream for inflation
By inflating with an InputStream like API, it is possible to stream
through large objects rather than allocating the entire thing as
a byte[].  This change only refactors the inflation code within
IndexPack to use a streaming interface.

Change-Id: I5a84b486901c2cf63fa6a3306dd5fb5c53b4056b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Roberto Tyley <roberto.tyley@guardian.co.uk>
2010-12-07 16:19:48 -08:00
Matthias Sohn 45731756a5 [findbugs] Do not ignore exceptional return value
java.io.File.delete() reports failure as an exceptional
return value false. Fix the code which silently ignored
this exceptional return value. Also remove some duplicate
deletion helper methods.

Change-Id: I80ed20ca1f07a2bc6e779957a4ad0c713789c5be
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-07 16:18:30 -08:00
Matthias Sohn e22f9552a8 Provide file utilities for file deletion
Provide file helper methods in a reusable utility class to
replace many local implementations. java.io.File has some
methods reporting failure by returning false. We prefer to
throw IOException on failure so that callers can't forget
checking the return value.

Change-Id: I430c77b5d2cffcf8b47584326ad4817a7291845e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-07 16:18:29 -08:00
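The design choice in the commit above, throwing IOException instead of returning false, can be sketched like this. The class name, method signature, and the skipMissing flag are assumptions made for illustration; they are not taken from JGit's FileUtils.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper; illustrative only, not JGit's FileUtils API. */
final class DeleteSketch {
    private DeleteSketch() {
    }

    /**
     * Deletes a file and reports failure with an exception, so callers
     * cannot silently ignore it.
     *
     * @param f
     *            the file to delete
     * @param skipMissing
     *            if true, a file that does not exist is not treated as an error
     */
    static void delete(File f, boolean skipMissing) throws IOException {
        if (skipMissing && !f.exists()) {
            return;
        }
        if (!f.delete()) {
            throw new IOException("Could not delete " + f.getAbsolutePath());
        }
    }
}
```

With a checked exception the compiler forces every caller to handle or propagate the failure, which a boolean return value cannot do.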
Chris Aniszczyk db8cc4c84e Clean up Init API
Static accessors should come before a constructor.

Change-Id: Iee1051ce4f2038f19a08741e7a3a33f06a97a3c0
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-07 09:13:57 -06:00
Chris Aniszczyk 48b73efe1e Merge "Rebase Interoperability third part: handle stop upon conflict" 2010-12-07 09:34:25 -05:00
Chris Aniszczyk a51f44edb0 Merge "Rebase Interoperability second part: fix "pop steps"" 2010-12-07 09:19:35 -05:00
Mathias Kinzler ad96546ca0 Rebase Interoperability third part: handle stop upon conflict
There are some files that need to exist so that the CLI can continue
after the rebase has been stopped due to conflicts.

Change-Id: I3cb4dc98609c059bf0cf9fd5f9e47a9c681cea2d
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-07 13:34:44 +01:00
Shawn Pearce 6462be8350 Merge "LockFile.commit: retry renaming" 2010-12-06 18:55:18 -05:00
Chris Aniszczyk a2469bb5d2 Merge "Add InitCommand" 2010-12-06 17:08:55 -05:00
Chris Aniszczyk 34554e4f1c Merge "Add debugging toString to TreeFormatter" 2010-12-06 10:11:11 -05:00
Chris Aniszczyk 6eb6d7c77a Merge "Add insert(TreeFormatter) to ObjectInserter" 2010-12-06 10:10:58 -05:00
Chris Aniszczyk 731f84559d Merge "Add toByteArray to CommitBuilder, TreeBuilder" 2010-12-06 10:10:41 -05:00
Chris Aniszczyk 35d51d040c Merge "Remove unused getTreeId from TreeFormatter" 2010-12-06 10:10:26 -05:00
Chris Aniszczyk 643de8323a Merge "Remove result id from CommitBuilder, TagBuilder" 2010-12-06 10:09:59 -05:00
Jens Baumgart cbf5ff6ac7 LockFile.commit: retry renaming
Currently the following can happen in LockFile.commit: deletion of the
original file succeeds but renaming fails afterwards. In this case the
original file (e.g. branch file in refs/heads) is lost.
To work around the issue the same retry logic as for file deletion is
applied to file renaming.

Bug: 331890
Change-Id: I68620c07f2d3ab7f3279c71a91e184e8eac69832
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
2010-12-06 13:40:07 +01:00
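A retry loop around a rename, in the spirit of the commit above, might look like the sketch below. The attempt count and pause are hypothetical parameters, and the class is not part of JGit; the commit does not state which values LockFile uses.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical sketch of rename-with-retry; not JGit's LockFile code. */
final class RenameRetrySketch {
    private RenameRetrySketch() {
    }

    /**
     * Tries File.renameTo up to {@code attempts} times, pausing between
     * failed attempts, and throws if none of them succeeds.
     */
    static void renameWithRetry(File from, File to, int attempts,
            long pauseMillis) throws IOException, InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (from.renameTo(to)) {
                return;
            }
            if (i + 1 < attempts) {
                Thread.sleep(pauseMillis); // give a transient lock time to clear
            }
        }
        throw new IOException("Could not rename " + from + " to " + to);
    }
}
```

For example, renameWithRetry(lock, ref, 10, 100) would give a transient failure about one second to clear before giving up.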
Chris Aniszczyk 90fbc1db3a Merge "Honor GIT_SSH when opening SSH connections" 2010-12-05 20:14:46 -05:00
Chris Aniszczyk f7a566c1aa Add InitCommand
Adds git-init support to the Git API.

Change-Id: I1428b861f22cabe4d92cadf3d9114dddeec75b40
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-05 19:01:43 -06:00
Shawn O. Pearce ed7e38b98d Merge "Ensure stable tag sort in PlotWalk" 2010-12-05 18:10:12 -05:00
Chris Aniszczyk ef11143ffe Merge "Abstract SSH setup to support GIT_SSH" 2010-12-05 10:50:05 -05:00
Shawn O. Pearce 064ecc25ce Fix findGitDir() with no ceiling directories
Bug: 322866
Change-Id: I64205bb0315a725dfa523ccff1796de50f465162
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Ketan Padegaonkar <KetanPadegaonkar@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-05 15:42:57 +01:00
Matthias Sohn c474813b0a Merge "Correct CommitBuilder, TagBuilder method to be build()" 2010-12-05 08:19:58 -05:00
Robin Rosenberg 40c2f68382 Merge "Fix checking out large files" 2010-12-04 03:49:11 -05:00
Shawn O. Pearce 864091d982 Ensure stable tag sort in PlotWalk
Because tags are more interesting here than local or remote branch
heads, tags get sorted earlier in the array than heads or remotes do.

Bug: 324939
Change-Id: Ifc3863461654df7f34fdecbd2abe1f4b5d2ffb8e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Mathias Kinzler <mathias.kinzler@sap.com>
CC: Stefan Lay <stefan.lay@sap.com>
2010-12-03 16:38:24 -08:00
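One way to get the ordering described in the commit above is a comparator that ranks tags first and breaks ties by name. This is only a sketch under assumptions: the class, the use of plain ref name strings, and the name-based tie-break are illustrative and not taken from PlotWalk.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; PlotWalk's real comparator may differ. */
final class RefSortSketch {
    private RefSortSketch() {
    }

    /** Tags rank before everything else (heads, remotes). */
    static int rank(String refName) {
        return refName.startsWith("refs/tags/") ? 0 : 1;
    }

    /** Orders by rank first, then by name so the result is deterministic. */
    static final Comparator<String> TAGS_FIRST = (a, b) -> {
        int c = Integer.compare(rank(a), rank(b));
        return c != 0 ? c : a.compareTo(b);
    };

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<String> sorted(List<String> refs) {
        List<String> out = new ArrayList<>(refs);
        out.sort(TAGS_FIRST);
        return out;
    }
}
```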
Shawn O. Pearce 61db0e4787 Fix checking out large files
DirCacheCheckout needs to use ObjectLoader.copyTo to avoid loading the
complete content of a large file into the JVM heap.

Bug: 321097
Change-Id: I967590b6f233fd1c83d873075db01d653208b3b9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Chris Aniszczyk <caniszczyk@gmail.com>
CC: Christian Halstrick <christian.halstrick@sap.com>
2010-12-03 16:37:56 -08:00
Shawn O. Pearce 22e720ce77 Honor GIT_SSH when opening SSH connections
If the environment variable GIT_SSH is set, use GIT_SSH for any remote
protocol connections, instead of the local JSch library.

Bug: 321062
Change-Id: Ia18ea49d58f3ed657430067f1f72ef788a2dae4c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-03 16:33:46 -08:00
Shawn O. Pearce 04b289cc42 Abstract SSH setup to support GIT_SSH
In order to honor GIT_SSH the TransportGitSsh class needs to run the
process named by the GIT_SSH environment variable and use that as the
pipes for connectivity to the remote peer.  Refactor the current
transport code to support a different type of pipe connectivity, so we
can later add GIT_SSH.

Bug: 321062
Change-Id: I9d8ee1a95f1bac5013b33a4a42dcf1f98f92172f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-03 16:14:46 -08:00
Matthias Sohn 6ca9fd2d95 Add missing license header
Change-Id: Ibfd17951606f02283660befcff53ff9b73405dd9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-03 22:37:46 +01:00
Shawn O. Pearce 8fd2335b70 Add debugging toString to TreeFormatter
Displaying the current tree in the ls-tree style output makes it
easier to see what entries are currently stored.

Change-Id: If17c414db0d2e8d84e65de8bbcba7fd1b79aa311
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-03 13:11:39 -08:00
Shawn O. Pearce 8d4c95a645 Add insert(TreeFormatter) to ObjectInserter
This makes usage of a TreeFormatter more similar to a CommitBuilder or
a TagBuilder: populate the formatter and pass to the ObjectInserter.

Change-Id: I5a45ef3a35cc73f4905a34bc6f6228510df8eb2c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-03 13:03:12 -08:00
Shawn O. Pearce 9ad802c15b Add toByteArray to CommitBuilder, TreeBuilder
This better matches the existing API of TreeFormatter, but is just a
simple delegation to build().

Change-Id: I188f43acc34455e773d63836724b05e18f5c7a84
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-03 12:57:41 -08:00
Shawn O. Pearce 807ee4797f Remove unused getTreeId from TreeFormatter
Change-Id: If5955757575d4c6053b6f8109e9dc2ecb0502446
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-03 12:47:37 -08:00
Shawn O. Pearce cf52ef5531 Remove result id from CommitBuilder, TagBuilder
These objects don't need to be updated with the resulting ObjectId of
the formatted content, callers can get that from the ObjectInserter on
their own.

Change-Id: Idc5f097de9f7beafc5e54e597383d82daf9d7db4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-03 12:38:31 -08:00
Shawn O. Pearce f996fb1796 Correct CommitBuilder, TagBuilder method to be build()
The correct names for these is build(), as that is what a Java
developer will expect given the "builder" pattern.

Bug: 323541
Change-Id: I35042bdc95a955beeaee29e54bde10e4240b2a71
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reviewed-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-12-03 12:28:00 -08:00
Matthias Sohn 37001ddc8d Fix jgit build broken by deabacc4
Since 049827d7 MergeAlgorithm isn't static anymore.

Change-Id: I3d704f663a776bb57e59f28a8200753fae5e9d25
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-12-03 09:24:31 +01:00
Chris Aniszczyk 39fe52ccc7 Merge "Rebase Interoperability first part: write "interactive" file" 2010-12-02 21:19:10 -05:00
Chris Aniszczyk b5f9a9b4d3 Merge "Fixed Merge Algorithm regarding concurrent file creations" 2010-12-02 20:19:04 -05:00
Christian Halstrick deabacc420 Fixed Merge Algorithm regarding concurrent file creations
When in OURS and THEIRS a new file is created we want a conflict
when the two contents differ. If on two branches the same file
with the same content is created this should not be a conflict.
But: the current merge algorithm is throwing NPEs in this case.
Fix this by choosing an empty RawText as common base if the
base is empty.

Change-Id: I21cb23f852965b82fb82ccd66ec961c7edb3ac3d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-12-02 13:15:59 +01:00
Shawn O. Pearce e0a9961b78 Avoid unnecessary decoding of length in PackFile
If the object type is a whole object and all we want is the type,
there is no need to skip the length header.  The type is already known
and can be returned as-is.  Instead skip the length header only for
the two delta formats, where the delta base must itself be scanned.

Change-Id: I87029258e88924b3e5850bdd6c9006a366191d10
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-12-01 09:59:55 -08:00
Shawn O. Pearce d29b5db695 Remove unused 'shift' variable from PackFile
This variable was not used for anything, but Eclipse's JDT failed to
notice because of the "shift += " operation within the body of the
while loop.  Here we don't need the shift because we do not decode the
length, but we do have to skip over the bytes that store the length to
locate the delta base.

Bug: 331319
Change-Id: I200a874fd7e39e3adf2640b8cd0f53dcf91ef4c9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Remy Suen <remysuen@ca.ibm.com>
2010-12-01 09:57:16 -08:00
Mathias Kinzler 59e62ba7e1 Rebase Interoperability second part: fix "pop steps"
If the CLI stops a rebase upon conflict, the current
step is already popped from the git-rebase-todo and appended to the
"done" file. The current implementation wrongly pops the step only
after successful cherry-pick.

Change-Id: I8640dda0cbb2a5271ecf75fcbad69410122eeab6
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-01 15:10:13 +01:00
Mathias Kinzler 7aa1b85821 Rebase Interoperability first part: write "interactive" file
The Repository is then in state "Rebase interactive".

Change-Id: I5d2de57f8670e1d4c71ed22509ab17f04e2561b5
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-12-01 15:08:07 +01:00
Stefan Lay b4359cb829 Include list of assume unchanged files in IndexDiff
IndexDiff did not collect the information whether the
"assume-unchanged" flag is set. This information is useful for clients
which may want to decide if specific actions are allowed on a file.

Bug: 326213
Change-Id: I14bb7b03247d6c0b429a9d8d3f6b10f21d8ddeb1
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-11-30 10:51:21 -08:00
Stefan Lay 7bf0f5070e Use the Set interface in declarations and as return value
Change-Id: Ib273c4980036f75bd4dad3ffe1c29a37b2df932a
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-11-30 11:05:42 +01:00
Shawn Pearce a115b64f4b Merge "Check assume unchanged flag in Add command" 2010-11-29 18:21:08 -05:00
Shawn Pearce f968cbabcf Merge "Fix DiffConfig to understand "copy" resp. "copies" for diff.renames property." 2010-11-29 17:59:15 -05:00
Stefan Lay 9225b88ae6 Check assume unchanged flag in Add command
When the assume unchanged flag is set the Add command must not update
the index for this file if any changes are present in the working
directory.

Bug: 331351
Change-Id: I255870f689225a1d88971182e0eb377952641b42
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
2010-11-29 17:58:38 +01:00
Marc Strapetz e147fbcd66 Fix DiffConfig to understand "copy" resp. "copies" for diff.renames property.
Rename detection should be considered enabled if
diff.renames config property is set to "copy" or "copies", instead of
throwing IllegalArgumentException.

Change-Id: If55d955e37235d4d00f5b0febd6aa10c0e27814e
2010-11-29 17:14:07 +01:00
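The behavior described in the commit above could be sketched as a small value parser. The enum, class, and method names are hypothetical and do not claim to match DiffConfig; the accepted boolean spellings are an assumption for this sketch.

```java
import java.util.Locale;

/** Hypothetical parser for a diff.renames value; not JGit's DiffConfig. */
final class RenameConfigSketch {
    enum Mode {
        FALSE, TRUE, COPY
    }

    private RenameConfigSketch() {
    }

    /** Maps a config string to a mode; "copy" and "copies" are synonyms. */
    static Mode parse(String value) {
        switch (value.trim().toLowerCase(Locale.ROOT)) {
        case "true":
        case "yes":
        case "on":
            return Mode.TRUE;
        case "false":
        case "no":
        case "off":
            return Mode.FALSE;
        case "copy":
        case "copies":
            return Mode.COPY;
        default:
            throw new IllegalArgumentException("Unknown diff.renames value: " + value);
        }
    }

    /** Rename detection counts as enabled for every mode except FALSE. */
    static boolean renamesEnabled(String value) {
        return parse(value) != Mode.FALSE;
    }
}
```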
Mathias Kinzler 12b6350435 RebaseCommand: trim line endings when reading files
In order to enable interoperability with the command line, we need to
remove line feeds when reading the files.

Change-Id: Ie2f5799037a60243bb4fac52346908ff85c0ce5d
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
2010-11-26 12:22:40 +01:00
Christian Halstrick 12a5c8d413 Change default diff algorithm to histogram and add tests
The referenced bug showed that JGit produced different merge results
compared to C Git. A unit test was added to reproduce the issue. The
problem can be solved by switching to the histogram diff algorithm.

Bug: 331078
Change-Id: I54f30afb3a9fef1dbca365ca5f98f4cc846092e3
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
2010-11-26 00:44:05 +01:00
Christian Halstrick 049827d708 Make diff algorithm configurable
The diff algorithm which is used by Merge, Cherry-Pick, Rebase
should be configurable. A new configuration parameter "diff.algorithm"
is introduced which currently accepts the values "myers" or
"histogram". Based on this parameter for example the ResolveMerger
will choose a diff algorithm. The reason for this is bug 331078.
This bug shows that JGit is more compatible with C Git when
histogram diff is in place. But since histogram diff is quite new we
need an easy way to fall back to Myers diff.

Bug: 331078
Change-Id: I2549c992e478d991c61c9508ad826d1a9e539ae3
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
2010-11-26 00:30:08 +01:00
Christian Halstrick 7e298c9ed5 Add more tests for rebase and externalized missing Strings
Coverage tests showed that certain areas of the rebase command
are not tested. Add the missing tests.

Change-Id: Ia4a272d26cde7e1861dac30496e4b6799fc8187a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-11-24 15:59:08 +01:00
Chris Aniszczyk 923443f94f Add CheckoutCommand
Add the ability to checkout a branch to the working tree.

Bug: 330860
Change-Id: Ie06b9e799a9e1be384da0b8996efa7209b32eac3
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-11-22 15:53:35 -06:00
Matthias Sohn 34962b4700 Merge "Fix bug regarding handling of non-versioned files during merge" 2010-11-22 16:43:43 -05:00
Christian Halstrick 5adef23365 Fix bug regarding handling of non-versioned files during merge
There was a bug introduced by commit 0e815fe. For non-versioned files
the merge algorithm detected an incoming deletion from THEIRS.
Consequently such files were deleted. That's a severe bug which was
fixed by more precisely detecting incoming deletions.

Change-Id: I4385d3c990db11d62e371a385dc8ee89841db84a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2010-11-22 22:41:25 +01:00
Chris Aniszczyk f7690cceef Add RmCommand to Git API
Bug: 330827
Change-Id: I0b74bb92254d0ee988139d25022d06d16ed89d58
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-11-22 11:02:28 -06:00
Mathias Kinzler e5b96a7848 Initial implementation of a Rebase command
This is a first iteration to implement Rebase. At the moment, this
does not implement --continue and --skip, so if the first
conflict is found, the only option is to --abort the command.

Bug: 328217
Change-Id: I24d60c0214e71e5572955f8261e10a42e9e95298
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-11-22 09:58:36 -06:00
Shawn O. Pearce bd98a0a9a5 Move WorkingTreeIterator inherited state into an object
Instead of copying up to 4 fields from the parent iterator each time a
child iterator is initialized and used, construct a single state
object that contains the 4 fields, and pass that one state object
through to the child.  This makes it easier to add additional state
fields that must be inherited, at the slight expense of an extra
object allocation per TreeWalk, and an extra level of field
indirection whenever the options, nameEncoder, or read buffer is
required by the iterator.

Change-Id: Ic4603c33b772d7a45f9c81140537d51945688fcb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-18 17:06:12 -08:00
Shawn O. Pearce 3de186fbf0 Name TreeFilter and MergeFilter implementations
Naming these inner classes ensures that stack traces which contain
them will give us useful information about which filter is involved in
the trace, rather than the generated names $1, $2, etc.  This makes it
much easier to understand a stack trace at a glance.

Change-Id: Ia6a75fdb382ff6461e02054d94baf011bdeee5aa
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-18 16:50:14 -08:00
Chris Aniszczyk 2054c3fb8a Add core.filemode to CoreConfig
Let CoreConfig cache the value of core.filemode so
clients like EGit can take advantage of it.

Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-11-14 18:54:36 -06:00
Christian Halstrick da1ea27fa2 Fixed checkouts when HEAD is ignored
In the case where DirCacheCheckout was used to check out a tree
without taking HEAD into account (e.g. during a clone or hard reset)
we didn't handle conflicts correctly. E.g. if there were conflicts
(entries with stage != 0) in the index and we tried to hard reset,
we processed the conflicting paths multiple times (once
for every stage). With this fix we will update the index with the
entry from the "merge" state (the state we want to check out) when we
detect existing conflicts.

Change-Id: Iffbddccaa588cf0d1460a5e44dabaf540d996e26
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-11-13 11:42:13 -06:00
Chris Aniszczyk 952c4e1f3d Merge "Base64: Reformat to match JGit style" 2010-11-13 12:40:05 -05:00
Chris Aniszczyk 07cabc8c6f Merge "Base64: Strip out code JGit doesn't use" 2010-11-13 12:39:48 -05:00
Chris Aniszczyk f638679797 Merge "Remove unnecessary note fanout when removing notes" 2010-11-13 12:38:17 -05:00
Chris Aniszczyk 1b3abe75f8 Merge "Split note leaf buckets at 256 elements" 2010-11-13 12:37:30 -05:00
Chris Aniszczyk 9f2bde653f Merge "Add internal API for note iteration" 2010-11-13 12:32:59 -05:00
Chris Aniszczyk e9002a45ce Merge "Allow writing a NoteMap back to the repository" 2010-11-13 12:31:58 -05:00
Chris Aniszczyk 56a802104a Merge "Add in-memory updating support to NoteMap" 2010-11-13 12:31:02 -05:00
Chris Aniszczyk 43156bf045 Merge "Remember non-note tree entries when reading" 2010-11-13 12:29:31 -05:00
Shawn O. Pearce 51bf8ea2a4 Merge branch 'rename-detection'
* rename-detection:
  RenameDetector: Only scan deletes if adds exist
  SimilarityRenameDetector: Initialize sizes to 0
  SimilarityRenameDetector: Avoid allocating source index
  SimilarityRenameDetector: Only attempt to index large files once
  SimilarityIndex: Don't overflow internal counter fields
  SimilarityIndex: Accept files larger than 8 MB
  SimilarityIndex: Correct comment explaining the logic
2010-11-12 16:15:43 -08:00
Shawn O. Pearce c35f98b226 Merge branch 'fs-fsync'
* fs-fsync:
  Remove unnecessary flush calls from LockFile
  Remove unnecessary region locking from LockFile
  Support core.fsyncRefFiles option
  Support core.fsyncObjectFiles option
  Simplify LockFile write(ObjectId) case
2010-11-12 16:12:27 -08:00
Shawn O. Pearce ef70a12fd1 Base64: Reformat to match JGit style
Rewrite the initialization of the encoding tables to be more clear,
but slightly slower to set up.  We generally prefer a clear definition
of the data over a slightly slower class load time.

Change-Id: I0c7f89b6ab82dcf71525ffb69a388c312c195913
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 16:05:00 -08:00
Shawn O. Pearce d2ce91199e Base64: Strip out code JGit doesn't use
Since we have already modified this class to localize an error
message, we might as well strip it down to contain only the
functionality we need, or might ever use.

To keep this simple to review we don't adjust formatting right
away, so code that was buried inside of an if or else block whose
condition was removed might not have the correct indentation anymore.
We can fix this with a later reformatting change.

Change-Id: I2996aaa704e9d6182e5500c7a63240d5e9d722cc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 16:01:05 -08:00
Christian Halstrick 484807e82b Added one-tree constructor to DirCacheCheckout
When DirCacheCheckout should be used to checkout only one
tree (reset --hard, clone) then we had to use the standard
constructor and specify null as value for head. This change
adds explicit constructors not taking HEAD and documents
that.

Bug: 330021
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2010-11-13 00:45:50 +01:00
Shawn O. Pearce e7e9a47b52 Remove unnecessary note fanout when removing notes
Fanout level notes trees are combined back together into a flat leaf
level tree if during a removal of a subtree there are less than 3/4 of
the fanout subtrees still existing, and the size of the combined leaf
is under the 256 split limit noted above.

This rule is used because deletes are less common than insertions, and
SHA-1's relatively uniform distribution suggests that with only 192
subtrees existing in the fanout, there should be approximately 192
names in the combined replacement leaf tree.

Change-Id: Ia9d145ffd5454982509fc40906bc4dbbf2b13952
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 14:01:28 -08:00
Shawn O. Pearce 2b0df15f7f Split note leaf buckets at 256 elements
Leaf level notes trees are split into a new fan-out tree if an
insertion occurs and the tree already contains >= 256 notes in it.

The splitting may occur multiple times if all of the notes have the
same prefix; in the worst case this produces a tree path such as
"00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/be" if all
of the notes begin with zeros.

Change-Id: I2d7d98f35108def9ec49936ddbdc34b13822a3c7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 14:01:28 -08:00
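The fan-out path shape mentioned in the commit above can be reproduced with a small helper that splits the leading hex digits of an id into two-character directory names. The class and method are hypothetical and only illustrate the path layout; they are not the NoteMap implementation.

```java
/** Hypothetical sketch of the notes fan-out path layout. */
final class NoteFanoutSketch {
    private NoteFanoutSketch() {
    }

    /**
     * Splits the first {@code levels} pairs of hex digits into directories
     * and keeps the remaining digits as the final path element.
     */
    static String fanoutPath(String hexId, int levels) {
        if (levels < 0 || levels * 2 >= hexId.length()) {
            throw new IllegalArgumentException("invalid fan-out depth: " + levels);
        }
        StringBuilder path = new StringBuilder(hexId.length() + levels);
        for (int i = 0; i < levels; i++) {
            path.append(hexId, 2 * i, 2 * i + 2).append('/');
        }
        path.append(hexId, 2 * levels, hexId.length());
        return path.toString();
    }
}
```

With a 40-digit id and 19 levels, the deepest split possible, this yields twenty two-digit elements, which matches the worst-case path quoted in the commit message.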
Shawn O. Pearce 3728918d72 Add internal API for note iteration
Some algorithms need to be able to iterate through all notes within a
particular bucket, such as when splitting or combining a bucket.
Exposing an Iterator<Note> makes this traversal possible.

For a LeafBucket the iteration is simple, it's over the sorted array of
elements.  For FanoutBucket it's a bit more complex as the iteration
needs to union the iterators of each fanout bucket, lazily loading any
buckets that aren't already in-memory.

Change-Id: I3d5279b11984f44dcf0ddb14a82a4b4e51d4632d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 14:01:28 -08:00
Shawn O. Pearce 3e2b9b691e Allow writing a NoteMap back to the repository
This is necessary to allow applications to wrap the note tree in
a commit and update the note branch with the new state.

Change-Id: Idbd7ead4a1b16ae2b64a30a4a01a29cfed548cdf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 14:01:28 -08:00
Shawn O. Pearce faa0747cce Add in-memory updating support to NoteMap
NoteMap now supports editing in-memory, allowing applications to
modify the NoteMap once it has been loaded from the branch.  The
ability to write the branch back to tree objects is not yet done,
so the edits are strictly transient.

Change-Id: I63448954abfca2a8e3e95369cd84c0d1176cdb79
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 14:01:24 -08:00
Shawn O. Pearce 2f6e79307d Remove unnecessary flush calls from LockFile
Change-Id: I144af9db4714acabd796880be73bd50d84b92efe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 13:38:13 -08:00
Shawn O. Pearce ed5fe8af9a Remove unnecessary region locking from LockFile
The lock file protocol relies on the atomic creation of a standardized
name in the parent directory of the file being updated.  Since the
creation is atomic, at most one thread in any process can succeed on
this creation, and all others will fail.  While the lock file exists,
that file is private to the thread that is writing it, and no others
will attempt to read or modify the file.

Consequently the use of region level locks around the file is
unnecessary, and may actually reduce performance when using NFS, SMB,
or some other sort of remote filesystem that supports locking.
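
The protocol described above can be sketched with
java.io.File.createNewFile(), which creates the file atomically and
reports whether this caller was the one that created it.  The ".lock"
suffix and the class below are assumptions for illustration only,
not the LockFile implementation.

```java
import java.io.File;
import java.io.IOException;

final class LockFileSketch {
    /**
     * Tries to take the lock for target by atomically creating
     * "<name>.lock" beside it.  Returns the lock file, or null if
     * another writer already holds the lock.
     */
    static File tryLock(File target) throws IOException {
        File lock = new File(target.getParentFile(),
                target.getName() + ".lock");
        // createNewFile succeeds for at most one caller at a time.
        return lock.createNewFile() ? lock : null;
    }

    /** Releases the lock by removing the lock file. */
    static void unlock(File lock) {
        lock.delete();
    }
}
```

No FileLock or region lock is taken; exclusion comes entirely from
the atomic creation of the lock file name.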

Change-Id: Ice312b6fb4fdf9d36c734c3624c6d0537903913b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 13:38:06 -08:00
Shawn O. Pearce e0e7fe531d Support core.fsyncRefFiles option
If core.fsyncRefFiles is set to true, fsync is used whenever a
reference file is updated, ensuring the file contents are also
written to disk.  This can help to prevent empty ref files after
a system crash when using a filesystem such as HFS+ where data
writes may be delayed.

Change-Id: Ie508a974da50f63b0409c38afe68772322dc19f1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 13:38:04 -08:00
Shawn O. Pearce 24fccadeda Support core.fsyncObjectFiles option
Some repositories may be on really unstable filesystems, but still
want to have good reliability when objects are written to disk.  If
core.fsyncObjectFiles is set to true, request the JVM to ensure the
data is written before returning success to the caller of insert.

The option defaults to false because it should be useless on any
filesystem that orders writes and metadata, such as ext3 mounted with
data=ordered (or data=journal).  But it may be useful on some systems
(especially HFS+) where file content may flush to the disk
independently of filesystem structure changes.

Because FileChannel.force(boolean) only claims to ensure data is
written if it was written using the write(ByteBuffer) method of
FileChannel, redirect all writes when using fsyncObjectFiles to go
through the FileChannel interface instead of through the older style
OutputStream interface.  This may not be necessary on all JVMs, but
it's more portable to follow the definition than the common behavior.
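
A minimal sketch of that write path, assuming a plain byte array as
input.  It uses FileChannel.open, which is newer than this change,
purely to keep the example short; the class and method names are
hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FsyncWriteSketch {
    /** Writes data through a FileChannel and forces it out before returning. */
    static void writeAndSync(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            // write(ByteBuffer) may consume only part of the buffer.
            while (buf.hasRemaining())
                ch.write(buf);
            // true also requests that file metadata be written.
            ch.force(true);
        }
    }
}
```

Every byte goes through write(ByteBuffer), so the force call covers
all of the data under the documented contract of FileChannel.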

Change-Id: I57f6b6bb7e403c07fbae989dbf3758eaf5edbc78
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 13:37:27 -08:00
Shawn O. Pearce bc9bca064d RenameDetector: Only scan deletes if adds exist
If there are only deletes, there is no need to perform rename or copy
detection.  There are no adds (aka destinations) for the deletes
to match against.

Change-Id: I00fb90c509fa26a053de561dd8506cc1e0f5799a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:57:02 -08:00
Shawn O. Pearce 05653bda04 SimilarityRenameDetector: Initialize sizes to 0
Setting the array elements to -1 is more expensive than relying on
the allocator to zero the array for us first.  Shifting the code to
always add 1 to the size (so an empty file is actually 1 byte long)
allows us to detect an unloaded size by comparing to 0, thus saving
the array fill calls.
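
The idea can be sketched as a small size cache where 0 means "not
loaded yet" and any other value is the real size plus 1.  This is an
illustration with hypothetical names; the loader function stands in
for whatever reads the size of a file.

```java
import java.util.function.IntToLongFunction;

final class SizeCacheSketch {
    // 0 means "not loaded"; otherwise the stored value is size + 1.
    private final long[] sizes;
    private final IntToLongFunction loader;

    SizeCacheSketch(int count, IntToLongFunction loader) {
        // A new array is already zero-filled, so no fill call is needed.
        this.sizes = new long[count];
        this.loader = loader;
    }

    /** Returns the size for idx, invoking the loader at most once per index. */
    long size(int idx) {
        long cached = sizes[idx];
        if (cached == 0) {
            cached = loader.applyAsLong(idx) + 1;
            sizes[idx] = cached;
        }
        return cached - 1;
    }
}
```

An empty file is stored as 1, so it is still distinguishable from an
entry that has not been loaded.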

Change-Id: Iad859e910655675b53ba70de8e6fceaef7cfcdd1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:57:02 -08:00
Shawn O. Pearce 68baa3097e SimilarityRenameDetector: Avoid allocating source index
If the only file added is really small, and all of the deleted
files are really big, none of the permutations will match up due
to the sizes being too far apart to fit the current rename score.

Avoid allocating the really big deleted SimilarityIndex by deferring
its construction until at least one add along that row has a
reasonable chance of matching it.

This avoids expending a lot of CPU time looking at big deleted
binary files when a small modified text file was broken due to a
high percentage of changed lines.
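
A rough sketch of the deferral, assuming that "too far apart" means
the smaller size as a percentage of the larger size falls below the
rename score.  That test, and every name below, is an assumption
made for illustration rather than the detector's actual code.

```java
final class DeferredIndexSketch {
    /** True if the two sizes are close enough for minScore (0..100) to be reachable. */
    static boolean sizesCompatible(long srcSize, long dstSize, int minScore) {
        long max = Math.max(srcSize, dstSize);
        long min = Math.min(srcSize, dstSize);
        if (max == 0)
            return true; // two empty files
        return min * 100 / max >= minScore;
    }

    /** Counts how many source indexes would have to be built. */
    static int indexesNeeded(long[] srcSizes, long[] dstSizes, int minScore) {
        int built = 0;
        for (long src : srcSizes) {
            boolean haveIndex = false;
            for (long dst : dstSizes) {
                if (!sizesCompatible(src, dst, minScore))
                    continue; // cannot match, skip without indexing
                if (!haveIndex) {
                    // The expensive source index would be built here,
                    // on the first plausible destination in this row.
                    haveIndex = true;
                    built++;
                }
                // Scoring of src against dst would follow.
            }
        }
        return built;
    }
}
```

A large deleted file paired only with small added files never passes
the size check, so its index is never constructed.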

Change-Id: I11ae37edb80a7be1eef8cc01d79412017c2fc075
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:57:02 -08:00
Shawn O. Pearce 918e6e20f0 SimilarityRenameDetector: Only attempt to index large files once
If a file fails to index the first time the loop encounters it, the
file is likely to fail to index again on the next row.  Rather than
wasting a huge amount of CPU to index it again and fail, remember
which destination files failed to index and skip over them on each
subsequent row.

Because this condition is very unlikely, avoid allocating the BitSet
until it's actually needed.  This keeps the memory usage unaffected
for the common case.
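
One way to picture this is a tracker whose BitSet stays null until
the first failure.  The names are hypothetical, and the IntPredicate
stands in for the real indexing step, returning false on failure.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

final class FailedIndexSketch {
    private BitSet failed; // allocated only after the first failure

    /** Returns true if dstIdx was indexed; a failed destination is never retried. */
    boolean tryIndex(int dstIdx, IntPredicate indexer) {
        if (failed != null && failed.get(dstIdx))
            return false; // failed on an earlier row, skip the work
        if (indexer.test(dstIdx))
            return true;
        if (failed == null)
            failed = new BitSet();
        failed.set(dstIdx);
        return false;
    }
}
```

In the common case no destination fails, so no BitSet is ever
created.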

Change-Id: I93509b28b61a9bba8f681a7b4df4c6127bca2a09
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:57:02 -08:00
Shawn O. Pearce 0e307a6afd SimilarityIndex: Don't overflow internal counter fields
The counter portion of each pair is only 32 bits wide, but is part
of a larger 64 bit integer.  If the file size was larger than 4 GB
the counter could overflow and impact the key, changing the hash,
and later resulting in an incorrect similarity score.

Guard against this overflow condition by capping the count for each
record at 2^32-1.  If any record contains more than that many bytes
the table aborts hashing and throws TableFullException.

This permits the index to scan and work on files that exceed 4 GB
in size, but only if the file contains more than one unique block.
The index throws TableFullException on a 4 GB file containing all
zeros, but should succeed on a 6 GB file containing unique lines.

The index now uses a 64 bit accumulator during the common scoring
algorithm, possibly resulting in slower summations.  However this
index is already heavily dependent upon 64 bit integer operations
being efficient, so increasing from 32 bits to 64 bits allows us
to correctly handle 6 GB files.
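
The overflow guard can be illustrated with a record that packs a 32
bit key into the high half of a long and a 32 bit count into the low
half.  The exact layout, the names, and the use of
IllegalStateException in place of TableFullException are assumptions
for this sketch.

```java
final class PackedCounterSketch {
    static final long MAX_COUNT = (1L << 32) - 1;

    /** Packs key into the high 32 bits and count (0..MAX_COUNT) into the low 32. */
    static long pack(int key, long count) {
        return ((long) key << 32) | count;
    }

    static int keyOf(long rec) {
        return (int) (rec >>> 32);
    }

    static long countOf(long rec) {
        return rec & MAX_COUNT;
    }

    /**
     * Adds cnt (assumed non-negative) to the record's counter, refusing
     * to let the sum carry into the key bits.
     */
    static long add(long rec, long cnt) {
        long total = countOf(rec) + cnt;
        if (total > MAX_COUNT)
            throw new IllegalStateException("counter would overflow 32 bits");
        return pack(keyOf(rec), total);
    }
}
```

Without the check, a carry out of the low 32 bits would silently
change the key stored in the high 32 bits.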

Change-Id: I14e6dbc88d54ead19336a4c0c25eae18e73e6ec2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:57:02 -08:00
Shawn O. Pearce d63887127e SimilarityIndex: Accept files larger than 8 MB
Files bigger than 8 MB (2^23 bytes) tended to overflow the internal
hashtable, as the table was capped in size to 2^17 records.  If a
file contained 2^17 unique data blocks/lines, the table insertion
got stuck in an infinite loop as the table couldn't grow, and there
was no open slot for the new item.

Remove the artificial 2^17 table limit and instead allow the table
to grow to be as big as 2^30.  With a 64 byte block size, this
permits hashing inputs as large as 64 GB.

If the table reaches 2^30 (or cannot be allocated) hashing is
aborted.  RenameDetector no longer tries to break a modify file pair,
and it does not try to match the file for rename or copy detection.

Change-Id: Ibb4d756844f4667e181e24a34a468dc3655863ac
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:56:59 -08:00
Shawn O. Pearce f3b511568b SimilarityIndex: Correct comment explaining the logic
This comment was wrong, due to a copy-and-paste error.  Here the
code is looking at records of dst that do not exist in src, and
is skipping past them to find another match.

Change-Id: I07c1fba7dee093a1eeffcf7e0c7ec85446777ffb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-12 11:56:57 -08:00
Shawn Pearce e8315ce19d Merge "Fix null ref exception in DirCacheCheckout" 2010-11-12 11:29:32 -05:00
Shawn O. Pearce 5a2cbd4aa7 Remember non-note tree entries when reading
In order to safely edit a notes tree, NoteMap needs to retain any
non-note tree entries it read from the source tree and put them
back out into the modified tree when it commits a new version of
the note branch.

Remember any tree entries that didn't look like a note during
the parsing of the tree, so they can be put into a TreeFormatter
later when the tree writes to the repository.

Change-Id: Ia284af7e7866da35db35374c6c5869f00c857944
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-11 10:57:16 -08:00
Shawn O. Pearce b81b97fbdd Lazy load note subtrees from fanout levels
Instead of reading a note tree recursively up front when the NoteMap
is loaded, read only the root tree and load subtrees on demand when
they are accessed by the application.  This gives a lower latency
to read a note for the recent commits on a branch, as only the paths
that are needed get read.

Given a 2/38 style fanout, the tree will fully load when 256 objects
have been accessed by the application.  But unlike the prior version
of NoteMap, the NoteMap will load faster and answer lookups sooner,
as the loading time for all 256 subtrees is spread out across each of
the get() requests.

Given a 2/2/36 style fanout, the tree won't need to fully load until
about 65,536 objects are accessed.
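
A small sketch of one fanout level with on-demand loading, assuming
the first two hex digits of a name select one of 256 child slots.
With a second level each child would itself hold 256 slots, giving
256 * 256 = 65,536 leaves.  The class is hypothetical and the loader
function stands in for reading a subtree from the repository.

```java
import java.util.function.IntFunction;

final class LazyFanoutSketch<T> {
    private final Object[] children = new Object[256];
    private final IntFunction<T> loader;
    private int loaded;

    /** The loader is assumed to return a non-null subtree for any slot. */
    LazyFanoutSketch(IntFunction<T> loader) {
        this.loader = loader;
    }

    /** Returns the child for the first two hex digits, loading it on first use. */
    @SuppressWarnings("unchecked")
    T childFor(String hexName) {
        int cell = Integer.parseInt(hexName.substring(0, 2), 16);
        if (children[cell] == null) {
            children[cell] = loader.apply(cell);
            loaded++;
        }
        return (T) children[cell];
    }

    /** Number of subtrees read so far. */
    int loadedCount() {
        return loaded;
    }
}
```

Only slots that are actually looked up get loaded, so reading notes
for a handful of commits touches a handful of subtrees.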

To simplify the implementation we only support the flat layout (all
notes in the top level tree), or a 2/38, 2/2/36, 2/2/2/34, through
2/.../2 style fanout.  Unlike C Git we don't support reading the old
experimental 4/36 fanout.  This is sufficient because C Git won't
create the 4/36 style fanout when creating or updating a notes tree,
and there really aren't any in the wild today.

Change-Id: I6099b35916a8404762f31e9c11f632e43e0c1bfd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-11 10:23:38 -06:00
Shawn O. Pearce 936820988f Define NoteMap, a simple note tree reader
The NoteMap makes it easy to read a small notes tree as created by
the `git notes` command in C Git.  To make the initial implementation
simple a notes tree is read recursively into a map in memory.
This is reasonable if the application will need to access all notes,
or if there are fewer than 256 notes in the tree, but doesn't behave
well when the number of notes exceeds 256 and the application
doesn't need to access all of them.

We can later add support for lazily loading different subpaths,
thus fixing the large note tree problem described above.

Currently the implementation only supports reading.  Writing notes
is more complex because trees need to be expanded or collapsed at
the exact 256 entry cut-off in order to retain the same tree SHA-1
that C Git would use for the same content.  It also needs to retain
non-note tree entries such as ".gitignore" or ".gitattributes" files
that might randomly appear within a notes tree.  We can also add
writing support later.

Change-Id: I93704bd84ebf650d51de34da3f1577ef0f7a9144
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-11-11 10:06:43 -06:00
Chris Aniszczyk 6043d4638c Merge "Add MutableObjectId setByte to modify a mutable id" 2010-11-11 10:52:37 -05:00
Chris Aniszczyk 573666403d Merge "Support CredentialsProvider for SSH connections" 2010-11-11 10:27:52 -05:00
Stefan Lay 33c419fdfe Merge "Define a default CredentialsProvider" 2010-11-11 09:36:34 -05:00
Stefan Lay dcac1fe4bf Merge "Enable providing credentials for HTTP authentication" 2010-11-11 09:35:43 -05:00
Chris Aniszczyk 9e28cf2fa3 Merge "Add ObjectId getByte for random access" 2010-11-10 18:00:36 -05:00
Shawn O. Pearce d279bc83b0 Support CredentialsProvider for SSH connections
When setting up an SSH connection, use the caller supplied
CredentialsProvider, if one has been given to the Transport
or was defined as the default.

The CredentialsProvider is re-wrapped as a JSch UserInfo,
allowing the connection to use this for user interactive
prompts.  This gives a unified API for authentication on
any transport type.

Change-Id: Id3b4cf5bfd27a23207cdfb188bae3b78e71e02c0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-10 15:00:13 -08:00
Shawn O. Pearce ce99b48384 Define a default CredentialsProvider
This permits applications to set their preferred credentials UI
implementation once, rather than needing to define it on every
single Transport instance they open.

Change-Id: I010550de1a6becab27f7aa5a9901df5a1c7e74bd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-11-10 14:58:45 -08:00