The method canAmend was added to RepositoryState. It returns true if
amending the HEAD commit is allowed in the current repository state.
Change-Id: Idd0c4eea83a23c41340789b7b877959b457d951e
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
When SSH user/password authentication failed this may have been caused
by changed credentials on the server side. When the SSH credentials of a
user change the SSH connection needs to be re-established and
credentials which may have been stored by the credentials provider
need to be reset in order to enable prompting for the new credentials.
Bug: 356233
Change-Id: I7d64c5f39b68a9687c858bb68a961616eabbc751
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Moves the check from inside the loop to outside the loop
and returns immediately if the HEAD advertisded ref is null
Change-Id: I539da6cafb4f73610b8e00259e32bd4d57f4f4cc
Commit 13931236b9ee2895a98ffdbdacbd0f895956d8a8 in C Git (2011-11-02)
changed the message format:
-Merge remote branch 'origin/foo'
+Merge remote-tracking branch 'origin/foo'
This change does the same in EGit to be consistent.
Change-Id: I7d9c5afa95771dbfe6079b5f89a10b248fee0172
Signed-off-by: Robin Stocker <robin@nibor.org>
Throw a NoHeadException when Repository.getFullBranch
returns null
Bug: 351543
Change-Id: I666cd5b67781508a293ae553c6fe5c080c8f4d99
Signed-off-by: Kevin Sawicki <kevin@github.com>
ReceivePack (and PackParser) can be configured with the
maxObjectSizeLimit in order to prevent users from pushing too large
objects to Git. The limit check is applied to all object types
although it is most likely that a BLOB will exceed the limit. In all
cases the size of the object header is excluded from the object size
which is checked against the limit as this is the size of which a BLOB
object would take in the working tree when checked out as a file.
When an object exceeds the maxObjectSizeLimit the receive-pack will
abort immediately.
Delta objects (both offset and ref delta) are also checked against the
limit. However, for delta objects we will first check the size of the
inflated delta block against the maxObjectSizeLimit and abort
immediately if it exceeds the limit. In this case we even do not know
the exact size of the resolved delta object but we assume it will be
larger than the given maxObjectSizeLimit as delta is generally only
chosen if the delta can copy more data from the base object than the
delta needs to insert or needs to represent the copy ranges. Aborting
early, in this case, avoids unnecessary inflating of the (huge) delta
block.
Unfortunately, it is too expensive (especially for a large delta) to
compute SHA-1 of an object that causes the receive-pack to abort.
This would decrease the value of this feature whose main purpose is to
protect server resources from users pushing huge objects. Therefore
we don't report the SHA-1 in the error message.
Change-Id: I177ef24553faacda444ed5895e40ac8925ca0d1e
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This allows callers to determine why the revert
did not complete successfully
Change-Id: Ie44bb8523cac388b63748bc69ebdd3c3a3665d06
Signed-off-by: Kevin Sawicki <kevin@github.com>
The DfsReader must offer every representation of an object that
exists on the local repository when PackWriter asks for them. This
is necessary to identify objects in the thin pack part that are also
in the cached pack that will be appended onto the end of the stream.
Without looking at all alternatives, PackWriter may pack the same
object twice (once in the thin section, again in the cached base
pack). This may cause the command line C version to go into an
infinite loop when repacking the resulting repository, as it may see
a delta chain cycle with one of those duplicate copies of the object.
Previously the DfsReader tried to avoid looking at packs that it
might not care about, but this is insufficient, as all versions
must be considered during pack generation.
Change-Id: Ibf4a3e8ea5c42aef16404ffc42a5781edd97b18e
Consider two objects A->B where A uses B as a delta base, and these
are in the same source pack file ordered as "A B".
If cached packs is enabled and B is also in the cached pack that
will be appended onto the end of the thin pack, and both A, B are
supposed to be in the thin pack, PackWriter must consider the fact
that A's base B is an edge object that claims to be part of the
new pack, but is actually "external" and cannot be written first.
If the object reuse system considered B candidates fist this bug
does not arise, as B will be marked as edge due to it existing in
the cached pack. When the A candidates are later examined, A sees a
valid delta base is available as an edge, and will not later try to
"write base first" during the writing phase.
However, when the reuse system considers A candidates first they
see that B will be in the outgoing pack, as it is still part of
the thin pack, and arrange for A to be written first. Later when A
switches from being in-pack to being an edge object (as it is part
of the cached pack) the pointer in B does not get its type changed
from ObjectToPack to ObjectId, so B thinks A is non-edge.
We work around this case by also checking that the delta base B
is non-edge before writing the object to the pack. Later when A
writes its object header, delta base B's ObjectToPack will have
an offset == 0, which makes isWritten() = false, and the OBJ_REF
delta format will be used for A's header. This will be resolved by
the client to the copy of B that appears in the later cached pack.
Change-Id: Ifab6bfdf3c0aa93649468f49bcf91d67f90362ca
Packs can contain up to 2^32-1 objects, which exceeds the range of a
Java int. Try harder to accept higher object counts in some cases by
using long more often when we are working with the object count value.
This is a trivial refactoring, we may have to make even more changes
to the object handling code to support more than 2^31-1 objects.
Change-Id: I8cd8146e97cd1c738ad5b48fa9e33804982167e7
Annotated tags are relatively rare and currently are scheduled in a
pack file near the commits, decreasing the time it takes to resolve
client requests reading tags as part of a history traversal.
Putting them first before the commits allows the storage system to
page in the tag area, and have it relatively hot in the LRU when
the nearby commit area gets examined too. Later looking at the
tree and blob data will pollute the cache, making it more likely
the tags are not loaded and would require file IO.
Change-Id: I425f1f63ef937b8447c396939222ea20fdda290f
This counter always was running 1 higher, because it incremented
after the queue was exhausted (and every object was processed). Move
increments to be after the queue has provided a result, to ensure
we do not show a higher in-progress count than total count.
Change-Id: I97f815a0492c0957300475af409b6c6260008463
Make the code more clear with a simple refactoring of the boolean
logic into a method that describes the condition we are looking
for on each pack file. A cached pack is possible if there exists
a tips collection, and the collection is non-empty.
Change-Id: I4ac42b0622b39d159a0f4f223e291c35c71f672c
Double ' characters are needed for variables to appear in
single quotes. Variables surrounded with a s single ' will
not be replaced when formatted
Change-Id: I0182c1f679ba879ca19dd81bf46924f415dc6003
Signed-off-by: Kevin Sawicki <kevin@github.com>
As the method name and its javadoc clearly state that this method can
return null we can ignore this FindBugs warning.
Change-Id: I366435e26eda5d910f5d1a907db51f08efd4bb8c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Stored in a weak concurrent hash map, which we clean up while iterating.
Usually the weak reference behavior should not be necessary because
PackWriters should be released with release(), but we still want to
avoid leaks when dealing with broken client code.
Change-Id: I337abb952ac6524f7f920fedf04065edf84d01d2
Memory usage is dominated by three terms:
- The maximum memory allocated to each delta window.
- The maximum size of a single file held in memory during delta search.
- ObjectToPack instances owned by the writer.
For the first two terms, rather than doing complex instrumentation of
the DeltaWindows, we just overestimate based on the config parameters
(though we may underestimate if the maximum size is not set).
For the ObjectToPack instances, we do some rough byte accounting of the
underlying Java object representation.
Change-Id: I23fe3cf9d260a91f1aeb6ea22d75af8ddb9b1939
Exposes essentially the same state machine to the programmer as is
exposed to the client via a ProgressMonitor, using a wrapper around
beginTask()/endTask().
Change-Id: Ic3622b4acea65d2b9b3551c668806981fa7293e3
Decorators need to know whether folders in the working tree contain only
untracked files. This change enhances IndexDiffFilter to report such
folders. This works only together with treewalks which operate in
default traversal mode. For treewalks which process entries in
postorder mode (files are walked before their parent folder is walked)
this detection doesn't work.
Bug: 359264
Change-Id: I9298d1e3ccac0aec8bbd4e8ac867bc06a5c89c9f
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Mark the treeWalk as recursive; otherwise following renames only works
for toplevel files.
Bug: 302549
Change-Id: I70867928eadf332b0942f8bf6877a3acb3828c87
Signed-off-by: Carsten Pfeiffer <carsten.pfeiffer@gebit.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Refactored the three common transport configuration options:
credentials provider, timeout, and transport config callback
into a new TransportCommand base class which is now extended
by all commands that use a Transport object during execution.
Bug: 349188
Change-Id: I90c2c14fb4e3cc4712905158f9047153a0c235c2
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
A few places were still using GitIndex. Replacing it was fairly
simple, but there is a difference in test outcome in
ReadTreeTest.testUntrackedConflicts. I believe the new behavior
is good, since we do not update neither the index, not the worktree.
Change-Id: I4be5357b7b3139dded17f77e07a140addb213ea7
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
This includes merging ReadTreeTest into DirCacheCheckoutTest and
converting IndexDiffTest to use DirCache only. The GitIndex specific
T0007GitIndex test remains.
GitIndex is deprecated. Let us speed up its demise by focusing the
DirCacheCheckout tests to using DirCache instead.
This also add explicit deprecation comments to methods that depend
on GitIndex in Repository and TreeEntry. The latter is deprecated in
itself.
Change-Id: Id89262f7fbfee07871f444378f196ded444f2783
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Any RuntimeException or Error in this block will leave the lock
held by the caller thread, which can later result in deadlock or
just cache requests hanging forever because they cannot get to
the lock object.
Wrap everything in try/finally to prevent the lock from hanging,
even though a RuntimeException or Error should never happen in
any of these code paths.
Change-Id: Ibb3467f7ee4c06f617b737858b4be17b10d936e0
The cache starts with a single empty Ref that has no data, as the
clock list does not support being empty. When this Ref is removed,
the size has to be decremented from the associated DfsPackKey,
which was previously null. Make it always be non-null.
Change-Id: I2af99903e8039405ea6d67f383576ffa43839cff
* changes:
DfsBlockCache: Update hits to not include contains()
Add a listener for changes to a DfsObjDatabase's pack files
Expose the reverse index size in the DfsPackDescription
Add a DfsPackFile method to get the number of cached bytes
Expose the list of pack files in the DfsBlockCache
Add a DFS repository description and reference it in each pack
Clarify the docstring of DfsBlockCache.reconfigure()
DFS: A storage layer for JGit
Intended for cross-request use, so only refers to
DfsRepositoryDescriptions rather than DfsRepositorys.
Change-Id: I2633e472c9264d91d632069f608d53d4bdd0fc09
Callers may want to inspect the contents of the cache, which this allows
them to do in a read-only fashion without any locking.
Change-Id: Ifd78e8ce34e26e5cc33e9dd61d70c593ce479ee0
Just as DfsPackDescription describes a pack but does not imply it is
open in memory, a DfsRepositoryDescription describes a repository at a
basic level without it necessarily being open.
Change-Id: I890b5fccdda12c1090cfabf4083b5c0e98d717f6
The docstring was copied from the local filesystem cache code, which
actually attempted to reconfigure the cache on the fly. The DFS cache is
designed to be "reconfigured" exactly once.
Change-Id: Ia0b01f5d6b6b3d3a68d65a5c229ff67c1cede5bc
In practice the DHT storage layer has not been performing as well as
large scale server environments want to see from a Git server.
The performance of the DHT schema degrades rapidly as small changes
are pushed into the repository due to the chunk size being less than
1/3 of the pushed pack size. Small chunks cause poor prefetch
performance during reading, and require significantly longer prefetch
lists inside of the chunk meta field to work around the small size.
The DHT code is very complex (>17,000 lines of code) and is very
sensitive to the underlying database round-trip time, as well as the
way objects were written into the pack stream that was chunked and
stored on the database. A poor pack layout (from any version of C Git
prior to Junio reworking it) can cause the DHT code to be unable to
enumerate the objects of the linux-2.6 repository in a completable
time scale.
Performing a clone from a DHT stored repository of 2 million objects
takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row
for each object being cloned. This is very difficult for some DHTs to
scale, even at 5000 rows/second the lookup stage alone takes 6 minutes
(on local filesystem, this is almost too fast to bother measuring).
Some servers like Apache Cassandra just fall over and cannot complete
the 2 million lookups in rapid fire.
On a ~400 MiB repository, the DHT schema has an extra 25 MiB of
redundant data that gets downloaded to the JGit process, and that is
before you consider the cost of the OBJECT_INDEX table also being
fully loaded, which is at least 223 MiB of data for the linux kernel
repository. In the DHT schema answering a `git clone` of the ~400 MiB
linux kernel needs to load 248 MiB of "index" data from the DHT, in
addition to the ~400 MiB of pack data that gets sent to the client.
This is 193 MiB more data to be accessed than the native filesystem
format, but it needs to come over a much smaller pipe (local Ethernet
typically) than the local SATA disk drive.
I also never got around to writing the "repack" support for the DHT
schema, as it turns out to be fairly complex to safely repack data in
the repository while also trying to minimize the amount of changes
made to the database, due to very common limitations on database
mutation rates..
This new DFS storage layer fixes a lot of those issues by taking the
simple approach for storing relatively standard Git pack and index
files on an abstract filesystem. Packs are accessed by an in-process
buffer cache, similar to the WindowCache used by the local filesystem
storage layer. Unlike the local file IO, there are some assumptions
that the storage system has relatively high latency and no concept of
"file handles". Instead it looks at the file more like HTTP byte range
requests, where a read channel is a simply a thunk to trigger a read
request over the network.
The DFS code in this change is still abstract, it does not store on
any particular filesystem, but is fairly well suited to the Amazon S3
or Apache Hadoop HDFS. Storing packs directly on HDFS rather than
HBase removes a layer of abstraction, as most HBase row reads turn
into an HDFS read.
Most of the DFS code in this change was blatently copied from the
local filesystem code. Most parts should be refactored to be shared
between the two storage systems, but right now I am hesistent to do
this due to how well tuned the local filesystem code currently is.
Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
Actually this is not ok according to the RFC, but this implementation is
ment to be Git compatible. A '\' is needed when the authentication
requires or allows authentication to a Windows domain where the
user name can be specified as DOMAIN\user.
Change-Id: If02f258c032486f1afd2e09592a3c7069942eb8b
Open a repository for submodule entries that have a child .git
directory and use the resolved HEAD commit as the entry's id.
Change-Id: I68d6e127f018b24ee865865a2dd3011a0e21453c
Signed-off-by: Kevin Sawicki <kevin@github.com>
The system property jgit.cygpath must be set to true in order
for cygwin's cygpath to be used to translate path from cygwin
namespace to Windows namespace.
The cygwin path translation should be considered deprecated.
Bug: 353389
Change-Id: I2b5234c0ab936dac67d1e232f4cd28331bf3226d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
* changes:
Cosmetic adjustment of relative date format, do not display "0 months"
Make use of the many date formatting options in the log command
Define a utility class for handling Git date formats
Though it may seem less precise, "0 months" looks bad and the reference
Git implementation also does not display "0 months"
Change-Id: I488e9c97656f9941788ae88d7c5c1562ab6c26f0
The egit history view shows the files associated with a commit by using
a PathFilter. When following renames with a FollowFilter, the PathFilter
cannot be configured anymore because the affected files are simply not
known.
Thus, it should be possible to get to know which files are renamed.
Bug: 302549
Change-Id: I4761e9f5cfb4f0ef0b0e1e38991401a1d5003bea
Introducing a new abstract method is not nice when one
expects other to subclass them. Create default implementations
so old code that implements SystemReader does not break.
The default methods just delegate to the JVM.
Change-Id: I42cdfdcb6b29f7203697a23833dca85185b0b9b3
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Besides the formats known by git-log(1) we also add "locale"
and "localelocal" that formats dates according to the user's locale.
"locale" does not translate into local timezone, while
localelocal does.
Change-Id: I1c088dcec992c107e43f6c17be4ac9ed6eb428bf
We deleted the entry if there was a file and an index
entry, but not when there was just an index entry. Now
delete the file in both cases since the missing file
just means our worktree is dirty. This affected the
implementation of reset --hard.
Bug: 347574
Change-Id: Ie66fa61303472422830f5e33614e93ad65094e5d
* changes:
UploadPack: Fix races in smart HTTP negotiation
PackWriter: Export more statistics
Do not requeue state vector in stateless RPC fetch
Wrap excessively long line in BasePackFetchConnection
Fix smart HTTP client stream alignment errors
IndexDiff was extended to calculate ignored files and folders.
The calculation only considers files that are NOT in the index.
This functionality is required by the new EGit decorator implementation.
Bug: 359264
Change-Id: I8f09d6a4d61b64aeea80fd22bf3a2963c2bca347
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
This allows the following usage pattern:
PathFilterGroup.createFromStrings("path1", "path2");
Change-Id: I589e758cc55873ce75614602e017ac793435e24d
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This constant determine the default start-point, if the user
don't want to create a branch from the current HEAD.
Change-Id: Iea944e11e80134fbafc4c47383457d5ed11a4164
Signed-off-by: Manuel Doninger <manuel.doninger@googlemail.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Since we replaced GitIndex by DirCache JGit didn't fire
IndexChangedEvents anymore. For EGit this still worked with a high
latency since its RepositoryChangeScanner which is scheduled to
run each 10 seconds fires the event in case the index changes.
This scanner is meant to detect index changes induced by a different
process e.g. by calling "git add" from native git.
When the index is changed from within the same process we should fire
the event synchronously. Compare the index checksum on write to index
checksum when index was read earlier to determine if index really
changed. Use IndexChangedListener interface to keep DirCache decoupled
from Repository.
Change-Id: Id4311f7a7859ffe8738863b3d86c83c8b5f513af
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The checkout command was producing an inconsistent state of the index
which even confuses native git. The content sha1 of the touched index
entries was updated, but the length and the filemode was not updated.
Later in coding the index entries got automatically corrected (through
Dircache.checkoutEntry()) but the correction was after persisting the
index to disk. So, the correction was lost and we ended up with an index
where length and sha1 don't fit together.
A similar problem is fixed with "lastModified" of DircacheEntry. When
checking out a path without specifying an explicit commit (you want to
checkout what's in the index) the index was not updated regarding
lastModified. Readers of the index will think the checked-out
file is dirty because the file has a younger lastmodified then what's
in the index.
Change-Id: Ifc6d806fbf96f53c94d9ded0befcc932d943aa04
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Bug: 355205
Change-Id: Ia0e73208b86c45a3d96698e973f6e70ec5cb7303
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We should see whether the commit was a regular commit or something
else.
Change-Id: I82d8300cf3c53cb2bdcb6495386aadb803e0c6f7
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a TransportConfigCallback parameter to JGit API commands, to allow
consumers of the JGit command API to perform custom Transport configuration
that would be otherwise difficult to anticipate & expose on the API command
builders.
My specific use-case is configuring additional properties on SshTransport
- I need to take over the SshSessionFactory used by the transport. Using
TransportConfigCallback I can simply do this (rather than reimplement the
API command classes):
public void configure(Transport tn) {
if (tn instanceof SshTransport) {
((SshTransport) tn).setSshSessionFactory(factoryProvider.get());
}
}
Adding an explicit setSshSessionFactory() method to the JGit command
classes would bloat the API. Also, creating the replacement
SshSessionFactory is unnecessary if the transport is not SSH, but the type
of the Transport is only known once the remote has been resolved and the
URI parsed - consequently it makes sense to perform this step in a
callback, where the transport instance can be inspected to determine if
it's of a relevant type.
A note about where this leaves the API - there are now 4 commands:
CloneCommand
PullCommand
FetchCommand
PushCommand
-that share 3 identical transport-related parameters:
timeout
credentialsProvider
transportConfigurator
I think there's potential for introducing an interface or val-object to
identify/encapsulate this repetition, which I'd be happy to do in a
subsequent commit.
Change-Id: I8983c3627cdd7d7b2aeb0b6a3dadee553378b951
Signed-off-by: Roberto Tyley <roberto.tyley@gmail.com>
We can detect index changes using FileSnapshot. This is more efficient
and removes usage of a deprecated class.
Change-Id: I4a679102c9a1bd8e82b9ca93eb9dbbde445e9be4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Clients cache the set of advertised references at the start of a
negotiation, and keep replaying the same "want SHA1" list to the
server on each negotiation step. If another client pushes into
a branch and moves it by fast-forward, any request to obtain that
branch's prior SHA-1 is still valid, the commit is reachable from
the new position of the reference. Unfortunately the fast-forward
causes smart HTTP negotations to fail, as the server no longer is
advertising that prior SHA-1.
Instead of causing clients to fail out with a "want invalid" error
and forcing the end-user retry, possibly getting into a never ending
try-fail-retry race while other clients are pushing into the same
busy repository, allow the slightly stale want request so long as
it is still reachable.
C Git implemented this same change recently to fix races on the
smart HTTP protocol when the C Git git-http-backend is used.
The new RequestPolicy feature also allows server authors to make
an even more lenient configuration that exports any SHA-1 to the
client. This might be useful in certain settings where a server
has authenticated the client as the "repository owner" and wants
to allow them to grab any content from the server as a complete
unbroken history chain.
The new setAdvertisedRefs() method allows server authors to manually
fix the references that are advertised, possibly bypassing the
getAllRefs() call on the Repository object.
Change-Id: I7cdb563bf9c55c83653f217f6e53c3add55a0541
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Export the shallow pack information, and also a handy function to
sum up the total times. Include the time writing out the index file,
if it was created.
Change-Id: I7f60ae6848455a357b25feedb23743bbf6c153cf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the no-done capability was enabled on the connection, don't
queue up the state vector again once the ACK %s ready message
is observed from the remote. The pack will be following in this
response stream, so the state vector is no longer required.
Change-Id: I7bd1e76957cb58c7ff1cdaeef227f1b02a7e5d24
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The client's use of UnionInputStream was broken when combined with a
8192 byte buffer used by PackParser. A smart HTTP client connection
always pushes in the execute stateless RPC input stream after the
data stream has ended from the remote peer. At the end of the pack,
PackParser asked to fill a 8192 byte buffer, but if only e.g. 1000
bytes remained UnionInputStream went to the next stream and asked
it for input, which triggered a new RPC, and failed because there
was nothing pending in the request buffer.
Change UnionInputStream to only return what it consumed from a
single InputStream without invoking the next InputStream, just in
case that second InputStream happens to be one of these magical
ones that generates an RPC invocation.
Change-Id: I0e51a8e6fea1647e4d2e08ac9cfc69c2945ce4cb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Calls to unlock the DirCache before throwing an exception
were not needed since checkout calls doCheckout wrapped
in a try block that calls DirCache.unlock in a finally
block.
Change-Id: I2b249a784f9e363430e288aad67fcefb7fac0a6e
Signed-off-by: Kevin Sawicki <kevin@github.com>
* stable-1.1:
Allow commit when submodule changes are present
Ignore submodule on checkout instead of deleting it
cleanup: Reuse local variable for current DirCacheEntry
Prepare post v1.1.0.201109071825-rc3 builds
JGit v1.1.0.201109071825-rc3
Use commit message best practices for Mylyn Commit template
Change-Id: I6ab9e5cb48c036d2ee2e548f5ec040d93672d8ad
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We do not yet check or validate submodules, but can accept that
someone staged a change in a submodule with other tools.
Change-Id: I642ede382314bfbd1892dd509a2222885cc5350a
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
The purpose of this commit is to prevent destruction of
submodules on checkout from a tree with a submodule to
another. For consistency we handle the reverse case too,
when we checkout a branch that has a submodule and the
submodule directory exists. And finally we ignore the
case where the submodule changes.
We do not update the submodules, we just try to ignore
them harder.
Bug: 356664
Change-Id: I202c695a57af99b13d0d7220803fd08def3d9b5e
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Since we already have assigned i.getDirCacheEntry() to dce,
use dce instead.
Change-Id: I107713ad0b356516d75c29203f945b056bad3ac7
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
IndexOutOfBoundException is thrown from Repository.resolveSimple() when
'-g' string is located less then 4 characters from the end of this
string.
Change-Id: I1128c2cdfec9db3023d4d0f1f40d863e84b75950
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
We should use a template for Mylyn commit messages that matches with our
guidelines for commit messages.
http://wiki.eclipse.org/EGit/Contributor_Guide#Commit_message_guidelines
Bug: 337401
Change-Id: I05812abf0eb0651d22c439142640f173fc2f2ba0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We were diverging from the reference implementation. Always use the
ref we checkout to as the to-branch the reflog and avoid the
refs/heads both in the from-name and to-name.
Change-Id: Id973d9102593872e4df41d0788f0eb7c7fd130c4
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I91c7e08c4afd2562df2226887a933d93c78a0371
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
JGit currently identifies loose objects as 'corrupt' if they've been
deflated using a window size less than 32Kb, because the
isStandardFormat() function doesn't recognise the header
byte as a zlib header. This patch makes the method tolerant of
all valid window sizes (15-bit to 8-bit) - but doesn't sacrifice
it's accuracy in distingushing the standard loose-object format
from the experimental (now abandoned) format. It's based on a patch
which has been merged into C-Git master branch:
https://github.com/git/git/commit/7f684a2aff636f44a506
On memory constrained systems zlib may use a much smaller window
size - working on Agit, I found that Android uses a 4KB window;
giving a header byte of 0x48, not 0x78. Consequently all loose
objects generated by the Android platform appear 'corrupt' :(
It might appear that this patch changes isStandardFormat() to the
point where it could incorrectly identify the experimental format as
the standard one, but the two criteria (bitmask & checksum) can only
give a false result for an experimental object where both of the
following are true:
1) object size is exactly 8 bytes when uncompressed (bitmask)
2) [single-byte in-pack git type&size header] * 256
+ [1st byte of the following zlib header] % 31 = 0 (checksum)
As it happens, for all possible combinations of valid object type
(1-4) and window bits (0-7), the only time when the checksum will be
divisible by 31 is for 0x1838 - ie object type *1*, a Commit - which,
due the fields all Commit objects must contain, could never be as
small as 8 bytes in size.
Given this, the combination of the two criteria (bitmask & checksum)
always correctly determines the buffer format, and is more tolerant
than the previous version.
References:
Android uses a 4KB window for deflation:
http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb2583409c28;hb=refs/heads/gingerbread#l53
Code snippet searching for false positives with the zlib checksum:
https://gist.github.com/1118177
Change-Id: Ifd84cd2bd6b46f087c9984fb4cbd8309f483dec0
Create separate loops based on whether the ref specs
collection is empty or not.
Change-Id: I2861b45fe083675e12ff47f591e104f8cf6dd216
Signed-off-by: Kevin Sawicki <kevin@github.com>
If the ResetCommand should reset to a invalid ref (e.g. HEAD in a repo
whithout a single commit) it was throwing an NPE. This is fixed now by
throwing a JGitInternalExcpeption. It would be nicer if we could throw
a InvalidRefException, but this would modify our API.
Bug: 339610
Change-Id: Iffcb4f2cca9f702176471d93c3a71e5cb3e700b1
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This implements the server side of shallow clones only (i.e.
git-upload-pack), not the client side.
CQ: 5517
Bug: 301627
Change-Id: Ied5f501f9c8d1fe90ab2ba44fac5fa67ed0035a4
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This can be useful when implementing garbage collection and there
are packs that should not be copied, such as huge packs that have
a sibling ".keep" file alongside of them.
Callers driving PackWriter need to initialize the list of packs not
to include objects from by passing each index to excludeObjects().
Change-Id: Id7f34df69df97be406bcae184308e92b0e8690fd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Test was added which reproduce the ClassCastException when ours or
theirs merge strategy is set to MergeCommand. Merger and MergeCommand
were updated in order to avoid exception.
Change-Id: I4c1284b4e80d82638d0677a05e5d38182526d196
Signed-off-by: Denys Digtiar <duemir@gmail.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Adds method into DiffEntry class that allows to specify whether changed
trees are included in scanning result list. By default changed trees
aren't added, but in some cases having changed tree would be useful.
Also adds check for tree count in TreeWalk and when it is different from
two it will thrown an IllegalArgumentException.
This change is required by egit
I7ddb21e7ff54333dd6d7ace3209bbcf83da2b219
Change-Id: I5a680a73e1cffa18ade3402cc86008f46c1da1f1
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
During parsing these are used with contains(). If they are a List
type, the contains operation is not efficient. Some callers such
as UploadPack often pass a List here, so convert to Set when the
type isn't efficient for contains().
Change-Id: If948ae3bf1f46e756bd2d5db14795e12ba7a6207
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This reverts commit 67b064fc9f.
The "tiny optimization" introduced by 67b0 turns out to have a big
savings on wall-clock time when the object store is very slow (e.g.
the DHT support in JGit), but comes with a much bigger penalty in
space used by the output stream. CGit packed with 67b0 enabled is
7 MiB larger than it should be (36 MiB rather than 28/29 MiB). The
much bigger Linux kernel repository gained over 200 MiB, though some
of this may have been caused by a smaller window setting.
Revert this patch as PackWriter should be optimizing for space used
rather than time spent, since its primary use is network transfer, and
that isn't free.
Change-Id: I7413a9ef89762208159b4a1adc5a22a4c9245611
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The "Counting objects" phase of packing is the most time consuming
part for any server providing access to Git repositories. Scanning
through the entire project history, including every revision of
every tree that has ever existed is expensive and takes an incredible
amount of CPU time.
Inline the tree parsing logic, unroll a number of loops, and setup
to better handle the common case of seeing another occurrence of
an object that was already marked SEEN.
This change boosts the "Counting objects" phase when JGit is acting
as a server and is packing the linux-2.6 repository for its client.
Compared to CGit on the same hardware, a JGit daemon server is now
21883 objects/sec faster:
CGit:
Counted 2058062 objects in 38981 ms at 52796.54 objects/sec
Counted 2058062 objects in 38920 ms at 52879.29 objects/sec
Counted 2058062 objects in 39059 ms at 52691.11 objects/sec
JGit (before):
Counted 2058062 objects in 31529 ms at 65275.21 objects/sec
Counted 2058062 objects in 30359 ms at 67790.84 objects/sec
Counted 2058062 objects in 30033 ms at 68526.69 objects/sec
JGit (this commit):
Counted 2058062 objects in 28726 ms at 71644.57 objects/sec
Counted 2058062 objects in 27652 ms at 74427.24 objects/sec
Counted 2058062 objects in 27528 ms at 74762.50 objects/sec
Above the first run was a "cold server". For JGit the JVM had just
started up with `jgit daemon`, and for CGit we hadn't touched the
repository "recently" (but it was certainly in kernel buffer cache).
The second and third runs were against the running JGit JVM, allowing
timing tests to better reflect the benefits of JGit's pack and index
caching, as well as any optimizations the JIT may have performed.
The timings are fair. CGit is opening, checking and mmap'ing both
the pack and index during the timer. JGit is opening, checking
and malloc+read'ing the pack and index data into its Java heap
during the timer. Both processes are walking the same graph space,
and are computing the "path hash" necessary to sort objects in the
object table for delta compression. Since this commit only impacts
the "Counting objects" phase, delta compression was obviously not
included in the timings and JGit may still be performing delta
compression slower than CGit, resulting in an overall slower server
experience for clients.
Change-Id: Ieb184bfaed8475d6960a494b1f3c870e0382164a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is useful when the result needs to be displayed and it's only of
interest if the operation was successful or not (in egit, it could be
used in MultiPullResultDialog).
Change-Id: Icfc9a9c76763f8a777087a1262c8d6ad251a9068
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The offset32 format is used for objects <= 2^31-1, while the offset64
format is used for all other objects. This condition was missing
the = needed to ensure an object placed exactly at 2^31 would have
its 64 bit offset in the index.
Change-Id: I293fac0e829c9baa12cb59411dffde666051d6c5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A non-thin pack does not need to worry about preferred bases, the pack
will be self-contained and all required delta base objects will appear
within the pack itself. Obtaining the path buffer and length from the
ObjectWalk to build the preferred base table is "expensive", so avoid
the cost unless a thin pack is being constructed.
Change-Id: I16e30cd864f4189d4304e7957a7cd5bdb9e84528
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* changes:
PackWriter: Skip progress messages on fast operations
IndexPack: Defer the "Resolving deltas" progress meter
IndexPack: Fix "Resolving deltas" progress meter
The previous comment stated that the value set was used
to keep track of the branch in the remote repository
which was incorrect.
Updated the method comment to match the format used
for the PushCommand.setRemote and FetchCommand.setRemote
methods.
Change-Id: I11b81eb3125958af29247b485da56fd88c3bfdf5
Signed-off-by: Kevin Sawicki <kevin@github.com>
If the "Finding sources" phase will complete in <1 second with no
delta compression enabled, don't bother showing the progress meter for
this phase. Small repositories on the local filesystem tend to rip
through this phase always subsecond and the ProgressMonitor display
can actually slow the operation down.
If delta compression is enabled, there are two phases that may run
very quickly. Set the timer to 500 milliseconds instead, reducing the
risk that the user has to wait longer than 1 second before any sort of
output from the packer occurs.
Change-Id: I58110f17e2a5ffa0134f9768b94804d16bbb8399
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If delta resolution completes in < 1000 milliseconds, don't bother
showing the progress meter. This is actually very common for a Gerrit
Code Review server, where the client is probably sending 1 commit and
only a few trees/blobs modified... and the base objects are hot in the
process buffer cache.
The 1000 millisecond delay is just a guess at a reasonable time to wait.
Change-Id: I440baa64ab0dfa21be61deae8dcd3ca061bed8ce
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This progress meter never reached 100% as it did not update while
resolving the external bases in thin packs.
Instead of updating in batches at the top level, update once per delta
that is resolved. The batching progress meter type should smooth out
the frequent updates to an update rate that is more reasonable to send
to the UI, while also ensuring a successful pack parse always reaches
100% deltas resolved.
Change-Id: Ic77dcac542cfa97213a6b0194708f9d3c256d223
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Repository inspection tools may find building a reverse index on a
pack useful, as they can then locate an object by offset. As both
C Git and JGit sometimes produce error messages with the offset
rather than the SHA-1, it may be useful to expose this type.
Change-Id: I487bf32e85a8985cf8ab382d4c82fcbe1fc7da6c
There isn't a good reason to hide all of these as package-private.
Make them public so applications can inspect pack files.
Change-Id: Ia418daf65d63e9e015b8dafdf3d06a1ed91d190b
Some embeddings of UploadPack (e.g. Gerrit Code Review) set their own
PackConfig from a server-wide configuration, overriding any JGit
defaults or settings that may exist at the local repository level.
Make a copy constructor form of PackConfig so this server-wide
configuration object can be copied and then merged with repository
specific configuration data.
Change-Id: I4463c95aeaf7d6536c3ab132dec9c50ee528d9e0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The cached object databases should not require a close to release
their cached resources. Most object databases just return their
own reference for newCachedDatabase(), so a close() here kills
the real database's internal caches, and possibly underlying files,
resulting in poor performance for the callers of PackParser like
ReceivePack or FetchProcess trying to then go look up objects that
were just parsed, or that current references point to.
Change-Id: Ia4a239093866e5b9faf82744f729fb73f4373f1a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A set of ref names like ('a/b' and 'a+b') would cause the RefDirectory
to think that the set of refs have changed because it traversed the
'a' directory in the subtree before looking at 'a+b', but it then
compared with the know refs which are sorted with 'a+b' first.
Fix this by traversing the refs tree in another order. Treat a directory
as if they ends with a '/' before deciding on the order to traverse
the refs tree.
Bug: 348834
Change-Id: I23377f8df00c7252bf27dbcfba5da193c5403917
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Repository.writeMergeCommitMsg(null) no longer fails if the MERGE_MSG
file is missing. This was done to avoid CommitCommand to fail in case of
a missing MERGE_MSG file.
Bug: 352243
Change-Id: Iddf43533d133f8f22199ed6e2393a552670e7d1f
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Reset command should works recursively and allows reset all changed
files in given directory.
Bug: 348524
Change-Id: I441db34f226be36548c61cef77958995971498de
Signed-off-by: Dariusz Luksza <dariusz@luksza.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Creation of a branch X from an annotated tag, as the starting point,
resulted into .git/refs/heads/X containing the ID of the annotated tag
instead of the ID of the tagged commit.
This fix peels the tag ref before using it as the starting point for
the newly created branch.
Bug: 340836
Change-Id: I01c7325770ecb37f5bf8ddb2a22f802466524f24
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
When trying to clone into a folder that already contains a cloned
repository native git will fail with a message "fatal: destination path
'folder' already exists and is not an empty directory.". Now JGit will
also fail in this situation throwing a JGitInternalException.
The test case was provided by Tomasz Zarna.
Bug: 347852
Change-Id: If9e9919a5f92d13cf038dc470c21ee5967322dac
Also-by: Tomasz Zarna <Tomasz.Zarna@pl.ibm.com>
Signed-off-by: Adrian Goerler <adrian.goerler@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If no refSpec is explicitly set, the PushCommand should first check the
remote config and then as a fallback use the current behavior.
Change-Id: I2bc648abc517b1d01b2de15d383423ace2081e72
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
DirCacheCheckout did not unlock the index if e.g. an IOException occured
during checkout.
Bug: 350677
Change-Id: Ie9fa09f7a404080da7cdccafb9be3a8c845e4869
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
ObjectDirectoryInserter was always creating a temporary file,
writing the complete compressed contents of a tree, fsync()'ing
that to stable storage, and only then checking to see if there
was already an object with the same SHA-1 in the repository.
For commits this strategy makes some sense, the commit is very
unlikely to exist in the repository, as there are embedded times
and these change with each commit.
However for trees coming out of DirCache, it is more common for the
tree to already exist in the repository. Most subdirectories are
not modified in any given commit. Doing all of this local file IO
for things that already exist is very slow.
Try to detect cases where the object is "small enough" that it can
be processed entirely in memory, and avoid doing disk IO entirely
if the object already exists.
Also increase the size of the output buffer for the deflation.
This should boost the average write(2) syscall size from 512 bytes
to 8192 bytes, making streaming of large compressed contents to
disk slightly more efficient.
Change-Id: I1d40364e8725468522435814631916d73174c92b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
I had the conditions wrong here, causing the in-memory InputStream
to always appear to be at EOF.
Change-Id: I6811d6187a34eaf1fd6c5002550d631decdfc391
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Adds a git-reflog command and associated tests.
Bug: 347859
Change-Id: Iba146ac842cc9ca0be43d3381b4082c9e92bf56f
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
It's useful to have ReflogEntry refactored out so it can be
used by clients via the JGit API.
Change-Id: I03044df9af9f9547777545b7c9b93bdf5f8b7cb5
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
If an internal exception occurs while packing and the request
needs to abort, the HTTP response might already be committed due
to progress message having already been delivered to the client.
This prevents UploadPackServlet from resetting the response and
sending back an HTTP 500 response.
Try to catch all exceptions and report internal errors over the
sideband stream or as an ERR command during the initial ACK/NAK
negotiation phase. This allows JGit to transmit an error message
that the user will receive on their console without needing to
worry about resetting the (already gone) HTTP response.
Change-Id: Ie393fb8bb55d2b79ab1276adf71c781c1807f9fe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If a fetch or push needs to apply more than a few references
to the local repository it may take more than 0.25 seconds to
process all of the updates. This is especially true in the DHT
storage system during an initial push of a project with many tags.
The backend database may need to use a transaction to ensure each
tag reference creation is unique, and there may be large delays
caused by these transactions.
Change-Id: Ib11a077adfbd525253e425d327f2e2c2380804c7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* stable-1.0:
Prepare post JGit v1.0.0.201106090707-r builds
JGit v1.0.0.201106090707-r
Include about.html files in maven build
Prepare post v1.0.0.201106081625-r builds
JGit v1.0.0.201106081625-r
Add missing about.html files to all shipped bundles
Prepare post v1.0.0.201106071701-r builds
JGit v1.0.0.201106071701-r
* stable-1.0:
Prepare post v1.0.0.201106011211-rc3 builds
JGit v1.0.0.201106011211-rc3
Remove incubation marker
blame: Compute the origin of lines in a result file
BlameGenerator digs through history and discovers the origin of each
line of some result file. BlameResult consumes the stream of regions
created by the generator and lays them out in a table for applications
to display alongside of source lines.
Applications may optionally push in the working tree copy of a file
using the push(String, byte[]) method, allowing the application to
receive accurate line annotations for the working tree version. Lines
that are uncommitted (difference between HEAD and working tree) will
show up with the description given by the application as the author,
or "Not Committed Yet" as a default string.
Applications may also run the BlameGenerator in reverse mode using the
reverse(AnyObjectId, AnyObjectId) method instead of push(). When
running in the reverse mode the generator annotates lines by the
commit they are removed in, rather than the commit they were added in.
This allows a user to discover where a line disappeared from when they
are looking at an older revision in the repository. For example:
blame --reverse 16e810b2..master -L 1080, org.eclipse.jgit.test/tst/org/eclipse/jgit/storage/file/RefDirectoryTest.java
( 1080) }
2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1081)
2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1082) /**
2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1083) * Kick the timestamp of a local file.
Above we learn that line 1080 (a closing curly brace of the prior
method) still exists in branch master, but the Javadoc comment below
it has been removed by Christian Halstrick on May 20th as part of
commit 2302a6d3. This result differs considerably from that of C
Git's blame --reverse feature. JGit tells the reader which commit
performed the delete, while C Git tells the reader the last commit
that still contained the line, leaving it an exercise to the reader
to discover the descendant that performed the removal.
This is still only a basic implementation. Quite notably it is
missing support for the smart block copy/move detection that the C
implementation of `git blame` is well known for. Despite being
incremental, the BlameGenerator can only be run once. After the
generator runs it cannot be reused. A better implementation would
support applications browsing through history efficiently.
In regards to CQ 5110, only a little of the original code survives.
CQ: 5110
Bug: 306161
Change-Id: I84b8ea4838bb7d25f4fcdd540547884704661b8f
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
* stable-1.0:
DHT: Support removing a repository name
DHT: Fix thread-safety issue in AbstractWriteBuffer
jgit.sh: Implement pager support
Change EditList to extend ArrayList
Ensure the HTTP request is fully consumed
Make sure test repositories are closed
Fix CloneCommand not to fetch into remote tracking branches when bare
Update Eclipse IP log for 1.0
Change-Id: I6340d551482e1dda01f82496296d2038b07fa68b
There is no reason for this type to contain an ArrayList and try to
hide the implementation. It only slows down execution by adding an
extra layer of method dispatch to each invocation.
Instead subclass from ArrayList.
Change-Id: Ifbb9c7060c2fe3d5a7397c1aa85fbade14088637
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some servlet containers require the servlet to read the EOF marker
from the input stream before a response can be output if the stream
is using "Transfer-Encoding: chunked"... which is typical for any
sort of large push to a repository over smart HTTP.
Ensure the EOF is always read by the PackParser when it is handling
the stream, and fail fast if there is more data present than expected
since this does indicate a protocol error.
Also ensure the EOF is read by UploadPack before it starts to output
a partial response using packing progress meters.
Change-Id: I131db9dea20b2324cb7c3272a814f21296bc64bd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When cloning into a bare repository we should not create remote
tracking branches (e.g refs/remotes/origin/testX). Branches of the
remote repository should but fetched into into branches of the same
name (e.g refs/heads/testX). Also add the noCheckout option which
would prevent checkout after fetch.
Change-Id: I5d4cc0389f3f30c53aa0065f38119af2a1430909
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
RefDirectory was not using FileSnapshot correctly in all places. This
is fixed with this commit. Additionally the constructors for the
different types of refs have been changed to take a FileSnapshot
instead of a modification time.
Change-Id: Ifb6a59e87e8b058a398c38cdfb9d648f0bad4bf8
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
RefData now uses a sequence number as part of the field, ensuring
that updates always increase the sequence number by one whenever
a reference is modified.
Attaching a sequence number to RefData will help with storing
reference log entries during updates. As the sequence number should
be unique within the reference name space, log entries can be keyed
by the sequence number and remain unique. Making this work over
reference delete-create cycles will require an additional RefTable
API to return the oldest sequence number previously used in the
reference log to seed the recreated reference.
Change-Id: I11cfff2a96ef962e57f29925a3eef41bdbf9f9bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Adds a class which can be used to calculates a SHA1 of the diff
associated with a patch, similar to git patch-id.
In this version whitespace is not ignored.
Change-Id: I421d15ea905e23df543082786786841cbe3ef10d
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
EGit change Iba3b87293c22e5fe7d989fc312184aa7463c4387 is also required
to make this work for EGit.
Change-Id: Iedc80e133e66d72e78ff0980b6e12634f75eca36
Signed-off-by: Carsten Pfeiffer <carsten.pfeiffer@gebit.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Since this change may affect performance and memory consumption on every
access to a loose ref I explicitly made it a RFC to collect opinions.
Previously RefDirectory.scanRef() was not detecting an update of a
loose ref when the update didn't changed the modification time of
the backing file. RefDirectory cached loose refs and the way to detect
outdated cache entries was to compare lastmodification timestamp on the
file representing the ref. If two updates to the same ref happen faster
than the filesystem-timer granularity (for linux this is 2 seconds)
there is the possiblity that we don't detect the update.
Because of this bug EGit's PushOperationTest only works with 2 second
sleeps inside.
This change let RefDirectory use FileSnapshot to detect such situations.
FileSnapshot helps to remember when a file was last read from disk and
therefore enables to decide when to load a file from disk although
modification time has not changed.
Change-Id: I03b9a137af097ec69c4c5e2eaa512d2bdd7fe080
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
A checkout conflict during rebase setup should leave the repository
in SAFE state which means ensuring that the rebase temporary files
need to be removed.
Bug: 346813
Change-Id: If8b758fde73ed5a452a99a195a844825a03bae1a
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Change Ia2ab4f8dc95020f2914ff01c2bf3b1bc62a9d45d added merge
support for when OURS or THEIRS was simultaneously deleted
and modified. That changeset however did not add create an
entry in the conflicts table so clients would see a CONFLICTING
result but getConflicts() would return null.
This change creates a MergeResult for the conflicting file.
Bug: 345684
Change-Id: I52acb81c1729b49c9fb3e7a477c6448d8e55c317
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Change Ib9898fe0f982fa08e41f1dca9452c43de715fdb6 added support for
the 'cherry-pick' fast forward case where the upstream commit history
does not include any merge commits. This change adds support for the
case where merge commits exist and the local branch has no changes.
Bug: 344779
Change-Id: If203ce5aa1b4e5d4d7982deb621b710e71f4ee10
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Previously when merging two contents with a non-empty base and one of
the contents was empty (size == 0) and the other was modified there
was a potentially expensive calculation until we finally always come
to the same result -> the complete non-deleted content should collide
with the empty content.
This proposal adds an optimization to detect empty input content and
to produce the appropriate result immediatly.
Change-Id: Ie6a837260c19d808f0e99173f570ff96dd22acd3
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
For the following patch on the linux 2.6.32 tag:
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -685,6 +685,7 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sc
static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
+#if 0
#ifdef CONFIG_SCHED_DEBUG
s64 d = se->vruntime - cfs_rq->min_vruntime;
@@ -694,6 +695,7 @@ static void check_spread(struct cfs_rq *cfs_rq, struct
sched
if (d > 3*sysctl_sched_latency)
schedstat_inc(cfs_rq, nr_spread_over);
#endif
+#endif
}
static void
JGit produced an incorrect diff, attempting to add a new "}" instead
of the new "#endif" at the end of the hunk. This was caused by a prior
fix for bug 328895 where we wanted to "slide" a diff down in the file
when adding a new method/function and want to show the closing curly
brace as being added after the new method, rather than added onto the
end of the prior function or method just before the insertion point.
Bug: 345956
Change-Id: I32b9e24f1e2980258b1b39dd1807919ab1c5f9b2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The problem occurred when the first text ends in the middle
of the last line of the other text and the first text has no
end of line.
Bug: 344975
Change-Id: I1f0dd9f8062f2148a7c1341c9122202e082ad19d
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
This makes the message look the same as in C Git (the "."):
This reverts commit <sha1>.
Change-Id: I4c254c122277b127e7b039c0d1c7f7a0d691530d
Signed-off-by: Robin Stocker <robin@nibor.org>
Since d1718a the method getHumanishName was broken on windows since
the URIish is not normalized anymore. For a path like
"C:\gitRepositories\egit" the whole path was returned instead of
"egit".
Bug: 343519
Change-Id: I95056009072b99d32f288966302d0f8188b47836
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Before this change any files in the conflicting set would
also be listed in the the other IndexDiff Sets which is
confusing. With this change a conflicting file will not
be included in any of the other sets.
Change-Id: Ife9f2652685220bcfddc1f9820423acdcd5acfdc
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Submodules are supposed to be handled by separate operations, so
we should ignore them on checkout, just like C Git does.
This fix does not add submodule support. We just try harder
to ignore them.
Bug: 343566
Change-Id: I2c5ae1024ea7bb57adf27072da6acc9643018eda
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
jgit.storage.dht is a storage provider implementation for JGit that
permits storing the Git repository in a distributed hashtable, NoSQL
system, or other database. The actual underlying storage system is
undefined, and can be plugged in by implementing 7 small interfaces:
* Database
* RepositoryIndexTable
* RepositoryTable
* RefTable
* ChunkTable
* ObjectIndexTable
* WriteBuffer
The storage provider interface tries to assume very little about the
underlying storage system, and requires only three key features:
* key -> value lookup (a hashtable is suitable)
* atomic updates on single rows
* asynchronous operations (Java's ExecutorService is easy to use)
Most NoSQL database products offer all 3 of these features in their
clients, and so does any decent network based cache system like the
open source memcache product. Relying only on key equality for data
retrevial makes it simple for the storage engine to distribute across
multiple machines. Traditional SQL systems could also be used with a
JDBC based spi implementation.
Before submitting this change I have implemented six storage systems
for the spi layer:
* Apache HBase[1]
* Apache Cassandra[2]
* Google Bigtable[3]
* an in-memory implementation for unit testing
* a JDBC implementation for SQL
* a generic cache provider that can ride on top of memcache
All six systems came in with an spi layer around 1000 lines of code to
implement the above 7 interfaces. This is a huge reduction in size
compared to prior attempts to implement a new JGit storage layer. As
this package shows, a complete JGit storage implementation is more
than 17,000 lines of fairly complex code.
A simple cache is provided in storage.dht.spi.cache. Implementers can
use CacheDatabase to wrap any other type of Database and perform fast
reads against a network based cache service, such as the open source
memcached[4]. An implementation of CacheService must be provided to
glue this spi onto the network cache.
[1] https://github.com/spearce/jgit_hbase
[2] https://github.com/spearce/jgit_cassandra
[3] http://labs.google.com/papers/bigtable.html
[4] http://memcached.org/
Change-Id: I0aa4072781f5ccc019ca421c036adff2c40c4295
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Merging into a non-master branch would result in the following message:
Merge branch 'a' into HEAD
Now the merge message is correct:
Merge branch 'a' into b
Change-Id: I488f97190e4c1711c23a7a3cbd64f8b13a87bbac
Signed-off-by: Robin Stocker <robin@nibor.org>
Duplicates cgit behaviour for merging the case where
OURS is deleted and THEIRS is modified as well as
OURS is modified and THEIRS id deleted.
Change-Id: Ia2ab4f8dc95020f2914ff01c2bf3b1bc62a9d45d
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
This fix makes sure the readPipe methods drains the stderr
pipe and close the subprocess' stdin stream before reading
the process outputs.
I never managed to repeat the reported problem myself, so this
may help in diagnosing the probelm on other peoples machines.
Bug: 337533
Change-Id: I299555f09768c34d5868327e574326946ee265e1
Smart HTTP clients may request both multi_ack_detailed and no-done in
the same request to prevent the client from needing to send a "done"
line to the server in response to a server's "ACK %s ready".
For smart HTTP, this can save 1 full HTTP RPC in the fetch exchange,
improving overall latency when incrementally updating a client that
has not diverged very far from the remote repository.
Unfortuantely this capability cannot be enabled for the traditional
bi-directional connections. multi_ack_detailed has the client sending
more "have" lines at the same time that the server is creating the
"ACK %s ready" and writing out the PACK stream, resulting in some race
conditions and/or deadlock, depending on how the pipe buffers are
implemented. For very small updates, a server might actually be able
to send "ACK %s ready", then the PACK, and disconnect before the
client even finishes sending its first batch of "have" lines. This
may cause the client to fail with a broken pipe exception. To avoid
all of these potential problems, "no-done" is restricted only to the
smart HTTP variant of the protocol.
Change-Id: Ie0d0a39320202bc096fec2e97cb58e9efd061b2d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* stable-0.12:
Fix sorting of names in RefDirectory
Make running static checks configurable in maven build
Add constants for gerrit change id configuration
RefDirectory did not correctly follow the contract of RefList. The
contract says if you use add() method of RefList builder, you MUST
sort() it afterwards, and later every other method assumes that list
is properly sorted (especially the binary search in the find() and
get() methods). Instead RefDirectory class tried to scan the refs
recursively while sorting every folder in the process before
processing and did not call sort().
For example, when scanning the contents of refs/tags project1 string
is smaller than project1-*, so it will recursively go into the folder
and add these tags first and only then will add project-* ones. This
will result in a broken list (any project1-* string is less than
project1/* one, but they all appear after them in the list), that's
why binary search will fail making loose RefList and the whole local
RefMap completely unusable.
Change-Id: Ibad90017e3b2435b1396b69a22520db4b1b022bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In order to run the static checks run:
mvn -P static-checks clean install
Change-Id: I14077498a04be986ded123ddbfc97da8f9bc3130
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I22fc46dff6cc5dfd975f6e82161d265781778cde
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The same as with cherry-pick, the commit message of a merge should
include a "Conflicts" section when the merge resulted in conflicts.
Change-Id: I6261dc898262322924af5ca1bef841a654b0df55
Signed-off-by: Robin Stocker <robin@nibor.org>
Add private methods which are used for reading and writing MERGE_HEAD
and CHERRY_PICK_HEAD files, as suggested in the comments on change
I947967fdc2f1d55016c95106b104c2afcc9797a1.
Change-Id: If4617a05ee57054b8b1fcba36a06a641340ecc0e
Signed-off-by: Robin Stocker <robin@nibor.org>
Add handling of CHERRY_PICK_HEAD file in .git (similar to MERGE_HEAD),
which is written in case of a conflicting cherry-pick merge.
It is used so that Repository.getRepositoryState can return the new
states CHERRY_PICKING and CHERRY_PICKING_RESOLVED. These states, as well
as CHERRY_PICK_HEAD can be used in EGit to properly show the merge tool.
Also, in case of a conflict, MERGE_MSG is written with the original
commit message and a "Conflicts" section appended. This way, the
cherry-picked message is not lost and can later be re-used in the commit
dialog.
Bug: 339092
Change-Id: I947967fdc2f1d55016c95106b104c2afcc9797a1
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The timeout is also used in the FetchCommand called by the
CloneCommand.
The possibility to provide a list of branches to fetch initially is a
feature offered by EGit. To implement it here is a prerequisite for
EGit to be able to use the CloneCommand.
Change-Id: I21453de22e9ca61919a7c3386fcc526024742f5f
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
When no branch was specified in the clone command, HEAD pointed to a
commit after clone. Now the clone command tries to find a branch which
points to the same commit and checks out this branch.
Bug: 339354
Change-Id: Ie3844465329f213dee4a8868dbf434ac3ce23a08
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Change I61a1b45db2d60fdcc0f87373ac6fd75ac4c4a202 fixed a possible NPE
occurring for newly created repositories - but in that case a wrong
value (false = not modified) was returned.
If a current version of the index file exists (liveFile), but there is
no snapshot, this means that there have been modifications (i.e. true
has to be returned).
Change-Id: I698f78112249f9924860fc58eb7eab7afdf87eb7
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Findbugs doesn't like using implicitly initialized field in
initializer.
Change-Id: Ic1ff9011813cc02950a71df587f31ed9f8415b49
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When a new Git instance for an exisiting git repository should be
created there are two use-cases: either the application has already a
Repository instance in hand or the application knows where the
repository resides in the filesystem. Two methods are added to
explicitly support these use-cases: wrap(Repository db) and open(File
gitDir)
Change-Id: I2970e4aa8d4602cb1298f01e5b76bf0f96c492e5
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
When reading refs, avoid reading huge files that were put there
accidentally, but still read the top of e.g. FETCH_HEAD, which
may be longer than our limit. We're only interested in the first line
anyway.
Bug: 340880
Change-Id: I11029b9b443f22019bf80bd3dd942b48b531bc11
Signed-off-by: Carsten Pfieffer <carsten.pfeiffer@gebit.de>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The 'Counting objects' phase of PackWriter requires good hit rates
from the DeltaBaseCache while walking trees, the deltas need to find
their bases in the cache in order to inflate in a reasonable time.
If JGit is running in a multi-threaded server, such as Gerrit Code
Review, each thread needs its own DeltaBaseCache to prevent one thread
from evicting the other thread's relevant bases. Move the cache to be
per-ObjectReader, lazily allocated when required by a PackFile.
Change-Id: If9d5ed06728e813632ae96dcfb811f4860b276e8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If a repository has an alternate object database, the alternate has
its references advertised as ".have" lines, which permits the client
to use these as delta base candidates when generating the pack. If
setCheckReferencedObjectsAreReachable(true) is used, these additional
have lines need to be considered in addition to the advertised refs.
Change-Id: Ie39c6696f9d3ff147ef4405cd5624f6011700ce5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We sometimes face the problem that the file .git/index.lock
can't deleted causing JGit operations to fail. Problem is
that LockFile.unlock() simply deletes the lockfile and ignores the
return value of File.delete(). Instead use
FileUtils.delete() with retry option. This method will retry the
deletion of the file at most 10 times with sleeps inbetween.
Bug: 335959
Change-Id: I9598edea9f2304fe12e6f470301211b503434848
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
There should be a way to explictly refresh the refs cached in the
RefDirectory. Since commit c261b28 (use of FileSnapshot) this is
not needed anymore for storage in the filesystem. But for DHT based
storage an explicit refresh may be needed.
Change-Id: I7d30c3496c05e1fb6e9519f3af9f23c6adb93bf9
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Instead of tracking the length and modification time by hand, rely
on FileSnapshot to tell RefDirectory when the $GIT_DIR/packed-refs
file has been changed or should be re-read from disk.
Change-Id: I067d268dfdca1d39c72dfa536b34e6a239117cc3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Embedding applications can use this hook to watch actions within
UploadPack and possibly reject them. This could be useful to prevent
clones of a large repository from this server, or to stop abusive
negotiation rounds that offer thousands of objects in a single batch.
Change-Id: Id96f1885ac4d61f22c80b6418fff54184b7348ba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Take a very simple approach to avoiding delta chains during object
reuse: objects are now always selected from the oldest pack that
contains them. This prevents cycles because a pack must not have
a cycle in the delta chain. If both objects A and B are chosen
out of the same source pack then there cannot be an A->B->A cycle.
The oldest pack is also the most likely to have the smallest deltas.
Its the biggest pack in the system and probably came from the clone
(or last GC) of this repository, where all objects were previously
considered and packed tightly together. If an object appears again
(for example due to a revert and a push into this repository) the
newer copy of won't be nearly as small as the older delta version
of it, even if the newer one is also itself a delta.
ObjectDirectory already enumerates objects during selection in this
newest->oldest order, so it already is supplying these assumptions
to PackWriter. Taking advantage of this can speed up selection by
a tiny amount by avoiding some tests, but can also help to prevent
a cycle needing to be broken on the fly during writing.
The previous cycle breaking logic wasn't fully correct either.
If a different delta base was chosen, the new delta base might not
have been written into the output pack before the current object,
forcing the use of REF_DELTA when OFS_DELTA is always smaller.
This logic has now been reworked to always re-check the delta base
and ensure it gets written before the current object.
If a cycle occurs, it gets broken the same way as before, by
disabling delta reuse and finding an alternative form of the
object, which may require inflating/deflating in whole format.
Change-Id: I9953ab8be54ceb8b588e1280d6f7edd688887747
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the total number of objects to look for reuse on is under 4096
this is really close to a reasonable batch size for the DHT storage
system to lookup at once. Combine all of the objects into a single
temporary list, perform reuse, and then prune the main lists if any
duplicate objects were detected from a selected CachedPack.
The intention here is to try and avoid 4 tiny sequential lookups
on the storage system when the time to wait for each of those to
finish is higher than the CPU time required to build (and later
GC) this temporary list.
Change-Id: I528daf9d2f7744dc4a6281750c2d61d8f9da9f3a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of looping over the objectsLists array, always set slot 0 to
null and explicitly work on the 4 indexes that matter. This kills
some loops and increases the length of the code slightly, but I've
always really disliked that dummy 0 slot.
Change-Id: I5ad938501c1c61f637ffdaff0d0d88e3962d8942
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
During object enumeration for the thin pack, very few objects come
out that are duplicated with the cached pack. Typically these are
only cases where a blob or tree was cherry-picked forward, got a
copy or rename, or was reverted... all relatively infrequent events.
Speed up pruning of the thin pack object list by combining the phase
with the object representation selection. Implementers should already
be offering to reuse the object from the cached pack if it is stored
there, at which point the implementation can perform a very fast type
of containment test using the cached pack's identity rather than yet
another index lookup. For the local disk case this is probably not a
big improvement, but it does help on the DHT implementation where the
two passes combined into one reduces latency.
Change-Id: I6a07fc75d9075bf6233e967360b6546f9e9a2b33
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
PacketLineOut is already public. Make PacketLineIn partially public
in case an application needs to use some of the pkt-line protocol.
Change-Id: I5b383eca980bd9e16a7dbdb5aed040c6586d4f46
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the clone command clones into a bare repository it should not try
to checkout HEAD in the end. For bare repos checkout is not possible.
Change-Id: I359719d421b93c9d2e962e3c0eccc2b59235c3d1
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
To be more consistent with what native git does we should allow
to run the InitCommand also on existing git repos.
Change-Id: I833637842631b37dce96ed9729b3a6ed24054056
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
The snapshot field of a DirCache object for a newly created repository
can be null. This fix prevents a NPE when isModified() is called in
such a situation.
Change-Id: I61a1b45db2d60fdcc0f87373ac6fd75ac4c4a202
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
This change contains a simple renaming. Instead of using the
expression 'abnormal failure', we just treat this kind of situation
as 'failure'. This is specific enough as conflicts are already handled
separately.
Change-Id: I535acdc7d022543ed0f5ac6151b09a6985f4ef38
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
We used to normalize URI's since it seems simple. This however causes
inconsistencies to the user and to out tests. Just pass backslashes
through and make sure our parser can handle them.
Bug: 341062
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Change-Id: I2c8e917a086faabcd8749160c2acc9dd05a42838
If JGit got a cygwin path it would fail to resolve directories
relative to it.
Also add system property to debug problems with the FS utilities.
Enable debug using -Djgit.fs.debug=1
Furthermore use -Djgit.gitprefix=<path> to set the git prefix
manually.
Change-Id: I5a58ee45c2d2ae5322f87fd6972526169b2918c8
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Detaching HEAD didn't work in some corner checkout cases. If, for example,
HEAD is symbolic ref to refs/heads/master, refs/heads/master is ref to commit
c0ffee... then:
checkout c0ffee...
would leave the HEAD unchanged.
The same symptom occurs when checking out a remote tracking branch or a tag
that references the same commit as refs/heads/master.
In the above case, the RefUpdate class didn't have enough information to decide
if the update needed to detach symbolic ref because it dealt only with new/old
objectIDs. Therefore, this fix introduced the RefUpdate.detachingSymbolicRef
flag.
Bug: 315166
Change-Id: I085c98b77ea8f9104a213978ea0d4ac6fd58f49b
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
In case an underlying cherry-pick fails due to uncommitted changes, a
RebaseCommand shall fail and roll-back changes.
Change-Id: Ic22eb047fb03ac2c8391f777036b7dbf22a1b061
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Allow users of the GIT api to get to know the state of their
workingtree and index by adding a status command. The implementation
is mainly a wrapper around IndexDiff class. Better support for multiple
stages in the index (conflict situations) is still missing. An
appropriate change to IndexDiff and StatusCommand will come in a
subsequent commit.
Bug: 337296
Change-Id: Idb390375a68611853c1c903299ec678c89b081dc
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The RemoteSession interface operates like a simplified version of
java.lang.Runtime with a single exec method (and a disconnect
method). It returns a java.lang.Process, which should begin execution
immediately. Note that this greatly simplifies the interface for
running commands. There is no longer a connect method, and most
implementations will contain the bulk of their code inside
Process.exec, or a constructor called by Process.exec. (See the
revised implementations of JschSession and ExtSession.)
Implementations can now configure their connections properly without
either ignoring the proper use of the interface or trying to adhere
to an overly strict interface with odd rules about what methods are
called first. For example, Jsch needs to create the output stream
before executing, which it now does in the process constructor. These
changes should make it much easier to add alternate session
implementations in the future.
Also-by: John D Eblen <jdeblen@comcast.net>
Bug: 336749
CQ: 5004
Change-Id: Iece43632086afadf175af6638255041ccaf2bfbb
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
In order to distinguish cherry-pick failures caused by conflicts vs.
'abnormal failures' (e.g. due to unstaged changes or a dirty
worktree), a CherryPickResult class is introduced and returned by
CherryPickCommand.call() instead of a RevCommit. This new class is
similar to MergeResult and RebaseResult. The CherryPickResult contains
all necessary information, e.g. paths causing the cherry-pick (a merge
called within, respectively) to fail. This allows callers to better
react on failures.
Change-Id: I5db57b9259e82ed118e4bf4ec94463efe68b8c1f
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Add a test case for PullCommand for the successful merge case and test
that the short ref name is used.
Change-Id: I16cbbc88595f73e5512f984e67f93f87ee0fe242
Signed-off-by: Robin Stocker <robin@nibor.org>
java.lang.IndexOutOfBoundsException
at java.nio.ByteBuffer.wrap(ByteBuffer.java:352)
at org.eclipse.jgit.util.RawParseUtils.decodeNoFallback(RawParseUtils.java:913)
at org.eclipse.jgit.util.RawParseUtils.decode(RawParseUtils.java:880)
at org.eclipse.jgit.util.RawParseUtils.decode(RawParseUtils.java:839)
at org.eclipse.jgit.storage.file.ReflogReader$Entry.<init>(ReflogReader.java:102)
at org.eclipse.jgit.storage.file.ReflogReader.getReverseEntries(ReflogReader.java:183)
at org.eclipse.jgit.storage.file.ReflogReader.getReverseEntries(ReflogReader.java:162)
Change-Id: I22a18bc7193962e5018c40a75337f9976b585c40
Add paths causing abnormal merge failures (e.g. due to unstaged
changes) to the MergeResult returned by MergeCommand. This helps
callers to better handle (e.g. present) merge results.
Change-Id: Idb8cf04c5cecfb6a12cb880e16febfc3b9358564
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Frequently enough I'm wondering how much of a pack is commits vs.
trees, and the total line doesn't really tell us this because its
a gross total from the pack. Computing the counts per object type
is simple during packing, as PackWriter already has everything in
memory broken up by object type. Its virtually free to get these
values and track them.
Change-Id: Id5e6b1902ea909c72f103a0fbca5d8bc316f9ab3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Simple variant of addAll() that knows how to copy large segments
quickly using System.arraycopy() rather than looping through with
an Iterator object.
Change-Id: Icb50a8f87fe9180ea28b6920f473bb9e70c300f1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of computing this on every request, compute it once and
hold onto the result. This improves performance for LocalCachedPack
which does a lot of tests against the pack name string.
Change-Id: I3803745e3a5dda7b5f0faf39aae9423e2c777e7f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In case a file needs to be checked out (from THEIRS) during a merge
operation, it has to be checked if the worktree version of this file
is dirty. If this is true, merge shall fail.
Change-Id: I17c24845584700aad953c3d4f2bea77a0d665ec4
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
1. Perform an explicit check for untracked files.
2. Extract 'dirty checks' into separate methods
3. Clean up comments.
4. Tests: also check contents of files not affected by merge.
Change-Id: Ieb089668834d0a395c9ab192c555538917dfdc47
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
When fetching from a remote peer, consider all of the refs of any
alternate repository to be reachable locally, in addition to the refs
of the local repository. This mirrors the push protocol and may avoid
unnecessary object transfer when the local repository is empty, but
its alternate and the remote share a lot of common history.
Junio C Hamano recently proposed a similar change to C Git's fetch
client, in order to work around a performance bug I identified when
fetching between two repositories that actually shared the same
alternate repository on the local system.
Change-Id: Iffb0b70e1223901ce2caac3b87ba7e0d6634d265
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of aborting hard with a server-side exception, report an error
to the client with "ERR %s" in a context where the client is expecting
ACK/NAK. Older clients will report this text to the user, but newer
ones know how to format this message in a more user-friendly way.
Change-Id: I1879b38988ba66f648c069c10dbfa14c3f34adb2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the remote peer replies with "ERR %s" instead of "ACK %s common" or
"NAK" during ancestor negotiation in the fetch-pack/upload-pack
protocol, treat that as an exception that aborts processing with the
error text as supplied by the remote system.
This matches behavior with "ERR %s" during the advertisements, which
is also a way for the remote to abort processing.
Change-Id: I2fe818e75c7f46156744ef4f703c40173cbc76d0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Most "ACK %s continue", "ACK %s common", "NAK" strings that are read
by the readACK() method and readString() are shorter than the
lineBuffer already available. Reuse that buffer when reading from
the network stream and converting to a string with RawParseUtils to
avoid unnecessary temporary byte array allocations.
Change-Id: Ibc778d9f7721943a065041d80fc427ea50d90fff
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
An option to insert a change id into the commit message was added
to CommitCommand.
This change is a prerequisite for removing GitIndex from EGit.
Change-Id: Iff9e26a8aaf21d8224bfd6ce3c98821c077bcd82
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
This enables applications to differentiate between explicitly set
configuration parameters and best effort attempts to guess these
parameters from the operating system.
Change-Id: I67cc4099238a40c6dca795e64f0155ced6008ef1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If no directory is set before executing an InitCommand, the current
directory (".") is used by default. By calling File.getParentFile() we
get the actual directory this points to. Using this directory makes it
easier to read paths.
Change-Id: I6245941395dae920e4f90b8985be6ef3cce570d3
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
This allows callers to determine if a URI is supported, before
worrying about the local repository.
Suggested-by: Dariusz Luksza <dariusz@luksza.org>
Change-Id: Ifc76a4ba841f2e2e7354bd51306b87b3b9d7f6ab
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The simplified form of add(String) makes it easier for applications
to pass down user input and allow PushCommand to convert it to the
internal RefSpec object.
Change-Id: Ibd2e95852db0e52ea4a36032942c4c42a7fb4261
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The --all flag on the command line implies using refs/heads/* as
a push specification. Add this to the standard command object.
The --tags flag on the command line implies using refs/tags/* as
a push specification. Add this to the standard command object.
Change-Id: Iaef200b17cce77604548dbfb15cf2499b10687b5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
For compatibility reasons with regards to native git and also to
make the init command easier to use from the command line,
argument --git-dir should not be required.
Additionally the path created in case --git-dir is not supplied now is
canonical and thus easier to read.
Change-Id: Idb7d77e983a78c4b21fbf232fc1e75ef581e5ed1
Signed-off-by: Philipp Thun <philipp.thun@sap.com>
If the client is only following the remote repository and has not
created any new non-common commits, the client will wind up sending
a "have %s" line for each tag in the repository. For some projects
like git.git, this is 339 tags and growing, resulting in more than
16 KiB needing to be POSTed over 12 HTTP requests.
Teach UploadPack (server side) to always execute the okToGiveUp()
logic at least once per negotiation round to determine if the server
can compute a pack right now. If it can, shove in an "ACK %s ready"
message to tell the client this and try to prevent receiving ancient
tags in future negotiation rounds.
Teach BasePackFetchConnection (client side) to honor a "ACK %s ready"
from the remote and break out of its SEND_HAVE loop once the remote
knows it can create a pack. This avoids sending the remaining 307
tags of git.git.
These two changes together reduce the number of HTTP RPCs from 13
down to 3 in order to fetch from git.git over smart HTTP. If either
side is missing the change, the older behavior (and its 13 RPCs)
is used.
Change-Id: I64736318fd0abf9ee5e56bd0b737707adb580b37
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This permits an application to create its own copy of FS.DETECTED
before manually setting the userHome or gitPrefix.
Bug: 337101
Change-Id: Ieea33c8d0ebdc801a4656b829d2a4b398559fd45
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This permits callers to modify the meaning of userHome, which
may be useful if their application allows the user to select
different user settings locations.
Bug: 337101
Change-Id: I076815edeec1c20dea028f7840be3930337dff77
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This permits callers to modify the meaning of gitPrefix, which
may be useful if their application allows the user to select
the location where C Git is installed.
Bug: 337101
Change-Id: I07362a5772da4955e01406bdeb8eaf87416be1d6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This allows callers to perform the logic that constructed the
current FS.DETECTED value.
Change-Id: Id8517d131dcc3f675c60b2d935730872695ed1b0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
C Git always fetches tags during clone, even if the tag doesn't
point to an object that was fetched by the branch specifications.
Match that behavior, as users expect it.
Bug: 326611
Change-Id: I81a82b7359a9649f18a172219da44ed54e77ca2f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If no RefSpec was specified, push the branch that is currently
checked out as HEAD.
Change-Id: I6f13ef6346188698a14e000fc590850afbc34b21
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Rather than copying the entire list, just replace each element
with the version that has setForceUpdate(true) invoked on it.
Change-Id: I2eaa4466d497cb2408ce61dc62ca26e0c32fe841
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This better matches with PackFile and CachedPack's methods
that return the same value.
Change-Id: Idb9b7c71d2048dd2344a62c2cde20b4e34529ab7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
PackWriter incorrectly returned 0 from getObjectsNumber() when the
pack has not been written yet. This caused dumb transports like
amazon-s3:// and sftp:// to abort early and never write out a pack,
under the assumption that the pack had no objects.
Until the pack header is written to the output stream, compute the
current object count each time it is requested. Once the header is
started, use the object count from the stats object.
Change-Id: I041a2368ae0cfe6f649ec28658d41a6355933900
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
OwnerMap is about 200 ms faster than SubclassMap, more friendly to the
GC, and uses less storage: testing the "Counting objects" part of
PackWriter on 1886362 objects:
ObjectIdSubclassMap:
load factor 50%
table: 4194304 (wasted 2307942)
ms spent 36998 36009 34795 34703 34941 35070 34284 34511 34638 34256
ms avg 34800 (last 9 runs)
ObjectIdOwnerMap:
load factor 100%
table: 2097152 (wasted 210790)
directory: 1024
ms spent 36842 35112 34922 34703 34580 34782 34165 34662 34314 34140
ms avg 34597 (last 9 runs)
The major difference with OwnerMap is entries must extend from
ObjectIdOwnerMap.Entry, where the OwnerMap has injected its own
private "next" field into each object. This allows the OwnerMap to use
a singly linked list for chaining collisions within a bucket. By
putting collisions in a linked list, we gain the entire table back for
the SHA-1 bits to index their own "private" slot.
Unfortunately this means that each object can appear in at most ONE
OwnerMap, as there is only one "next" field within the object instance
to thread into the map. For types that are very object map heavy like
RevWalk (entity RevObject) and PackWriter (entity ObjectToPack) this
is sufficient, these entity types are only put into one map by their
container. By introducing a new map type, we don't break existing
applications that might be trying to use ObjectIdSubclassMap to track
RevCommits they obtained from a RevWalk.
The OwnerMap uses less memory. Each object uses 1 reference more (so
we're up 1,886,362 references), but the table is 1/2 the size (2^20
rather than 2^21). The table itself wastes only 210,790 slots, rather
than 2,307,942. So OwnerMap is wasting 200k fewer references.
OwnerMap is more friendly to the GC, because it hardly ever generates
garbage. As the map reaches its 100% load factor target, it doubles in
size by allocating additional segment arrays of 2048 entries. (So the
first grow allocates 1 segment, second 2 segments, third 4 segments,
etc.) These segments are hooked into the pre-allocated directory of
1024 spaces. This permits the map to grow to 2 million objects before
the directory itself has to grow. By using segments of 2048 entries,
we are asking the GC to acquire 8,204 bytes in a 32 bit JVM. This is
easier to satisfy then 2,307,942 bytes (for the 512k table that is
just an intermediate step in the SubclassMap). By reusing the
previously allocated segments (they are re-hashed in-place) we don't
release any memory during a table grow.
When the directory grows, it does so by discarding the old one and
using one that is 4x larger (so the directory goes to 4096 entries on
its first grow). A directory of size 4096 can handle up to 8 millon
objects. The second directory grow (16384) goes to 33 million objects.
At that point we're starting to really push the limits of the JVM
heap, but at least its many small arrays. Previously SubclassMap would
need a table of 67108864 entries to handle that object count, which
needs a single contiguous allocation of 256 MiB. That's hard to come
by in a 32 bit JVM. Instead OwnerMap uses 8192 arrays of about 8 KiB
each. This is much easier to fit into a fragmented heap.
Change-Id: Ia4acf5cfbf7e9b71bc7faa0db9060f6a969c0c50
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Use the Java 6 like services approach to find all supported
TransportProtocols within the CLASSPATH and load them all for use.
This allows users to inject additional protocol implementations simply
by putting their JARs on the application CLASSPATH, provided the
protocol author has written the proper services file.
Change-Id: I7a82d8846e4c4ed012c769f03d4bb2461f1bd148
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The new TransportProtocol type describes what a particular Transport
implementation wants in order to support a connection. 3rd parties
can now plug into the Transport.open() logic by implementing their
own TransportProtocol and Transport classes, and registering with
Transport.register().
GUI applications can help the user configure a connection by looking
at the supported fields of a particular TransportProtocol type, which
makes the GUI more dynamic and may better support new Transports.
Change-Id: Iafd8e3a6285261412aac6cba8e2c333f8b7b76a5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>