Rewrite this complicated logic to examine each pack file exactly
once. This reduces thrashing when there are many large pack files
present and the reader needs to locate each object's header.
The intermediate temporary list is now smaller, it is bounded to
the same length as the input object list. In the prior version of
this code the list contained one entry for every representation of
every object being packed.
Only one representation object is allocated, reducing the overall
memory footprint to be approximately one reference per object found
in the current pack file (the pointer in the BlockList). This saves
considerable working set memory compared to the prior version that
made and held onto a new representation for every ObjectToPack.
Change-Id: I2c1f18cd6755643ac4c2cf1f23b5464ca9d91b22
Clip the configured limit to Integer.MAX_VALUE at the top of the
loop, saving a compare branch per object considered. This can cut
2M branches out of a repacking of the Linux kernel.
Rewrite the logic so the primary path is to match the conditional;
most objects are larger than BLKSZ (16 bytes) and less than limit.
This may help branch prediction on CPUs if the CPU tries to assume
execution takes the side of the branch and not the second.
Change-Id: I5133d1651640939afe9fbcfd8cfdb59965c57d5a
There is no reasonable way for a subclass to correctly override and
implement these methods. They depend on internal state that cannot
otherwise be managed.
Most of these methods are also in critical paths of PackWriter.
Declare them final so subclasses do not try to replace them,
and so the JIT knows the smaller ones can be safely inlined.
Change-Id: I9026938e5833ac0b94246d21c69a143a9224626c
None of these methods should ever be overridden at runtime by an
extension class. Given how small they are the JIT should perform
inlining where reasonable. Hint this is possible by marking all
methods final so its clear no replacement can be loaded later on.
Change-Id: Ia75a5d36c6bd25b24169e2bdfa360c8f52b669cd
This flag is never checked on its own. It is only checked as part
of a pair through the doNotAttemptDelta() method. Delete the method
so there is less confusion about the flag being used on its own.
Change-Id: Id7088caa649599f4f11d633412c2a2af0fd45dd8
This method is only invoked with true as the argument.
Remove the unnecessary parameter and branch, making
the code easier for the JIT to optimize.
Change-Id: I68a9cd82f197b7d00a524ea3354260a0828083c6
Added also tests and the associated option for the command line Merge
command.
Bug: 335091
Change-Id: Ie321c572284a6f64765a81674089fc408a10d059
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Due to the Git internal sort order a directory is sorted as if it ended
with a '/', this means that the path filter didn't set the last possible
matching entry to the correct value. In the reported issue we had the
following filters.
org.eclipse.jgit.console
org.eclipse.jgit
As an optimization we throw a StopWalkException when the walked tree
passes the last possible filter, which was this:
org.eclipse.jgit.console
Due to the git sorting order, the tree was processed in this order:
org.eclipse.jgit.console
org.eclipse.jgit.test
org.eclipse.jgit
At org.eclipse.jgit.test we threw the StopWalkException preventing the
walk from completing successfully.
A correct last possible match should be:
org.eclipse.jgit/
For simplicit we define it as:
org/eclipse/jgit/
This filter would be the maximum if we also had e.g. org and org.eclipse
in the filter, but that would require more work so we simply replace all
characters lower than '/' by a slash.
We believe the possible extra walking does not not warrant the extra
analysis.
Bug: 362430
Change-Id: I4869019ea57ca07d4dff6bfa8e81725f56596d9f
1. I have authored 100% of the content I'm contributing,
2. I have the rights to donate the content to Eclipse,
3. I contribute the content under the EDL
Change-Id: I48b1828e0b1304f76276ec07ebac7ee9f521b194
Problem:
LogCommand.all() throws an IncorrectObjectTypeException when
there are tag references, and the repository does not contain
the file "packed-refs". It seems that the references were not properly
peeled before being added to the markStart() method.
Solution:
Call getRepository().peel() on every Ref that has isPeeled()==false
in LogCommand.all() .
Added test case for LogCommand.all() on repo with a tag.
1. I have authored 100% of the content I'm contributing,
2. I have the rights to donate the content to Eclipse,
3. I contribute the content under the EDL
Bug: 402025
Change-Id: Idb8881eeb6ccce8530f2837b25296e8e83636eb7
Instead of re-reading all refs after each update, execute
the deletes first, then read all refs once and perform
the check for conflicting ref names in memory.
Change-Id: I17d0b3ccc27f868c8497607d8e57bf7082e65ba3
Allowed ipv6-address in a uri like:
http://[::1]:8080/repo.git
Change-Id: Ia00a20f694b2e9314892df77f9b11f551bb1d34e
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
This wrong book-keeping caused IOExceptions to be thrown because
LockFile.unlock() erroneously tried to delete the non-existing lock
file. These IOExeptions were hidden since they were silently caught.
Change-Id: If42b6192d92c5a2d8f2bf904b16567ef08c32e89
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Instead of forcing the implementation of the DFS backend to handle
making sure the extension bits are set correctly, have the common
callers in JGit set the extension at the same time they supply the
file sizes to the pack description. This simplifies assumptions for
an implementation of the DFS backend.
Change-Id: I55142ad8ea08a3e2e8349f72b3714578eba9c342
On Windows renameTo will not overwrite a file, so it must be deleted
first. The fix for Bug 402834 did not account for that.
Bug: 403685
Change-Id: I3453342c17e064dcb50906a540172978941a10a6
Unlike the OS or Java rename this method will (on *nix) try (on Windows)
replace the target with the source provided the target does not exist,
the target does exist and is a file, or if it is a directory which only
contains directories. In the latter case the directory hierarchy will be
deleted.
If the initial rename fails and the target is an existing file the the
target file will be deleted first and then the rename is retried.
Change-Id: Iae75c49c85445ada7795246a02ce02f7c248d956
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Allow users to provide their OutputStream (via Transport#
push(monitor, refUpdates, out)) so that server messages can be written
to it (in SideBandInputStream) while they're coming in.
CQ: 7065
Bug: 398404
Change-Id: I670782784b38702d52bca98203909aca0496d1c0
Signed-off-by: Andre Dietisheim <andre.dietisheim@gmail.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When running the garbage collection for a repository it is often
interesting to compare the repository statistics from before and after
the garbage collection to understand the effect of the garbage
collection. This is why it makes sense that the
GarbageCollectionCommand provides a method to retrieve the repository
statistics before running the garbage collection.
So far without running the garbage collection the repository statistics
can only be retrieved by using JGit internal classes. This is what EGit
and Gerrit do at the moment, but it would be better to have an API for
this.
Change-Id: Id7e579157e9fbef5cfd1fc9f97ada45f0ca8c379
Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This will simplify to adapt EGit to the removal of FileRepository from
jgit's public API in change I2ab1327c202ef2003565e1b0770a583970e432e9.
Change-Id: If98b0b97e8f13a94d4ea7ba1be0f90d82b0fba4b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Only on Windows the rename operation which renames temporary Packfiles
(and index-files and bitmap-files) sometime fails. This happens only
when renaming a temporary Packfile to a Packfile which already exists.
Such situations occur if you run GC twice on a repo without modifying
the repo inbetween.
In such situations there was bug in GC which led to a corrupted repo
whithout any packfiles anymore. This commit fixes the problem by
introducing a utility method which renames a file and throws an
IOException if it fails. This method also takes care to repeat a
failing rename if our FS class has found out we are running on a
platform with a unreliable File.renameTo() method.
I am searching for a better solution because even with this utility
method in hand a GC on a already GC'ed repo will fail on Windows. But
at least with this fix we will not produce corrupted repos anymore.
Bug: 389305
Change-Id: Iac1ab3e0b8c419c90404f2e2f3559672eb8f6d28
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
With JGit it is possible to write reflog entries where new objectid and
old objectid is null. Such reflogs cause FileRepository GC to crash
because it doesn't expect the new objectid to be null. One case where
this happened is in Gerrit's allProjects repo. In the same way as we
expect the old objectid to be potentially null we should also ignore
null values in the new objectid column.
Change-Id: Icf666c7ef803179b84306ca8deb602369b8df16e
This breaks all existing callers once. Applications are not supposed
to build against the internal storage API unless they can accept API
churn and make necessary updates as versions change.
Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
* changes:
Remove cached_packs support in favor of bitmaps
Remove objects before optimization from DfsGarbageCollector
Simplfy caching of DfsPackDescription from PackWriter.Statistics
As noticed by Robin Rosenberg in review of
I4eb87c850078ca187b38b81cc91c92afb1176945.
Change-Id: If96d66b6c025ad8f2f47829c933f3c65ab6cbeef
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Continuing is trickier, as .git/rebase-apply contains no message file
and no git-rebase-todo.
Bug: 336820
Change-Id: I4eb87c850078ca187b38b81cc91c92afb1176945
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Treat first parent traversals as 1 and higher parents as MERGE_COST,
to match git name-rev. Allow overriding the merge cost during tests to
avoid creating 2^16 commits on the fly.
Change-Id: I0175e0c3ab1abe6722e4241abe2f106d1fe92a69
This is helpful for writing the pack configuration into a log file.
Change-Id: I5e7f5ff7e01c9538ca12a1860844ba9b467bdf05
Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
This is helpful for writing the repository statistics into a log file.
Change-Id: I0e8cd9ad05f123ab3851960890a50213f353a373
Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
The bitmap code in PackWriter knows exactly when to use a pack as
a "cached pack". It enables cached pack usage only when the pack
has a bitmap and its entire closure of objects needs to be sent.
This is a much simpler code path to maintain, and JGit actually
has a way to write the necessary index.
Change-Id: I2645d482f8733fdf0c4120cc59ba9aa4d4ba6881
Just counting objects is not sufficient. There are some race
conditions with receive packs and delta base completion that
may confuse such a simple algorithm.
Instead always do the larger set computations, and rely on the
PackWriter having no objects pending as the way to avoid creating
an empty pack file.
Change-Id: Ic81fefb158ed6ef8d6522062f2be0338a49f6bc4
Let the pack description copy the relevant stats values. This
moves it out of the garbage collector and compactor algorithms,
co-locating with something that might care.
Remove some unnecessary code from the DfsPackCompactor, the stats
tracks the same information and can supply it.
Change-Id: Id64ab38d507c0ed19ae0d106862d175b7364eba3
Prefer ~(N+1) to ^1~N. Although both are correct, the former is
cleaner and matches "git name-rev".
Change-Id: I772001a219e5eb346f5552c92e6d98c70b2cfa98
The walk logic does not use RevWalk because it needs to walk all paths
to each of the requested commits, keeping track of each path along which
the commit was found in the RevCommit subclass. From these paths, a
single "best" path is chosen based on the total path length, with a
penalty applied for paths that traverse merges.
This functionality parallels "git name-rev".
Change-Id: I92bfb47dd16c898313d2ee525395609c3bf72ebe
This fixes two cases:
- A folder without tracked content exist both in the workdir and merged
commit, as long as there names within that folder does not conflict.
- An empty folder structure exists with the same name as a file in the
merged commit.
Bug: 402834
Change-Id: I4c5b9f11313dd1665fcbdae2d0755fdb64deb3ef
Clients send a bunch of unknown objects to UploadPack on each round
of negotiation. Many of these are not known to the server, which
leads the implementation to be looking at indexes for garbage packs.
Disable examining the index of a garbage pack, allowing servers to
avoid reading them from disk during negotiation.
The effect of this change is the server will only ACK a have line
if the object was reachable during the last garbage collection,
or was recently added to the repository. For most repositories
there is no impact in this behavior change.
If a repository rewinds a branch, runs GC, and then resets the
branch back to where it was before, the now current tip is going to
be skipped by this change. A client that has the commit may wind up
getting a slightly larger data transfer from the server as an older
common ancestor will be chosen during negotiation. This is fixable
on the server side by running GC again to correct the layout of
objects in pack files.
Change-Id: Icd550359ef70fc7b701980f9b13d923fd13c744b
The DHT backend was very slow at parsing objects. To work around
that performance limitation I obfuscated UploadPack by folding both
the want and have sets together in a single parse queue. Since DHT
was removed the complexity is no longer constructive to JGit.
Doing this refactoring prepares the code for a slightly future
change where the have lines need to be handled specially from the
want lines. Splitting the parsing up into two phases makes such
a modification trivial.
Change-Id: If7aad533b82448bbb688278e21f709282e5ccf4b
Garbage is unlikely to be used by a reader. Ensure they always
cluster at the end of the search list, no matter what timestamp
was used on the pack files.
Change-Id: I3bed89e9569ee3363c36bb3f73fcd34057a3883f
If a repository has significant amounts of unreachable garbage the
final phase to coalesce it can take longer than any other part of the
garbage collection phase. Provide a setting for applications to tweak
the threshold where coalescing ends and files just remain on disk.
Change-Id: I5f11a998a7185c75ece3271d8bc6181bb83f54c1
Rebase computes the list of commits that are included in
the merges, just like Git does, so do not try to include
the merge commits. Re-recreating merges during rebase is
a bit more complicated and might be a useful future extension,
but for now just linearize during rebase.
Change-Id: I61239d265f395e5ead580df2528e46393dc6bdbd
Signed-off-by: Robin Stocker <robin@nibor.org>
The new option EMPTY_DIRECTORIES_ONLY will make delete() only delete
empty directories. Any attempt to delete files will fail. Can be
combined with RECURSIVE to wipe out entire tree structures and
IGNORE_ERRORS to silently ignore any files or non-empty directories.
Change-Id: Icaa9a30e5302ee5c0ba23daad11c7b93e26b7445
Signed-off-by: Robin Stocker <robin@nibor.org>
If a pack file has been marked invalid due to a prior IOException
accessing its contents, do not offer its bitmap index to callers.
The pack cannot be used so its bitmap should be off limits from
any reader trying to work from a bitmap.
Change-Id: Ia44e46558abdddee560bb184158b1e0af9437eee
Bitmaps provide a huge performance boost for counting objects and they
play nice with the cgit implementation.
Change-Id: I33b05a6c8f1ee2df7770f0b9fdc50d0b4bbf1029
Update the dfs and file GC implementations to prepare and write
bitmaps on the packs that contain the full closure of the object
graph. Update the DfsPackDescription to include the index version.
Change-Id: I3f1421e9cd90fe93e7e2ef2b8179ae2f1ba819ed
Update the PackWriter to support writing out pack bitmap indexes,
a parallel ".bitmap" file to the ".pack" file.
Bitmaps are selected at commits every 1 to 5,000 commits for
each unique path from the start. The most recent 100 commits are
all bitmapped. The next 19,000 commits have a bitmaps every 100
commits. The remaining commits have a bitmap every 5,000 commits.
Commits with more than 1 parent are prefered over ones
with 1 or less. Furthermore, previously computed bitmaps are reused,
if the previous entry had the reuse flag set, which is set when the
bitmap was placed at the max allowed distance.
Bitmaps are used to speed up the counting phase when packing, for
requests that are not shallow. The PackWriterBitmapWalker uses
a RevFilter to proactively mark commits with RevFlag.SEEN, when
they appear in a bitmap. The walker produces the full closure
of reachable ObjectIds, given the collection of starting ObjectIds.
For fetch request, two ObjectWalks are executed to compute the
ObjectIds reachable from the haves and from the wants. The
ObjectIds needed to be written are determined by taking all the
resulting wants AND NOT the haves.
For clone requests, we get cached pack support for "free" since
it is possible to determine if all of the ObjectIds in a pack file
are included in the resulting list of ObjectIds to write.
On my machine, the best times for clones and fetches of the linux
kernel repository (with about 2.6M objects and 300K commits) are
tabulated below:
Operation Index V2 Index VE003
Clone 37530ms (524.06 MiB) 82ms (524.06 MiB)
Fetch (1 commit back) 75ms 107ms
Fetch (10 commits back) 456ms (269.51 KiB) 341ms (265.19 KiB)
Fetch (100 commits back) 449ms (269.91 KiB) 337ms (267.28 KiB)
Fetch (1000 commits back) 2229ms ( 14.75 MiB) 189ms ( 14.42 MiB)
Fetch (10000 commits back) 2177ms ( 16.30 MiB) 254ms ( 15.88 MiB)
Fetch (100000 commits back) 14340ms (185.83 MiB) 1655ms (189.39 MiB)
Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d
A pack bitmap index is an additional index of compressed
bitmaps of the object graph. Furthermore, a logical API of the index
functionality is included, as it is expected to be used by the
PackWriter.
Compressed bitmaps are created using the javaewah library, which is a
word-aligned compressed variant of the Java bitset class based on
run-length encoding. The library only works with positive integer
values. Thus, the maximum number of ObjectIds in a pack file that
this index can currently support is limited to Integer.MAX_VALUE.
Every ObjectId is given an integer mapping. The integer is the
position of the ObjectId in the complete ObjectId list, sorted
by offset, for the pack file. That integer is what the bitmaps
use to reference the ObjectId. Currently, the new index format can
only be used with pack files that contain a complete closure of the
object graph e.g. the result of a garbage collection.
The index file includes four bitmaps for the Git object types i.e.
commits, trees, blobs, and tags. In addition, a collection of
bitmaps keyed by an ObjectId is also included. The bitmap for each entry
in the collection represents the full closure of ObjectIds reachable
from the keyed ObjectId (including the keyed ObjectId itself). The
bitmaps are further compressed by XORing the current bitmaps against
prior bitmaps in the index, and selecting the smallest representation.
The XOR'd bitmap and offset from the current entry to the position
of the bitmap to XOR against is the actual representation of the entry
in the index file. Each entry contains one byte, which is currently
used to note whether the bitmap should be blindly reused.
Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
Update the ObjectReuseAsIs API to support creating new
ObjectToPack with only the AnyObjectId and Git object type. This is
needed to support the future pack index bitmaps, which only contain
this information and do not want the overhead of creating a temporary
object for every ObjectId.
Change-Id: I906360b471412688bf429ecef74fd988f47875dc
CloneCommand has been creating fetch refspecs like this on bare clones:
[remote "origin"]
url = ssh://example.com/my-repo.git
fetch = +refs/heads/*:refs/heads//*
As you can see, the destination ref pattern has a superfluous slash.
It looks like this behaviour has always been the case for CloneCommand,
at least since cc2197ed when code catering to bare-clone fetch refspecs
was added. That was released with JGit v1.0 almost 2 years ago, so
there will probably be some bare repos in the wild which will have been
cloned with JGit and have these corrupted refspecs.
The effect of the corrupted fetch refspec is quite interesting. Up to
and including JGit 2.0, the corrupt refspec was tolerated and fetches
would work as intended with no indication to the user that anything was
amiss. With JGit 2.1, a change was introduced which made JGit less
tolerant, and fetches now attempt to update the non-existing ref
"refs/heads//master". No exception is raised, but the real ref -
"refs/heads/master" - is not updated.
This behaviour was noticed by a user of Agit (which does bare clones by
default and recently updated from JGit v2.0 to v2.2), reported here:
https://github.com/rtyley/agit/issues/92
If you run C-Git fetch on a bare-repo cloned by JGit, it flat-out
rejects the refspec (checked against v1.7.10.4):
fatal: Invalid refspec '+refs/heads/*:refs/heads//*'
Incidentally, C-Git does not create an explicit fetch refspec at all
when performing a bare clone - the full remote config generated by C-Git
looks like this:
[remote "origin"]
url = ssh://example.com/my-repo.git
Using JGit on such a repository works fine, so omitting the fetch
refspec entirely is also an option.
Change-Id: I14b0d359dc69b8908f68e02cea7a756ac34bf881
No longer invoke the expensive RefDatabase.isNameConflicting() check on
updating existing refs, reducing batch ref update time by ~97%.
The RefDirectory implementation of isNameConflicting() is quite
slow (it has to do an expensive loose-ref scan) but it's only necessary
to perform this check on ref update if the ref is being *created* - if
the ref already exists, we can already guarantee that it does not
conflict with any other refs.
C-Git seems to use a similar condition before making the
is_refname_available() check:
https://github.com/git/git/blob/v1.8.1.4/refs.c#L1660-L1670
As an example of the effects on performance, here's a simple timing
experiment using The BFG to remove one file from the JGit repo:
---
$ wget http://repo1.maven.org/maven2/com/madgag/bfg-repo-cleaner/1.0.1/bfg-1.0.1.jar
$ git clone --mirror https://git.eclipse.org/r/p/jgit/jgit.git
$ java -jar bfg-1.0.1.jar -D make_jgit.sh jgit.git
....
Updating references: 100% (5760/5760)
...Ref update completed in 148,949 ms.
BFG run is complete!
---
The execution time for the run is completely dominated by the batch ref
update at the end. Repeating the experiment with BFG v1.0.2 (using JGit
patched with this change), the refs update is dramatically reduced:
---
Updating references: 100% (5760/5760)
...Ref update completed in 4,327 ms.
---
Change-Id: I9057bc4ee22f9cc269b1cc00c493841c71527cd6
Previously a PackFile class was assumed to only support a .pack and .idx
file. Update the constructor to enumerate the supported extensions for
the pack file. This will allow the bitmap code to only be executed if
the bitmap extension file is known to exist.
Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf
When a lot of commits are added to DateRevQueue, the
sort-on-insertion approach is very heavy on CPU cycles.
One approach to fix this was made by Dave Borowitz:
https://git.eclipse.org/r/#/c/5491/
But using Java's PriorityQueue seems to have brought some
extra overhead, and the desired performance could not be
reached.
This fix takes another approach to the insertion problem,
without changing the expected behaviour or bringing extra
memory overhead:
If we detect over 1000 commits in the DateRevQueue, a
"seek-index" is rebuilt every 1000th added commit.
The index keeps track of every 100th commit in the
DateRevQueue. During insertions, it will be used for a
preliminary scanning (binary search) of the queue, with
the intention of helping add() find a good starting point
to start walking from. After finding this starting point,
add() will step commit-by-commit until the correct
insertion place in the queue is found (today, the queue
is expected to be sorted at all times).
When applied to repositories with many refs, this approach
has proven to bring huge performance gains and scales quite
well.
For instance, in a repository with close to 80000 refs,
we could cut down the time a typical Gerrit replication
of 1 commit would take (just a push from JGit's point of
view) from 32sec down to 3.5sec.
Below you see some typical times to add a specific amount
of commits (with random commit times) to the DateRevQueue
and the difference the preliminary seek-index makes:
Commits | Index | No Index
1024 8ms 8ms
2048 13ms 9ms
4096 5ms 59ms
8192 11ms 595ms
16384 22ms 3058ms
32768 64ms 13811ms
65536 201ms 62677ms
131072 783ms 331585ms
Only one extra reference is needed for every 100 inserted
commits (and only when we see more than 1000 commits in
the queue), so the memory overhead should be negligible.
Various index-stepping values were tested, and 100 seemed to
scale very well and be effective from start.
In the future, it should probably be dynamic and based on
the number of refs in the queue, but this should serve well
as a starting point.
Note: While other fundamentally different data structures may
be more suitable, the DateRevQueue is extremely central to
many of the Git core operations. This approach was chosen,
since the effect of the patch is easy to predict in conjuction
with the current implementation. A totally new data structure
will make it harder to predict behaviour in many common and
uncommon cases (in terms of breaking ties, memory usage, cost
when using few elements, object creation/disposing overhead,
etc).
Change-Id: Ie7b99f40eacf6324bfb4716d82073adeda64d10f
Extend ResolveMerger with RecursiveMerger to merge two tips
that have up to 200 bases.
Bug: 380314
CQ: 6854
Change-Id: I6292bb7bda55c0242a448a94956f2d6a94fddbaa
Also-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* stable-2.3:
Prepare 2.3.2-SNAPSHOT builds
JGit v2.3.1.201302201838-r
Accept Change-Id even if footer contains not well-formed entries
Fix false positives in hashing used by PathFilterGroup
Change-Id: I5882aa3b482d6bcd40a45bed51e5ab03f018a5bc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Instead of only looking for a Change-Id in the last section if it
consists only of well-formed "key: value" lines replace the last
occurrence of a valid Change-Id line in the last section. Some tools
require footer lines e.g. without a colon.
Gerrit doesn't accept Change-Id lines in the footer if the Change-Id
line doesn't start at the beginning of the line.
Bug: 400818
Change-Id: Icce54872adc8c566994beea848448a2f7ca87085
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The ByteArraySet failed to check the length of the entry correctly leading
to matches where no match should be.
Bug: 401249
Change-Id: I925bc48d9cafcdf13e1a797bb09fc2555eb270c5
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
* stable-2.3:
Prepare post 2.3.0.201302130906 builds
JGit v2.3.0.201302130906
Replace explicit version by property where possible
Add better documentation to DirCacheCheckout
Prepare post 2.3rc1 builds
JGit v2.3.0.201302060400-rc1
Change-Id: I5a7626014dc38e7623937a4241dbf8a6db05c1f9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
There is a huge performance issue when using both JGit (EGit) and Git
because JGit does not fill all dircache stat fields with the values Git
would expect. As a result thereof Git would typically revalidate a large
number of tracked files. This can take several minutes for large
repositories with many large files.
Since 1.8.2 Git will restrict stat checking to the size and whole second
part of the modification time stamp, if core.statinfo is set to
"minimal".
As JGit checks only size and modification time this is close to what
JGit already does. To make the match perfect ignore the sub-second part
of the modification time stamp if core.statinfo = minimal.
Change-Id: I8eaff1858a891571075a86db043f9d80da3d7503
This has the same logic as isNameConflicting, but instead of only
returning a boolean, it returns a collection of names that conflict.
It will be used in EGit to provide a better message to the user when
validating a ref name, see Ibea9984121ae88c488858b8a8e73b593195b15e0.
Existing implementations of isNameConflicting could be rewritten like
this:
return !getConflictingNames(name).isEmpty();
But I'm not sure about that, as isNameConflicting can be implemented in
a faster way than getConflictingNames.
Change-Id: I11e0ba2f300adb8b3612943c304ba68bbe73db8a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Instead of the complicated strange stuff, implement staah
apply as cherry-pick.
Provided there are no conflicts and it is requested that
the index should be applied, perform yet another cherry-pick,
but discard tha results thereof it that would result in conflicts.
Bug: 376035
Change-Id: I553f3a753e0124b102a51f8edbb53ddeff2912e2
In order to be able to determine the range of the first header line
(e.g. "diff --git a/file1 b/file2") in subclasses, the code that formats
the first header line is extracted.
Required by egit's change: Ia61398146c0336ab332234f24d341561292554db
Change-Id: I9dd5eb964ed8b6869745c3162159b7425ac2c44a
Signed-off-by: Tobias Pfeifer <to.pfeifer@sap.com>
getObjectList() returns a list of ObjectToPack. These can hold on to a
lot of memory. Furthermore, binary searching for objects in a sorted
array can be slow. Improve the speed and reduce the memory by creating a
copy of the ObjectId and inserting it into an ObjectIdOwnerMap.
Change-Id: Ib5aa5b7447e05938b47fa55812a87b9872c20ea7
This adds a new optional TreeFilter[] argument to DiffEntry.scan. All
filters will be checked during the scan to determine if an entry should
be "marked" with regard to that filter.
After having called scan, the user can then call isMarked(int) on the
entries to find out whether they matched the TreeFilter with the passed
index.
An example use case for this is in the file diff viewer of EGit's
History view, where we'd like to highlight entries that are matching the
current filter.
See EGit change I03da4b38d1591495cb290909f0e4c6e52270e97f.
Bug: 393610
Change-Id: Icf911fe6fca131b2567514f54d66636a44561af1
Signed-off-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In case of a conflict during cherry-pick or revert the commit message
was amended after the footer. This made the footer invalid. Many users
do not understand that they have to edit the commit message in order to
make it valid again.
Change-Id: I7e7fae125129e2a0d8950510550acda766531835
Bug: 367416
Additionally expose methods to find the first footer line and to find
the position of the ChangeId footer in the commit message in order to
enable reuse of these methods introduced for the fix.
Change-Id: I87ecca009ca3bff1ca0de3c436ebd95736bf5a10
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Previously the result of an application would have been \r\r\n in the
case of windows line endings, as RawText does not touch the \r, and
ApplyCommand adds "\r\n" if this is the ending of the first line in the
target file. Only always adding \n should be ok, since \r\n would be the
result if the file and the patch include windows line endings.
Also add according test.
Change-Id: Ibd4c4948d81bd1c511ecf5fd6c906444930d236e
Previously, the Dfs GC excluded objects from packs by passing a
previously written index to the PackWriter. Reading back a file on
Dfs is slow. Instead, allow the PackWriter to expose the objects
included in a pack and forward that to invocations of excludeObjects() .
Change-Id: I377cb4ab07f62cf790505e1eeb0b2efe81897c79
FastForwardMode is represented by different strings depending on context
it is set or get from. E.g. FastForwardMode.FF_ONLY for
branch.<name>.mergeoptions is "--ff-only" but for merge.ff it is "only".
Change-Id: I39ae93578e4783de80ebf4af29ae23b3936eec47
PackConstants previously contained string values for the pack and pack
index extension. Change PackConstant to be PackExt, a typed wrapper
around the string pack file extension.
Change-Id: I86ac4db6da8f33aa42d6f37cfcc119e819444318
The primary purpose of the filter is to detect an index change that
could possibly lead to a change in what files are visible in the staging
view and decorations. Besides what TreeFilter.ANY_DIFF does for trees in
general, this filter also looks at the assume-valid (CE_VALID) flag to
see whether changes should be ignored or not.
Change-Id: I13e9ed4ae62dc3851204fba598239edce07ca977
The method is used in only one location (DfsPackFile). Furthermore,
PackIndex already does an explicit computation of the size in
DfsPackFile. Simplify the DfsPackDescription by removing the method
and do the calculation similar to PackIndex.
Change-Id: I1391fdaaf7c2c3226d96ada1ae8647bcdff4794e
Previously the size getters and setters had explicit methods for index
and pack. Update the api to be based on the file extension. This will
make it possible to support other extensions in the future, such as
the forthcoming bitmap extensions.
Change-Id: Iab9d4abe0af65b2fc71ad71ef1db0feb6b3b5c58
If multiple threads attempted to insert loose objects into the same new
fan-out directory, the creation of that directory was subject to a race
condition that could lead to an unnecessary IOException being thrown -
because an inserter could not 'create' a directory that had just been
generated by a different thread. All we require is that the directory
does indeed *exist*, so not being able to _create_ it is not actually a
fatal problem. Setting 'skipExisting' to 'true' on the call to mkdir()
fixes the issue.
I found this issue as a real world occurrence while working on The BFG
Repo Cleaner (https://github.com/rtyley/bfg-repo-cleaner), a tool which
concurrently performs a lot of object creation.
In order to demonstrate the problem here I've added a small test case
which reliably reproduces the issue on the few different hardware
systems I've tried. The error thrown when the race-condition arises is
this:
java.io.IOException: Creating directory /home/roberto/repo.git/objects/e6 failed
at org.eclipse.jgit.util.FileUtils.mkdir(FileUtils.java:182)
at org.eclipse.jgit.storage.file.ObjectDirectory.insertUnpackedObject(ObjectDirectory.java:590)
at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insertOneObject(ObjectDirectoryInserter.java:113)
at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insert(ObjectDirectoryInserter.java:91)
at org.eclipse.jgit.lib.ObjectInserter.insert(ObjectInserter.java:329)
Change-Id: I88eac49bc600c56ba9ad290e6133d8a7113125ab
This converts a checkout conflict exception into a RebaseResult /
MergeResult containing the conflicting paths, which enables EGit (or
others) to handle the situation in a user-friendly way
Change-Id: I48d9bdcc1e98095576513a54a225a42409f301f3
This is necessary because some versions of JGit containing
the flawed c98abc9c05 were
used in the wild and wrote bad configuration files. We now
must accept this value in addition to the preferred case.
Change-Id: I3ed5451735658df6381532499130e5186805024a
Previously, the FileObjDatabase required both the pack file path and
index file path to be passed to openPack(). A future change to add
a bitmap index will add a .bitmap file parallel to the pack file
(similar to the .idx file). Update the PackFile to support
automatically loading pack index extensions based on the pack file
path.
Change-Id: Ifc8fc3e57f4afa177ba5a88df87334dbfa799f01
Previously, the DfsObjDatabase had a hardcoded getPackFile() and
getPackIndex() methods which opens a .pack and .idx file, respectively.
A future change to add a bitmap index will need to be stored in a
parallel .bitmap file. Update the DfsObjDatabase to support opening and
writing of files for any pack extension.
Change-Id: I7c403b501e242096a2d435f6865d6025a9f86108
Once we start talking about parents of tags, we are in the commit
graph, so treat all objects from this point as commits. This fixes
spurious IncorrectObjectTypeExceptions on resolving expressions like
tag^^.
Change-Id: I29ece1fdb49c9c5b9ca415efcd1876bc72e97120