PackWriterBitmapPreparer (which is in another package) is already well
aware of the mapping between EWAHCompressedBitmaps and the
higher-level CompressedBitmap objects of the BitmapIndexImpl API.
Making the CompressedBitmap type public makes the translation more
obvious and wouldn't break any abstractions that aren't already
broken. So expose it.
This is all under org.eclipse.jgit.internal so there are no API
stability guarantees --- we can change the API if internals change
(for example if some day there are bitmaps spanning multiple packs).
In particular this means the confusing toBitmap helper can be removed.
Change-Id: Ifb2e8804a6d5ee46e82a76d276c4f8507eaa2a4c
Update the project-specific Eclipse settings to replace the use of the
org.eclipse.jdt.annotation.Nullable class the new JGit-specific
@Nullable annotation. I verified that Eclipse reports errors when the
return value of a method annotated with
@org.eclipse.jgit.annotations.Nullable is dereferenced without a null
check.
Also remove the Maven and MANIFEST.MF dependencies on
org.eclipse.jdt.annotation.
Eclipse null analysis uses three annotations: @Nullable, @NonNull and
@NonNullByDefault. All three are updated in this patch because it is
invalid to set the Eclipse preferences to empty values. So far only
@Nullable has been introduced in org.eclipse.jgit.annotations.
My personal preference is to follow the advice in Effective Java and
avoid the null-return idiom, and to avoid passing null values in
general. This sets the expectation is that arguments and return types
are assumed non-null unless otherwise documented. If that is the
expectation, then consistent application of @NonNull is redundant and
hurts readability by cluttering the code, obscuring the occasional
@Nullable annotation that really requires attention.
If the JGit community decides there is value in using the @NonNull and
@NonNullByDefault annotations we can add them--this change configures
Eclipse to use them.
Change-Id: I9af1b786d1b44b9b0d9c609480dc842df79bf698
Signed-off-by: Terry Parker <tparker@google.com>
Update existing @Nullable uses to use the JGit-internal version of the
annotation.
Change-Id: I95234ffad4c3110f9597fbd3a2760f653e22ecf7
Signed-off-by: Terry Parker <tparker@google.com>
Commit 847b3d1 enabled annotation-based NPE analysis to JGit.
In so doing, it introduced a new dependency on the org.eclipse.jdt that
is undesirable. Follow Gerrit's lead by adding an internal Nullable type
(see
https://gerrit.googlesource.com/gerrit/+/stable-2.11/gerrit-common/src/main/java/com/google/gerrit/common/Nullable.java).
The javax.annotations.Nullable class uses a retention policy of RUNTIME,
whereas the org.eclipse.jdt.annotation.Nullable class used a policy of
CLASS. Since I'm not aware of tools that make use of the annotation at
runtime I chose the CLASS policy. If in the future there is a benefit to
retaining the annotations in the running binary we can update this
class.
Change-Id: I63dc8f9a6dc46b517cbde211bb5e2f8521a54d04
Signed-off-by: Terry Parker <tparker@google.com>
A CompressedBitmap represents a pair (EWAH bit vector, PackIndex
assigning bit positions to git objects). The bit vector is a member
field and the PackIndex is implicit via the 'this' reference to the
outer class.
Make this clearer by making CompressedBitmap a static class and
replacing the 'this' reference by an explicit field.
Likewise for CompressedBitmapBuilder.
Change-Id: Id85659fc4fc3ad82034db3370cce4cdbe0c5492c
Suggested-by: Terry Parker <tparker@google.com>
When creating bitmaps during gc, the bitmaps in tipCommitBitmaps are
built in setupTipCommitBitmaps using the following procedure:
0. create a bitmap ('reuse') that lists all ancestors of commits
whose existing bitmaps will be reused. I will call this the
reused part of history.
1. initialize a bitmap for each of the pack's "want"s by taking
a copy of the 'reuse' bitmap and setting the bit corresponding
to the one wanted commit.
2. walk through ancestors of wants, excluding the reused part of
history. Add parents of visited commits to bitmaps that have
those commits.
3. AND-NOT each tipCommitBitmap against the 'reuse' bitmap
4. Sort the bitmaps and AND-NOT each against the previous so they
partition the new commits.
The OR against 'reuse' in step 1 and the AND-NOT against 'reuse'
cancel each other out, except when commits from the reused part of
history are added to a bitmap in step 2. So avoid adding commits from
the reused part of history in step 2 and skip the OR and AND-NOT.
Performance impact (thanks to Terry for measuring):
The initial "selecting bitmaps" phase of garbage collection decreased
from (83 + 81 + 85) / 3 = 83 to (56 + 57 + 56) / 3 = 56.3, meaning
nearly a ~50% speedup of that phase.
Tested-by: Terry Parker <tparker@google.com>
Change-Id: I26ea695809594347575d14a1d8e6721b8608eb9c
Instead of using the newRevFilter helper, call the appropriate
RevFilter constructor directly. This means one less hop to find
documentation about what the RevFilter will do.
Change-Id: Ida6ff1c0457a47a1bd1e4ed0fd1dd42a616d214f
When setupTipCommitBitmaps is called, writeBitmaps does not have any
bitmaps saved, so these calls to .add always add a single commit and
do not OR in a bitmap.
The objects returned by nextObject after a commit walk is finished
are trees and blobs. Non-commit objects do not have bitmaps
associated so the call to .add also can only add a single object.
Change-Id: I7b816145251a7fa4f1fffe0d03644257ea7f5440
* changes:
Remove BitmapRevFilter.getCountOfLoadedCommits
Make BitmapBuilder.getBitmapIndex public
Deprecate BitmapBuilder.add and introduce simpler addObject method
Add @Override annotations to BitmapIndexImpl
Rely on bitmap RevFilter to filter walk during bitmap selection
Use 'reused' bitmap to filter walk during bitmap selection
Rely on bitmap RevFilter to filter tip commit setup walk
Use 'reused' bitmap to filter tip commit setup walk
Include ancestors of reused bitmap commits in reuse bitmap again
This is the caller that the BitmapBuilder.add method was designed
around. Moving away from .add makes it more verbose but hopefully
clearer.
Change-Id: I57b1d7c1dc8fb800b242b76c606922b5aa36b9b2
This puts the code for include() in each RevFilter returned by
newRevFilter in one place and should make the code easier to
understand and modify.
Change-Id: I590ac4604b61fc1ffe7aba2ed89f8139847ccac3
The count of loaded commits is equal to the number of commits returned
by the walk. Simplify BitmapRevFilter by counting them in the caller.
Change-Id: Ia95da47831d9e89d6f8068470ec4529aaabfa7dd
Every Bitmap in current JGit code has an associated BitmapIndex. Make
it public in BitmapBuilder to make retrieving bitmaps to OR in from
that index easier.
Change-Id: I2773aa94d8b67f12194608e6317c0792a5de21e2
The BitmapIndex.BitmapBuilder.add API is subtle:
/**
* Adds the id and the existing bitmap for the id, if one
* exists, to the bitmap.
*
* @return true if the value was not contained or able to be
* loaded.
*/
boolean add(AnyObjectId objectId, int type);
Reading the name of the method does not make it obvious what it will
do. Does it add the named object to the bitmap, or all objects
reachable from it? It depends on whether the BitmapIndex owns an
existing bitmap for that object. I did not notice this subtlety when
skimming the javadoc, either. This resulted in enough confusion to
subtly break the bitmap building code (see change
I30844134bfde0cbabdfaab884c84b9809dd8bdb8 for details).
So discourage use of the add() API by deprecating it.
To replace it, provide a addObject() method that adds a single object.
This way, callers can decide whether to use addObject() or or() based
on the context.
For example,
if (bitmap.add(c, OBJ_COMMIT)) {
for (RevCommit p : c.getParents()) {
rememberToAlsoHandle(p);
}
}
can be replaced with
if (bitmap.contains(c)) {
// already included
} else if (index.getBitmap(c) != null) {
bitmap.or(index.getBitmap(c));
} else {
bitmap.addObject(c, OBJ_COMMIT);
for (RevCommit p : c.getParents()) {
rememberToAlsoHandle(p);
}
}
which is more verbose but makes it clearer that the behavior
depends on the content of index.getBitmaps().
Change-Id: Ib745645f187e1b1483f8587e99501dfcb7b867a5
This makes it easier to distinguish between implementations of methods
from the interface from helpers internal to org.eclipse.jgit.internal.storage.*.
This was illegal in Java 5 but JGit requires newer Java these days.
Change-Id: I92c65f3407a334acddd32ec9e09ab7d1d39c4dc6
This RevWalk filters out reused bitmap commits via the 'reuse' bitmap.
Avoid possible wasted time and complexity by not also redundantly
marking them UNINTERESTING.
Change-Id: Ibb714002ddac599963d148a9aab90645fcc73141
When building fullBitmap in order to determine which ancestor chain to
add this commit to, we were excluding the ancestors of reusedCommits
using markUninteresting. This use of markUninteresting is a bit
wasteful because we already have a bitmap indicating exactly which
commits should be excluded (which can save some walking). Use it.
A separate commit will remove the now-redundant markUninteresting
call.
No behavior change intended (except for performance improvement).
Change-Id: I1112641852d72aa05c9a8bd08a552c70342ccedb
This RevWalk filters out reused bitmap commits via the 'reuse' bitmap.
Avoid possible wasted time and complexity by not redundantly marking
them UNINTERESTING any more.
Change-Id: If467ccd1d75e17cf9367b2a0399fca3f9d52adf9
When garbage collecting, we decide to reuse some bitmaps in older
history from the previous pack to save time. The remainder of commit
selection only involves commits not covered by those bitmaps.
Currently we carry that out in two ways:
1. by building a bitmap representing the already-covered commits,
for easy containment checks and AND-NOT-ing against
2. by marking the reused bitmap commits as uninteresting in the
RevWalk that finds new commits
The mechanism in (2) is less efficient than (1): rw.next() will walk
back from reused bitmap commits to check whether the commit it is
about to emit is an ancestor of them, when using the bitmap from (1)
would let us perform the same check with a single contains() call.
Add a RevFilter teaching the RevWalk to perform that same check
directly using the bitmap from (1).
The next time the RevWalk is used, a different RevFilter is installed
so this does not break that.
A later commit will drop the markUninteresting calls.
No functional change intended except a possible speedup.
Change-Id: Ic375210fa69330948be488bf9dbbe4cb0d069ae6
Prior to this change, DfsInserter would not insert an object into a pack
if it already existed in another pack in the repository, even if that
pack was unreachable. Consider this sequence of events:
- Object FOO is pushed to a repository.
- Subsequent ref changes make FOO UNREACHABLE_GARBAGE.
- FOO is subsequently re-inserted using a DfsInserter, but skipped
due to existing in UNREACHABLE_GARBAGE.
- The repository is repacked; FOO will not be written into a new pack
because it is not yet reachable from a reference. If the
UNREACHABLE_GARBAGE packs are deleted, FOO disappears.
- A reference is updated to reference FOO. This reference is now broken
as FOO was removed when the repacking process deleted the
UNREACHABLE_GARBAGE pack that stored the only copy of FOO.
The garbage collector can't safely delete the UNREACHABLE_GARBAGE
pack because FOO might be in the middle of being re-inserted/re-packed.
This change writes a duplicate copy of an object if it only exists in
UNREACHABLE_GARBAGE. This "freshens" the object to give it a chance to
survive long enough to be made reachable through a reference.
Change-Id: I20f2062230f3af3bccd6f21d3b7342f1152a5532
Signed-off-by: Mike Williams <miwilliams@google.com>
Until 320a4142 (Update bitmap selection throttling to fully span
active branches, 2015-10-20), setupTipCommitBitmaps contained code
along the following lines:
for (PackBitmapIndexRemapper.Entry entry : bitmapRemapper) {
if (!reuse(entry))
continue;
RevCommit rc = (RevCommit) rw.peel(rw.parseAny(entry));
reuseCommits.add(new BitmapCommit(rc));
EWAHCompressedBitmap bitmap =
remap.ofObjectType(remap.getBitmap(rc), OBJ_COMMIT);
writeBitmaps.addBitmap(rc, bitmap, 0);
reuse.add(rc, OBJ_COMMIT);
}
writeBitmaps.clearBitmaps(); // Remove temporary bitmaps
This loop OR-ed together bitmaps for commits whose bitmaps would be
reused. A subtle point is the use of the add() method, which ORs in a
bitmap from the BitmapIndex when it exists and falls back to OR-ing in
a single bit when that bitmap does not exist in the BitmapIndex.
Commit 320a4142 removed the addBitmap step, so the bitmap does not
exist in the BitmapIndex and the fallback behavior is triggered.
Simplify and restore the intended behavior by avoiding use of the
subtle use of the add() method --- use or() directly instead.
Change-Id: I30844134bfde0cbabdfaab884c84b9809dd8bdb8
When packing is able to reuse lots of deltas from existing packs, those
objects are marked as "doNotAttemptDelta" and do not contribute to
DeltaTask's computeTopPaths() "totalWeight" calculation.
In the extreme case when all packs are reusable, "totalWeight" will be
zero. DeltaTask.partitionTasks() uses "totalWeight" to determine a
"weightPerThread" size it uses to set up DeltaTasks. When "totalWeight"
is small, partitionTasks() ends up creating a DeltaTask for every
unique path.
For a large repository, the small "weightPerThread" can result in the
creation of >100k tasks (for the MSM 3.10 Linux repository, the count
was ~150k). This makes the "task stealing" mechanism in DeltaTask very
inefficient, because every attempt to steal work does a linear walk
through all tasks, searching for the one with the most work remaining,
which is O(N^2) comparisons. For the MSM 3.10 repository when all
deltas were reusable, PackWriter.parallelDeltaSearch() took
(1615+1633+1458)/3 = 1568 seconds.
The error is that DeltaTask treats the weights of objects marked as
"doNotAttemptDelta" inconsistently. It ignores the weights when
calculating "totalWeight" but uses them when partitioning the tasks.
The fix is to also ignore them when partitioning the tasks.
With this patch applied, PackWriter.parallelDeltaSearch() on the
MSM 3.10 repository when all deltas are reused went from taking
1568 seconds to 62ms (>25k speedup).
This patch also fixes a totalWeight initialization error in
DeltaTask.computeTopPaths().
Change-Id: I2ae37efa83bca42b0e716266ae6aa9d182e76d9c
Signed-off-by: Terry Parker <tparker@google.com>
Avoid leaving the reader in suspense by handling the unusual
(!RevCommit) case first. As a nice side effect, there is less nesting
to keep track of in the rest of the loop body.
No functional change intended.
Change-Id: I1580de444fccde08070f696218c12041151a924a
When the file <git-dir>/hooks/pre-push exists make sure that is is
executing during a push. The pre-push hook runs during git push, after
the remote refs have been updated but before any objects have been
transferred.
Change-Id: Ibbb58ee3227742d1a2f913134ce11e7a135c7f4c
In order to support filters in gitattributes FS.runProcess() is made
public. Support for stdin redirection has been added. Support for binary
data on stdin/stdout (as used be clean/smudge filters) has been added.
Change-Id: Ice2c152e9391368dc5748d7b825a838e3eb755f9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fixed random errors in discoverGitSystemConfig() on Linux where the
process error stream was closed by readPipe() before or while
GobblerThread was reading from it.
Marked readPipe() as @Nullable and fixed potential NPE in
discoverGitSystemConfig() on readPipe() return value.
Fixed process error output randomly mixed with other threads log
messages.
Change-Id: Id882af2762cfb75f010f693b2e1c46eb6968ee82
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Add comments and rename variables in PackWriterBitmapPreparer to improve
readability.
Change-Id: I49e7a1c35307298f7bc32ebfc46ab08e94290125
Signed-off-by: Terry Parker <tparker@google.com>
Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.
To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.
NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.
Change-Id: I94fb481b68c84841c1db1a5ebe678b13e13c962b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.
To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.
NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.
Change-Id: Ie7b65f251ec4452d5a5ed48aa0f272cf49a9aecd
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.
To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.
NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.
Change-Id: I0ebaeb2bc454cd8051b901addb102c1a6688688b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.
To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.
NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.
Change-Id: If53ec94145bae47b74e2561305afe6098012715c
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Expose the following bitmap selection parameters via PackConfig:
"bitmapContiguousCommitCount", "bitmapRecentCommitCount",
"bitmapRecentCommitSpan", "bitmapDistantCommitSpan",
"bitmapExcessiveBranchCount", and "bitmapInactiveBranchAge".
The value of bitmapContiguousCommitCount, whereby bitmaps are
created for the most recent N commits in a branch, has never
been verified. If experiments show that they are not valuable,
then we can simplify the implementation so that there is only
a concept of recent and distant commit history (defined by
"bitmapRecentCommitCount"), and the only controls we need are
"bitmapRecentCommitSpan" and "bitmapDistantCommitSpan".
Change-Id: I288bf3f97d6fbfdfcd5dde2699eff433a7307fb9
Signed-off-by: Terry Parker <tparker@google.com>
Replace the “bitmapCommitRange” parameter that was recently introduced
with two new parameters: “bitmapExcessiveBranchCount” and
“bitmapInactiveBranchAgeInDays”. If the count of branches does not
exceed “bitmapExcessiveBranchCount”, then the current algorithm is kept
for all branches.
If the branch count is excessive, then the commit time for the tip
commit for each branch is used to determine if a branch is “inactive”.
"Active" branches get full commit selection using the existing
algorithm. "Inactive" branches get fewer bitmaps near the branch tips.
Introduce a "contiguousCommitCount" parameter that always enforces that
the N most recent commits in a branch are selected for bitmaps. The
previous nextSelectionDistance() algorithm created anywhere from 1-100
contiguous bitmaps at branch tips.
For example, consider a branch with commits numbering 0-300, with 0
being the most recent commit. If the most recent 200 commits are not
merge commits and the 200th commit was the last one selected,
nextSelectionDistance() returned 100, causing commits 200-101 to be
ignored. Then a window of size 100 was evaluated, searching for merge
commits. Since no merge commits are found, the next commit (commit 0)
was selected, for a total of 1 commit in the topmost 100 commits.
If instead the 250th commit was selected, then by the same logic
commit 50 is selected. At that point nextSelectionDistance() switches to
selecting consecutive commits, so commits 0-50 in the topmost 100
commits are selected. The "contiguousCommitCount" parameter provides
more determinism by always selecting a constant number or topmost
commits.
Add an optimization to break out of the inner loop of selectCommits() if
all of the commits for the current branch have already been found.
When reusing bitmaps from an existing pack, remove unnecessary
populating and clearing of the writeBitmaps/PackBitmapIndexBuilder.
Add comments to PackWriterBitmapPreparer, rename methods and variables
for readability.
Add tests for bitmap selection with and without merge commits and with
excessive branch pruning triggered.
Note: I will follow up with an additional change that exposes the new
parameters through PackConfig.
Change-Id: I5ccbb96c8849f331c302d9f7840e05f9650c4608
Signed-off-by: Terry Parker <tparker@google.com>
Since Git 2.6 wildcard restrictions for refspecs have been loosened:
refspecs like "refs/heads/*foo:refs/heads/foo*" are valid now.
See Git commit 8d3981ccbed9fc211b4e67105015179d9d2a5692
Change-Id: Icb78afbd282c425173b3a7bc10eadc4015689bb8
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Building on top of https://git.eclipse.org/r/#/c/56391/
Here we preserve compatibility with JetS3t
and add 2 new native JGit encryption implementations.
For reference, see connection configuration files:
* Version 0: jgit-s3-connection-v-0.properties
* Version 1: jgit-s3-connection-v-1.properties
* Version 2: jgit-s3-connection-v-2.properties
Change-Id: I713290bcacbe92d88e5ef28ce137de73dd1abe2f
Signed-off-by: Andrei Pozolotin <andrei.pozolotin@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
See previous attempt: https://git.eclipse.org/r/#/c/16674/
Here we preserve as much of JetS3t mode as possible
while allowing to use new Java 8+ PBE algorithms
such as PBEWithHmacSHA512AndAES_256
Summary of changes:
* change pom.xml to control long tests
* add WalkEncryptionTest.launch to run long tests
* add AmazonS3.Keys to to normalize use of constants
* change WalkEncryption to support AES in JetS3t mode
* add WalkEncryptionTest to test remote encryption pipeline
* add support for CI configuration for live Amazon S3 testing
* add log4j based logging for tests in both Eclipse and Maven build
To test locally, check out the review branch, then:
* create amazon test configuration file
* located your home dir: ${user.home}
* named jgit-s3-config.properties
* file format follows AmazonS3 connection settings file:
accesskey = your-amazon-access-key
secretkey = your-amazon-secret-key
test.bucket = your-bucket-for-testing
* finally:
* run in Eclipse: WalkEncryptionTest.launch
* or
* run in Shell: mvn test --define test=WalkEncryptionTest
Change-Id: I6f455fd9fb4eac261ca73d0bec6a4e7dae9f2e91
Signed-off-by: Andrei Pozolotin <andrei.pozolotin@gmail.com>
If the checkout path is currently a non-empty directory (and was a link
or a regular file before), this directory will be removed before
performing checkout, but only if the checkout path is specified.
Bug: 474973
Change-Id: Ifc6c61592d9b54d26c66367163acdebea369145c
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
A bitmap index contains bitmaps for a set of commits in a pack file.
Creating a bitmap for every commit is too expensive, so heuristics
select the most "important" commits. The most recent commits are the
most valuable. To clone a repository only those for the branch tips are
needed. When fetching, only commits since the last fetch are needed.
The commit selection heuristics generally work, but for some
repositories the number of selected commits is prohibitively high. One
example is the MSM 3.10 Linux kernel. With over 1 million commits on
2820 branches, the current heuristics resulted in +36k selected commits.
Each uncompressed bitmap for that repository is ~413k, making it
difficult to complete a GC operation in available memory.
The benefit of creating bitmaps over the entire history of a repository
like the MSM 3.10 Linux kernel isn't clear. For that repository, most
history for the last year appears to be in the last 100k commits.
Limiting bitmap commit selection to just those commits reduces the count
of selected commits from ~36k to ~10.5k. Dropping bitmaps for older
commits does not affect object counting times for clones or for fetches
on clients that are reasonably up-to-date.
This patch defines a new "bitmapCommitRange" PackConfig parameter to
limit the commit selection process when building bitmaps. The range
starts with the most recent commit and walks backwards. A range of 10k
considers only the 10000 most recent commits. A range of zero creates
bitmaps only for branch tips. A range of -1 (the default) does not limit
the range--all commits in the pack are used in the commit selection
process.
Change-Id: Ied92c70cfa0778facc670e0f14a0980bed5e3bfb
Signed-off-by: Terry Parker <tparker@google.com>
On a server also running Gerrit that is using RepoCommand to
convert from an XML manifest to a git submodule superproject
periodically, it would be handy to be able to use Gerrit's
submodule subscription feature[1] to update the superproject
automatically between RepoCommand runs as changes are merged
in each subprojects.
This requires setting the 'branch' field for each submodule
so that Gerrit knows what branch to watch. Add an option to
do that.
Setting the branch field also is useful for plain Git users,
since it allows them to use "git submodule update --remote" to
manually update all submodules between RepoCommand runs.
[1] https://gerrit-review.googlesource.com/Documentation/user-submodules.html
Change-Id: I1a10861bcd0df3b3673fc2d481c8129b2bdac5f9
Signed-off-by: Stefan Beller <sbeller@google.com>
* stable-4.1:
pgm: Open RevWalk and TreeWalk in try-with-resource
ant: Open Repository and Git in try-with-resource
pgm: Create instances of Git in try-with-resource
FanoutBucket: Create ObjectInserter.Formatter in try-with-resource
Fix compiler warnings in DiffFormatter.writeGitLinkText
Change-Id: I448ecc9a1334977d9f304dd61ea20c7a8e692b10
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Clirr doesn't support Java 8 hence use japicmp instead.
See https://github.com/siom79/japicmp
Change-Id: If4b30a6d6aa849b4d6b3b0c900558c609822840c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This patch makes JGit parsing of ignore rules containing unmatched '['
bracket compatible to the Git CLI.
Since '[' starts character group, Git tries to parse the ignore rule as
a shell glob pattern and if the character group is not closed, the glob
pattern is invalid and so the ignore rule never matches anything.
See also http://article.gmane.org/gmane.comp.version-control.git/278699.
Bug: 478490
Change-Id: I734a4d14fcdd721070e3f75d57e33c2c0700d503
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
- Remove declaration of IOException that is no longer thrown
- Add missing //$NON-NLS-1$ to prevent "Non-externalized string literal"
warning.
These warnings seem to have been introduced by If13f7b406.
Change-Id: I30058eed31b92067a6ab22e787732b08e29f8d63
Signed-off-by: David Pursehouse <david.pursehouse@sonymobile.com>
Since 4.0 we require Java 7 so there is no longer a need to override the
following methods in FS_POSIX, FS_Win32, FS_Win32_Cygwin
- lastModified()
- setLastModified()
- length()
- isSymlink()
- exists()
- isDirectory()
- isFile()
- isHidden()
Hence implement these methods in FS and remove overrides in subclasses.
Change-Id: I5dbde6ec806c66c86ac542978918361461021294
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
As discussed on https://git.eclipse.org/r/53836 it does not make sense
to have two similar utility classes in same package with intersecting
functionality. To not break the API, all methods from FileUtil are
copied to FileUtils, all FileUtil API is made deprecated and redirecting
now to FileUtils. Moved simple methods which are available in Java 7 API
are made package private and can be removed at any point later entirely
(right now they are in use).
Bug: 475070
Change-Id: Idffcf9840496c448173af7c052d8898ada68e27b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
SystemReader.isMacOs() and SystemReader.isWindows() return values are
unlikely to change during the JVM lifetime (except tests). Don't read
system properties each time the methods are called, just use previously
calculated value.
Change-Id: I495521f67a8b544e7b7247d99bbd05a42ea16d20
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
An attempt to re-implement not well documented Git CLI behavior for
patterns with backslashes.
It looks like Git silently ignores all \ characters in ignore rules, if
they are NOT covered by 3 cases described in [1]:
{quote}
1) ... Put a backslash ("\") in front of the first hash for patterns
that begin with a hash.
...
2) Trailing spaces are ignored unless they are quoted with backslash
("\").
...
3) Put a backslash ("\") in front of the first "!" for patterns that
begin with a literal "!", for example, "\!important!.txt".
{quote}
Undocumented also is the fact that backslash itself can be escaped by
backslash.
So \h\e\l\l\o\.t\x\t rule matches hello.txt and a\\\\b a\b in Git CLI.
Additionally, the glob parser [2] knows special meaning of backslash:
{quote}
One can remove the special meaning of '?', '*' and '[' by preceding
them by a backslash, or, in case this is part of a shell command
line, enclosing them in quotes. Between brackets these characters
stand for themselves. Thus, "[[?*\]" matches the four characters
'[', '?', '*' and '\'.
{quote}
[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
[2] http://man7.org/linux/man-pages/man7/glob.7.html
Bug: 478065
Change-Id: I3dc973475d1943c5622103701fa8cb3ea0684e3e
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Currently we fail to properly recognize character group if the pattern
before character group contains opening bracket.
See comment from Sebastien Arod on https://git.eclipse.org/r/56678/
Change-Id: I70d3657a2a328818ea2bdc1409d18ecb3a85825b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Current DiffFormat behavior regarding submodules (aka git links) is
incorrect. The "Subproject commit <sha1>" appears as part of the diff
header, rather than as its own hunk.
--> From JGit implementation
diff --git a/plugins/cookbook-plugin b/plugins/cookbook-plugin
index b9d3ca8..ec6ed89 160000
--- a/plugins/cookbook-plugin
+++ b/plugins/cookbook-plugin
-Subproject commit b9d3ca8a65030071e28be19296ba867ab424fbbf
+Subproject commit ec6ed89c47ba7223f82d9cb512926a6c5081343e
--> From C Git 2.5.2
diff --git a/plugins/cookbook-plugin b/plugins/cookbook-plugin
index b9d3ca8..ec6ed89 160000
--- a/plugins/cookbook-plugin
+++ b/plugins/cookbook-plugin
@@ -1 +1 @@
-Subproject commit b9d3ca8a65030071e28be19296ba867ab424fbbf
+Subproject commit ec6ed89c47ba7223f82d9cb512926a6c5081343e
The current way of processing submodules results in no hunk header and
includes the contents of the hunk as part of the headers. To fix this, we
can't just have our writeGitLinkDiffText output the hunk header. We have
to change the flow so that the raw text gets parsed as a diff. The easiest
way to do this is to fake the RawText in the FormatResult when we have a
GITLINK.
It should be noted that it seems possible for there to be a difference
between a GITLINK and a non-GITLINK, but I don't think this can happen in
practice, so I don't think we need to worry too much about it.
This patch also fixes up the test for GitLink headers, as the test was
for the old behavior. My setup has 3 other failing tests which may or
may not be the result of environmental changes. However, the same tests
fail without this commit, so I do not believe they are related.
Bug: 477759
Change-Id: If13f7b406904fad814416c93ed09ea47ef183337
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Ignore rules should escape $^(){}+| chars if using regular expressions,
because they should be treated literally if they aren't part of a
character group.
Bug: 478055
Change-Id: Ic7276442d7f8f02594b85eae1ef697362e62d3bd
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
If a client mistakenly tries to send a tag object as a shallow line
JGit blindly assumes this is a commit and tries to parse the tag
buffer using the commit parser. This can cause an obtuse error like:
InvalidObjectIdException: Invalid id: t c0ff331234...
The "t" comes from the "object c0ff331234..." line of the tag tring
to be parsed as though it where the "tree" line of a commit.
Run any client supplied shallow lines through the RevWalk to lookup
the object types. Fail fast with a protocol exception if any of them
are non-commit.
Skip objects not known to this repository. This matches behavior
with git-core's upload-pack, which sliently skips over any shallow
line object named by the client but not known by the server.
Change-Id: Ic6c57a90a42813164ce65c2244705fc42e84d700
Properties.containsKey() is the correct call here; contains() was testing
if a value is present but the key is what was meant.
Change-Id: Ice72c9f4388583e18cf8aca6e837cc4299fd07fd
When we have a URI that contains an empty path component (that is
it only contains a "/") we want to fall back to the host as
humanish name.
This change is according to the behavior of upstream git, which
falls back on the hostname when guessing directory names for
newly cloned repositories (see [1] for the discussion).
[1] http://article.gmane.org/gmane.comp.version-control.git/274669
Change-Id: I44400c6ab72a2722d2155d53d63671bd867d6c44
Signed-off-by: Patrick Steinhardt <ps@pks.im>
If no refs match the input list and we are writing to a batch,
the returned new commit from write() will match the current commit.
Adding a command to the batch for this case is harmless as it will
succeed, but it's more straightforward to just skip adding a command
in this case.
Add tests or the combination of saving matching refs and saving to a
batch.
Change-Id: I6837389b08e6c80bc2d4c9e9c506d07293ea5fb2
This header was removed unintentionally from some bundles in
3a4a5a4e57. Restore it to ensure lazy
activation of bundles.
Change-Id: I1f841f978fb93278e3ec0533a01f1363510dd976
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In Bug 476164 it was reported that EGit doesn't start when the platform
comes with jsch 0.1.51 while this version of EGit/JGit brings jsch
0.1.53. This could be caused by outdated uses-clauses. Hence recompute
them using PDE tooling.
Bug: 476164
Change-Id: I185ba097884ead9cd034eba842bd3bf34181a99b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
It was reported in [1] that 197e3393a5 led
to a performance regression in a BFG benchmark. Analysis showed that
this is caused by the exists() method in FS_POSIX, now overriding the
default implementation in FS. The default implementation of FS.exists()
uses java.io.File.exists(), while the new implementation in FS_POSIX
uses java.nio.file.Files.exists() - by simply removing the override in
FS_POSIX, performance was restored.
Profiling showed that java.nio.file.Files.exists() is substantially
slower than java.io.File.exists(), to the point where the exists() call
doubles the average cost of a call to
ObjectDirectory.insertUnpackedObject() - which the BFG uses a lot,
because it's rewriting history. Average times measured on Ubuntu were:
java.io.File.exists() - 4 microseconds
java.nio.file.Files.exists() - 60 microseconds
The loose object exists test should be using java.io.File and not FS.
ObjectDirectory uses FS.resolve() to traverse symlinks to objects but
then once inside objects all 256 sharded directories should be real
directories, and the object files should be real files, not dangling
symlinks. java.io.File.exists() is sufficient here, and faster.
Change ObjectDirectory to use File.exists() once its computed the File
handle.
This does mean JGit cannot run ObjectDirectory code on an abstract
virtual filesystem plugged into NIO2. If you really want to run JGit on
an esoteric non-standard filesystem like "in memory" you should look at
the DFS storage backend, which has fewer abstraction points to deal
with. Or write your own from scratch.
[1] https://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02954.html
Change-Id: I74684dc3957ae1ca52a7097f83a6c420aa24310f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
- add @since annotation for new API method
- silence non-externalized String warning
Change-Id: I864176ced64e9569e7f2cdf8f777720655bfc578
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
On a local filesystem the packed-refs file will be orphaned if it is
replaced by another client while the current client is reading the old
one. However, since NFS servers do not keep track of open files, instead
of orphaning the old packed-refs file, such a replacement will cause the
old file to be garbage collected instead. A stale file handle exception
will be raised on NFS servers if the file is garbage collected (deleted)
on the server while it is being read. Since we no longer have access to
the old file in these cases, the previous code would just fail. However,
in these cases, reopening the file and rereading it will succeed (since
it will reopen the new replacement file). So retrying the read is a
viable strategy to deal with stale file handles on the packed-refs file,
implement such a strategy.
Since it is possible that the packed-refs file could be replaced again
while rereading it (multiple consecutive updates can easily occur with
ref deletions), loop on stale file handle exceptions, up to 5 extra
times, trying to read the packed-refs file again, until we either read
the new file, or find that the file no longer exists. The limit of 5 is
arbitrary, and provides a safe upper bounds to prevent infinite loops
consuming resources in a potential unforeseen persistent error
condition.
Change-Id: I085c472bafa6e2f32f610a33ddc8368bb4ab1814
Signed-off-by: Martin Fick<mfick@codeaurora.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a public API to the FileUtils to determine if an IOException is a
stale NFS file handle exception. This will make it easier to detect
such errors, and interpret them consistently throughout the codebase.
This new API is a bit more lenient in its detection than the previous
detection, and should be able to detect some errors which previously
were not identified as stale file handle exceptions because they had the
word NFS in the error message. Adjust the packfile handling code to use
this new API for detection.
Change-Id: I21f80014546ba1afec7335890e5ae79e7f521412
Signed-off-by: Martin Fick<mfick@codeaurora.org>
Signaling the need to flush() only via the interrupted status of a
copying thread doesn't work realiably with jsch. The write() method of
com.jcraft.jsch.Session catches the InterruptedException in several
places. As a result StreamCopyThread can easily miss the interrupt if it
was interrupted during the dst.write() or dst.flush() call. When it
happens, StreamCopyThread will not send some data to the remote side and
will not get the response back, because remote side will wait for more
data from us.
The flushCount field incremented during flush() method guarantees we
don't miss flush() even if jsch catches InterruptedException in
dst.write() or dst.flush() calls.
Checking the flushCount after dst.write() is needed because dst.write()
can clear interrupt status, in this case the next blocking src.read()
won't throw an exception and we will not call flush().
Flush is performed only after src.read() was blocked and thrown an
InterruptedIOException exception, this guarantees that we flush all the
data available in src so far (src.read() doesn't block while more is
available).
FlushCount is reset to 0 only when there were no flush() calls since
last blocked read, that means we flushed all data available in src. If
there were flush() calls, the interrupt status is restored, so next
blocked read will throw InterruptedException and we will flush()
again.
Change-Id: I692b226edaff502f06235ec05da9052b5fe6478a
Signed-off-by: Dmitry Neverov <dmitry.neverov@gmail.com>
There should be no functional change, the logic updated only to make
code simple so that compiler can understand what is going for. Removed
all @SuppressWarnings("null") annotations since they cannot be used if
"org.eclipse.jdt.core.compiler.problem.potentialNullReference" option is
set to the "error" level.
Bug: 470647
Change-Id: Ie93c249fa46e792198d362e531d5cbabaf41fdc4
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
- use NIO2's Files.move() to reimplement rename()
- provide a second method accepting CopyOptions which can be used to
request atomic move.
Change-Id: Ibcf722978e65745218a1ccda45344ca295911659
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Root commits are commits with zero parents. If a commmit has no
parents it is the first commit in the repository. In general the root
commits should be unique for any given project, as the first commit
will be created at a different time, by a different user with its own
message. These root commits can be used as a "fingerprint" to
identify disjoint histories.
Change-Id: Id891dbc1f17c816cea404569578bb7635ff85cdb
The FS class and the subclasses FS_POSIX assumed in the findHook()
method that every repository has a valid gitDir. But in tests when using
in-memory-repositories this is not true and this method was generating
NPEs. Change the method to return null if no repository directory can be
determined.
Change-Id: I38a4d36dc6452b5dacae3d0dbf562b569ca3c19b
DfsGarbageCollector asks PackWriter for the set of objects packed
after the bitmap index is written out. This is now null as it was
cleared to release memory. Instead use PackBitmapIndexBuilder to
build the set as it also has the objects.
Reduce memory in PackBitmapIndexBuilder by fully discarding the
ObjectToPack instances. This was the original intent of commit
4bb523475d ("PackWriter: shed memory while creating bitmaps")
but failed as the instances were still held live here.
Switch to BlockList instead of ObjectToPack[]. This allows the
JVM to allocate many smaller arrays instead of one contiguous
array with 5.2M reference pointers. In a tight heap the smaller
allocations are more feasible.
Reduce the initial EWAHCompressedBitmaps for the 4 type maps. On
average a typical repository is 30% commits, 30% trees and 30% blobs.
These bitmaps are typically very dense. PackWriter orders objects by
commit, tree, blob when writing the file so these should always be a
very dense run of 1s with some 0s before and after. So even the 1/3rd
allocation is likely too large, but the later trim() will reduce the
internal buffer anyway.
Change-Id: If0b80a31cb00894f1485ff8f53ef7ae5a759a046
After jgit moved to Java 7 there is no need in an extra
Java7BasicAttributes class. Also all fields of Attributes can be made
final now.
Change-Id: I0be6daf7758189b0eecc4e26294bd278ed8bf7a0
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Once bitmap creation begins the internal maps required for packing are
no longer necessary. On a repository with 5.2M objects this can save
more than 438 MiB of memory by allowing the ObjectToPack instances to
get garbage collected away.
Downside is the PackWriter cannot be used for any further opertions
except to write the bitmap index. This is an acceptable trade-off as
in practice nobody uses the PackWriter after the bitmaps are built.
Change-Id: Ibfaf84b22fa0590896a398ff659a91fcf03d7128
For construction performance each new EWAHBitmap is allocated at the
roughly worst-case size the bitmap would need if all of the words must
be literal and no run length compression is available. In practice
this is far larger than is required, wasting heap memory while the
bitmaps are computed.
Trim down each bitmap to its minimum required size. This copies the
internal array to a new smaller array, allowing the GC to reclaim the
prior larger array for reuse.
A single bitmap of 5.2M bits is only 79 KiB of memory without this
trim call but 15,000 such bitmaps is 1.1 GiB. Trimming can help fit
a larger number of bitmaps during processing.
Change-Id: I2bd19a786189db5b01c4c96f209b83de50e10c3b
The bitmap preparer only needs commit graph topology; it does not use
the message body. Allow the RevWalk to free the body after the commit
has been parsed to save memory.
Change-Id: I97d4a440c9fc313873fd224bd05b9d9e3dc575db
The WorkingTreeIterator.isEntryIgnored() should use originally requested
file mode while descending to the file tree root and checking ignore
rules. Original code asking isEntryIgnored() on a file was using
directory mode instead if the .gitignore was not located in the same
directory.
Bug: 473506
Change-Id: I9f16ba714c3ea9e6585e9c11623270dbdf4fb1df
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
When during Merge for a certain path OURS & BASE contains a file and
THEIRS contains a folder there was a bug in JGit leading to unnecessary
conflicts. This commit fixes it and adds a test for this situation.
Bug: 472693
Change-Id: I71fac5a6a2ef926c01adc266c6f9b3275e870129
Also-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
According to [1] leading spaces are allowed in ignore rules and trailing
spaces are allowed too if they are escaped via backslash.
[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
Bug: 472762
Change-Id: I5e3ae5599cb9e5d80072f38c82c20cbc9475a18a
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Catch unexpected PatternSyntaxException and convert it to
InvalidPatternException. Log such errors, do not silently ignore them.
Bug: 463581
Change-Id: Id0936d9816769ec0cfae1898beda0f7a3c146e67
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
According to [1] backslash can escape leading special characters '#' and
'!' in ignore rules, so that they are treated literally.
[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
Bug: 463581
Change-Id: I4c02927413a9c63ea5dbf2954877080d902ec1b2
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
While checking if we should consider an ignore rule without '[]'
brackets as a regular expression, check if the backslash escapes one of
the glob special characters '?', '*', '[', '\\'. If not, backslash is
not a part of a regex and should be treated literally.
Bug: 463581
Change-Id: I85208c7f85246fbf6c5029ce3c8b7bb8f4dbd947
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Rather than lazily parsing the push in this method, parse it at the
end of recvCommands(), which already contains the necessary try/catch
for handling this error. This allows later callers to avoid having to
handle this condition superfluously.
Change-Id: I5dcaf1a44bf4e321adc281e3381e7e17ac89db06
Consider a BatchRefUpdate produced by Gerrit Code Review, where the
original command pushed over the wire might refer to
"refs/for/master", but that command is ignored and replaced with some
additional commands like creating "refs/changes/34/1234/1". We do not
want to store the cert in "refs/for/master@{cert}", since that may
lead someone looking to the ref to the incorrect conclusion that that
ref exists.
Add a separate put method that takes a collection of commands, and
only stores certs on those refs that have a matching command in the
cert.
Change-Id: I4661bfe2ead28a2883b33a4e3dfe579b3157d68a
Inspired by a proposal from gitolite[1], where we store a file in
a tree for each ref name, and the contents of the file is the latest
push cert to affect that ref.
The main modification from that proposal (other than lacking the
out-of-git batching) is to append "@{cert}" to filenames, which allows
storing certificates for both refs/foo and refs/foo/bar. Those
refnames cannot coexist at the same time in a repository, but we do
not want to discard the push certificate responsible for deleting the
ref, which we would have to do if refs/foo in the push cert tree
changed from a tree to a blob.
The "@{cert}" syntax is at least somewhat consistent with
gitrevisions(7) wherein @{...} describe operators on ref names.
As we cannot (currently) atomically update the push cert ref with the
refs that were updated, this operation is inherently racy. Kick the can
down the road by pushing this burden on callers.
[1] cf062b8bb6/contrib/hooks/repo-specific/save-push-signatures
Change-Id: Id3eb32416f969fba4b5e4d9c4b47053c564b0ccd
This will allow us to write the super project in a branch other than
master.
Change-Id: I578ed9ecbc6423416239e31ad644531dae9fb5c3
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
Primarily to aid in writing end-to-end tests, which must be written
outside this package as we don't currently have a Bouncy Castle
dependency.
Change-Id: Ia24179773140d9ecc616b3a6b2e630c85d800157
We have done this since forever with the "wanted old new ref" error,
so let's do it for other such errors thrown in the same block as well.
Change-Id: Ib3b1c7f05e31a5b3e40e85eb07b16736920a033b
When pushing to an HTTP server using the C git client, I observed a
certificate lacking a pushee field. Handle this gracefully in the
parser.
Change-Id: I7f3c5fa78f2e35172a93180036e679687415cac4
We intend to store received push certificates somewhere, like a
particular ref in the repository in question. For reading data back
out, it will be useful to read push certificates (without pkt-line
framing) in a streaming fashion.
Change-Id: I70de313b1ae463407b69505caee63e8f4e057ed4
Discussion on the git mailing list has concluded[1] that the intended
behavior for all (non-sideband) portions of the receive-pack protocol
is for trailing LFs in pkt-lines to be optional. Go back to using
PacketLineIn#readString() everywhere.
For push certificates specifically, we agreed that the payload signed
by the client is always concatenated with LFs even though the client
MAY omit LFs when framing the certificate for the wire. This is still
reflected in the implementation of PushCertificate#toText().
[1] http://thread.gmane.org/gmane.comp.version-control.git/273175/focus=273412
Change-Id: I817231c4d4defececb8722142fea18ff42e06e44
In most texts we use "cannot" hence instead of escaping the apostroph in
"can't" use "cannot".
Bug: 471796
Change-Id: Icda5b4db38076789d06498428909306aef3cb68b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
ObjectId.fromString already throws InvalidObjectIdException for most
malformed object ids, but for this kind it previously threw
IllegalArgumentException. Since InvalidObjectIdException is a child of
IllegalArgumentException, callers that catch IllegalArgumentException
will continue to work.
Change-Id: I24e1422d51607c86a1cb816a495703279e461f01
Signed-off-by: Jonathan Nieder <jrn@google.com>
The invalidId message in JGitText and the asAscii bad id both contain a
colon, so the resulting message would say
Invalid id: : a78987c98798ufa
Fix it by keeping the colon in the translated message and not adding
another colon programmatically.
Noticed by code inspection.
Change-Id: I13972eebde27a4128828e6c64517666f0ba6288b
Signed-off-by: Jonathan Nieder <jrn@google.com>
Add methods that allow to unregister repositories from the
RepositoryCache individually.
Bug: 470234
Change-Id: Ib918a634d829c9898072ae7bdeb22b099a32b1c9
Signed-off-by: Tobias Oberlies <tobias.oberlies@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
For loose objects an expiration date can be set which will save too
young objects from being deleted. Add the same for packfiles. Packfiles
which are too young are not deleted.
Bug: 468024
Change-Id: I3956411d19b47aaadc215dab360d57fa6c24635e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This may be used by e.g. a custom reflog implementation to record
this information along with the ref update.
Change-Id: I44adbfad704b76f9c1beced6e1ce82eaf71410d2
These differ subtly from a PersonIdent, because they can contain
anything that is a valid User ID passed to gpg --local-user. Upstream
git push --signed will just take the configuration value from
user.signingkey and pass that verbatim in both --local-user and the
pusher field of the certificate. This does not necessarily contain an
email address, which means the parsing implementation ends up being
substantially different from RawParseUtils.parsePersonIdent.
Nonetheless, we try hard to match PersonIdent behavior in
questionable cases.
Change-Id: I37714ce7372ccf554b24ddbff56aa61f0b19cbae
The signature is intended to be passed to a verification library such
as Bouncy Castle, which expects these lines to be present in order to
parse the signature.
Change-Id: I22097bead2746da5fc53419f79761cafd5c31c3b
This is the subclass of IOException already thrown by
BaseReceivePack#recvCommands when encountering an invalid value on
the wire. That's what PushCertificateParser is doing too, so use the
same subclass.
Change-Id: I1d323909ffe70757ea56e511556080695b1a0c11
The default behavior is to read a repository's signed push
configuration from that repo's config file, but this is not very
flexible when it comes to managing groups of repositories (e.g. with
Gerrit). Allow callers to override the configuration using a POJO.
Change-Id: Ib8f33e75daa0b2fbd000a2c4558c01c014ab1ce5
Add a missing continues to prevent falling through to the command
parsing section. The first continue happens when the command list is
empty, so change the condition to see whether we have read the first
line, rather than any commands.
Fix comparison to BEGIN_SIGNATURE to use raw line with newline.
Change-Id: If3d92f5ceade8ba7605847a4b2bc55ff17d119ac
a85e817d is a slightly breaking API change to classes that were
technically public and technically released in 4.0. However, it is
highly unlikely that people were actually depending on public behavior,
since there were no public methods to create PushCertificates with
anything other than null field values, or a PushCertificateParser that
did anything other than infinite loop or throw exceptions when reading.
Change-Id: I1d0ba9ea0a347e8ff5a0f4af169d9bb18c5838d2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
UploadPackLogger is incorrectly named--it can be used to trigger any
post upload action, such as GC/compaction. This change introduces
PostUploadHook/PostUploadHookChain to replace
UploadPackLogger/UploadPackLoggerChain and deprecates the latter.
It also introduces PackStatistics as a replacement for
PackWriter.Statistics, since the latter is not public API.
It changes PackWriter to use PackStatistics and reimplements
PackWriter.Statistics to delegate to PackStatistics.
Change-Id: Ic51df1613e471f568ffee25ae67e118425b38986
Signed-off-by: Terry Parker <tparker@google.com>
This method is documented to take a branch name (not a possibly null
string). The only way a caller could have set null without either
re-setting to a sane value afterward or producing NullPointerException
was to also call setNoCheckout(true), in which case there would have
been no reason to set the branch in the first place.
Make setBranch(null) request the default behavior (remote's default
branch) instead, imitating C git's clone --no-branch.
Change-Id: I960e7046b8d5b5bc75c7f3688f3a075d3a951b00
Signed-off-by: Jonathan Nieder <jrn@google.com>
A non-bare clone command with null remote produces a
NullPointerException when trying to produce a refspec to fetch against.
In a bare repository, a null remote name is accepted by mistake,
producing a configuration with items like 'remote.url' instead of
'remote.<remote>.url'. This was never meant to work.
Instead, let's make setRemote(null) undo any previous setRemote calls
and re-set the remote name to DEFAULT_REMOTE, imitating C git clone's
--no-origin option.
While we're here, add some tests for setRemote working normally.
Change-Id: I76f502da5e677df501d3ef387e7f61f42a7ca238
Signed-off-by: Jonathan Nieder <jrn@google.com>
call() throws InvalidRemoteException if uri == null, so there should
never be reason to leave the URI set to null. Document this.
Change-Id: I7f2cdbe8042d99cf8a3c1a8c4c2dcb58c5b8c305
Signed-off-by: Jonathan Nieder <jrn@google.com>
These commands' monitor fields can never be null unless someone passes
null to setProgressMonitor. Anyone passing null probably meant to
disable the ProgressMonitor, so do that (by falling back to
NullProgressMonitor.INSTANCE) instead of saving a null and eventually
producing NullPointerException.
Change-Id: I63ad93ea8ad669fd333a5fd40880e7583ba24827
Signed-off-by: Jonathan Nieder <jrn@google.com>
A project's name attribute is required according to repo's
doc/manifest-format.txt:
<!ELEMENT project (annotation*,
project*)>
<!ATTLIST project name CDATA #REQUIRED>
Enforcing this in code makes reading easier (in particular making it
clearer that getName() and getPath() can never return null).
Change-Id: I8c7014dd6042183d7fecb2202af5acdc384aa8e4
Signed-off-by: Jonathan Nieder <jrn@google.com>
- Consistently return structured data, such as actual ReceiveCommands,
which is more useful for callers that are doing things other than
verifying the signature, e.g. recording the set of commands.
- Store the certificate version field, as this is required to be part
of the signed payload.
- Add a toText() method to recreate the actual payload for signature
verification. This requires keeping track of the un-chomped command
strings from the original protocol stream.
- Separate the parser from the certificate itself, so the actual
PushCertificate object can be immutable. Make a fair attempt at deep
immutability, but this is not possible with the current mutable
ReceiveCommand structure.
- Use more detailed error messages that don't involve NON-NLS strings.
- Document null return values more thoroughly. Instead of having the
undocumented behavior of throwing NPE from certain methods if they
are not first guarded by enabled(), eliminate enabled() and return
null from those methods.
- Add tests for parsing a push cert from a section of pkt-line stream
using a real live stream captured with Wireshark (which, it should
be noted, uncovered several simply incorrect statements in C git's
Documentation/technical/pack-protocol.txt).
This is a slightly breaking API change to classes that were
technically public and technically released in 4.0. However, it is
highly unlikely that people were actually depending on public
behavior, since there were no public methods to create
PushCertificates with anything other than null field values, or a
PushCertificateParser that did anything other than infinite loop or
throw exceptions when reading.
Change-Id: I5382193347a8eb1811032d9b32af9651871372d0
C git's receive-pack.c strips trailing newlines in command lists when
present[1], although send-pack.c does not send them, at least in the
case of command lists[2]. Change JGit to match this behavior.
Add tests.
This also fixes parsing of commands in the push cert, which, unlike
commands sent in the non-push case, always have trailing newlines.
[1] 7974889a05/builtin/receive-pack.c (L1380)
where packet_read_line chomps newlines:
7974889a05/pkt-line.c (L202)
[2] 7974889a05/send-pack.c (L470)
Change-Id: I4bca6342a7482a53c9a5815a94b3c181a479d04b
The new submodule layout where GITDIR of a submodule is located at
<parent-repo-GITDIR>/modules/<submodule-path> was only used during
clone. Teach SubmoduleAddCommand to use the new layout.
Bug: 469666
Change-Id: Ie97dc0607b71499560444616f362bccee9cce515
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* changes:
Allow setting detail message and cause when constructing most exceptions
Use message from ServiceNotAuthorizedException, ServiceNotEnabledException
dumb HTTP: Clarify AsIsFilter by introducing req and res locals
Clarify description of ServiceNotAuthorizedException
Move ObjectCountCallback and WriteAbortedException to package
org.eclipse.jgit.transport, so that they'll become public API.
Change-Id: I95e3cfaa49f3f7371e794d5c253cf6981f87cae0
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
This reverts commit 96eb3ee397, which
broke Gerrit tests that set a config value to 'null', serialize the
result, deserialize, and expect 'null' from Config.getString[1].
The intent of that commit was to make it possible to distinguish between
an absent and an empty config value, which we'll have to do with a new
method.
Revert the behavior change. Keep the tests from 428cb23f2de8, since
they test the behavior more precisely than the old tests did.
[1] https://gerrit-review.googlesource.com/68452
Change-Id: Ie8042f380ea0e34e3203e1991aa0feb2e6e44641
Signed-off-by: Jonathan Nieder <jrn@google.com>
Added callback in PackWriter and BundleWriter for the caller to get the
count of objects to write, and a chance to abort the write operation.
Change-Id: I1baeedcc6946b1093652de4a707fe597a577e526
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
C git 2.5 supports setting the equivalent of
RequestPolicy.REACHABLE_COMMIT with uploadpack.allowreachablesha1inwant.
Parse this into TransportConfig and use it from UploadPack. An explicitly
set RequestPolicy overrides the config, and the policy may still be
upgraded on a unidirectional connection to avoid races.
Change-Id: Id39771a6e42d8082099acde11249306828a053c0
Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com>
Change-Id: I351cee0ba2e77e3360846ac0c5368da3a322725c
Reported-by: Markus Keller <markus_keller@ch.ibm.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>