Terry Parker d385a7a5e5 Shallow fetch: Respect "shallow" lines
When fetching from a shallow clone, the client sends "have" lines
to tell the server about objects it already has and "shallow" lines
to tell where its local history terminates. In some circumstances,
the server fails to honor the shallow lines and fails to return
objects that the client needs.

UploadPack passes the "have" lines to PackWriter so PackWriter can
omit them from the generated pack. UploadPack processes "shallow"
lines by calling RevWalk.assumeShallow() with the set of shallow
commits. RevWalk creates and caches RevCommits for these shallow
commits, clearing out their parents. That way, walks correctly
terminate at the shallow commits instead of assuming the client has
history going back behind them. UploadPack converts its RevWalk to an
ObjectWalk, maintaining the cached RevCommits, and passes it to
PackWriter.

Unfortunately, to support shallow fetches the PackWriter does the
following:

  if (shallowPack && !(walk instanceof DepthWalk.ObjectWalk))
    walk = new DepthWalk.ObjectWalk(reader, depth);

That is, when the client sends a "deepen" line (fetch --depth=<n>)
and the caller has not passed in a DepthWalk.ObjectWalk, PackWriter
throws away the RevWalk that was passed in and makes a new one. The
cleared parent lists prepared by RevWalk.assumeShallow() are lost.
Fortunately UploadPack intends to pass in a DepthWalk.ObjectWalk.
It tries to create it by calling toObjectWalkWithSameObjects() on
a DepthWalk.RevWalk. But it doesn't work: because DepthWalk.RevWalk
does not override the standard RevWalk#toObjectWalkWithSameObjects
implementation, the result is a plain ObjectWalk instead of an
instance of DepthWalk.ObjectWalk.
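
The type mismatch described above can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins, not the JGit classes: BaseWalk plays the role of RevWalk, BaseObjectWalk the role of ObjectWalk, and toObjectWalk() the role of toObjectWalkWithSameObjects(). The sketch only shows why an inherited conversion method yields the base type and therefore fails an instanceof check like the one in PackWriter; the overriding variant is one possible remedy, shown as an assumption rather than as the actual change.

```java
public class WalkConversionSketch {
    // Hypothetical stand-in for RevWalk.
    static class BaseWalk {
        // Stand-in for toObjectWalkWithSameObjects(): returns the plain type.
        BaseObjectWalk toObjectWalk() {
            return new BaseObjectWalk();
        }
    }

    // Hypothetical stand-in for ObjectWalk.
    static class BaseObjectWalk extends BaseWalk {
    }

    // Hypothetical stand-in for DepthWalk.ObjectWalk.
    static class DepthObjectWalk extends BaseObjectWalk {
        final int depth;

        DepthObjectWalk(int depth) {
            this.depth = depth;
        }
    }

    // Depth-limited walk that inherits the base conversion unchanged.
    static class DepthWalkWithoutOverride extends BaseWalk {
        final int depth;

        DepthWalkWithoutOverride(int depth) {
            this.depth = depth;
        }
    }

    // Same walk, but the conversion is overridden to keep the depth-aware type.
    static class DepthWalkWithOverride extends BaseWalk {
        final int depth;

        DepthWalkWithOverride(int depth) {
            this.depth = depth;
        }

        @Override
        DepthObjectWalk toObjectWalk() {
            return new DepthObjectWalk(depth);
        }
    }

    // Mirrors the shape of the check in PackWriter: is the converted walk
    // still depth-aware, or would it be thrown away and replaced?
    static boolean survivesInstanceofCheck(BaseWalk walk) {
        return walk.toObjectWalk() instanceof DepthObjectWalk;
    }

    public static void main(String[] args) {
        System.out.println(survivesInstanceofCheck(new DepthWalkWithoutOverride(1)));
        System.out.println(survivesInstanceofCheck(new DepthWalkWithOverride(1)));
    }
}
```

In this sketch the walk without the override fails the check, which corresponds to the case where PackWriter discards the walk it was given along with any cached state.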

The result is that the "shallow" information is thrown away and
objects reachable from the shallow commits can be omitted from the
pack sent when fetching with --depth from a shallow clone.

Multiple factors collude to limit the circumstances under which this
bug can be observed:

1. Commits with depth != 0 don't enter DepthGenerator's pending queue.
   That means a "have" cannot have any effect on DepthGenerator unless
   it is also a "want".

2. DepthGenerator#next() doesn't call carryFlagsImpl(), so the
   uninteresting flag is not propagated to ancestors there even if a
   "have" is also a "want".

3. JGit treats a depth of 1 as "1 past the wants".

Because of (2), the only place the UNINTERESTING flag can leak to a
shallow commit's parents is in the carryFlags() call from
markUninteresting(). carryFlags() only traverses commits that have
already been parsed: commits yet to be parsed are supposed to inherit
correct flags from their parent in PendingGenerator#next (which
doesn't happen here --- that is (2)). So the list of commits that have
already been parsed becomes relevant.

When we hit the markUninteresting() call, all "want"s, "have"s, and
commits to be unshallowed have been parsed. carryFlags() only
affects the parsed commits. If the "want" is a direct parent of a
"have", then carryFlags() marks it as uninteresting. If the "have"
was also a "shallow", then its parent pointer should have been null
and the "want" shouldn't have been marked, so we see the bug. If the
"want" is a more distant ancestor then (2) keeps the uninteresting
state from propagating to the "want" and we don't see the bug. If the
"shallow" is not also a "have" then the shallow commit isn't parsed
so (2) keeps the uninteresting state from propagating to the "want",
so we don't see the bug.

Here is a reproduction case (time flowing left to right, arrows
pointing to parents). "C" must be a commit that the client
reports as a "have" during negotiation. That can only happen if the
server reports it as an existing branch or tag in the first round of
negotiation:

  A <-- B <-- C <-- D

First do

  git clone --depth 1 <repo>

which yields D as a "have" and C as a "shallow" commit. Then try

  git fetch --depth 1 <repo> B:refs/heads/B

Negotiation sets up: have D, shallow C, have C, want B.
But due to this bug B is marked as uninteresting and is not sent.
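
The negotiation above can be modeled with a toy commit graph. This is a simplified sketch, not JGit code: Commit, buildHistory, markUninteresting, and wantIsSent are hypothetical names, and flag propagation is reduced to a single step from a "have" to its direct parents, which is the only case the description above says triggers the bug. The respectShallow switch stands in for whether the cleared parent lists set up for the shallow commits are still in effect when the flags are carried.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ShallowFlagSketch {
    // Toy commit: a name, parent links, and an "uninteresting" flag.
    static final class Commit {
        final String name;
        final List<Commit> parents = new ArrayList<>();
        boolean uninteresting;

        Commit(String name) {
            this.name = name;
        }
    }

    // Builds A <-- B <-- C <-- D. When respectShallow is true, commits named
    // in `shallow` get no parent link, modeling a cleared parent list.
    static Map<String, Commit> buildHistory(Set<String> shallow, boolean respectShallow) {
        Map<String, Commit> commits = new HashMap<>();
        Commit previous = null;
        for (String name : new String[] {"A", "B", "C", "D"}) {
            Commit c = new Commit(name);
            boolean cutHere = respectShallow && shallow.contains(name);
            if (previous != null && !cutHere) {
                c.parents.add(previous);
            }
            commits.put(name, c);
            previous = c;
        }
        return commits;
    }

    // Simplified flag carrying: mark the "have" and its direct parents only.
    static void markUninteresting(Commit have) {
        have.uninteresting = true;
        for (Commit parent : have.parents) {
            parent.uninteresting = true;
        }
    }

    // Negotiation from the reproduction case: have D, shallow C, have C, want B.
    static boolean wantIsSent(boolean respectShallow) {
        Map<String, Commit> commits = buildHistory(Collections.singleton("C"), respectShallow);
        markUninteresting(commits.get("D"));
        markUninteresting(commits.get("C"));
        return !commits.get("B").uninteresting;
    }

    public static void main(String[] args) {
        System.out.println("shallow lines lost:      B sent = " + wantIsSent(false));
        System.out.println("shallow lines respected: B sent = " + wantIsSent(true));
    }
}
```

In this model B is flagged as uninteresting only when C still points at it, which matches the failure described for the case where the shallow information is thrown away.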

Change-Id: I6e14b57b2f85e52d28cdcf356df647870f475440
Signed-off-by: Terry Parker <tparker@google.com>
2016-08-05 18:37:36 -04:00
lib Support LFS protocol and a file system based LFS storage 2016-02-04 17:49:43 +01:00
org.eclipse.jgit Shallow fetch: Respect "shallow" lines 2016-08-05 18:37:36 -04:00
org.eclipse.jgit.ant Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.ant.test Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.archive Archive: Make project name consistent with other subprojects' 2016-07-26 18:58:17 +09:00
org.eclipse.jgit.http.apache Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.http.server Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.http.test Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.junit Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.junit.http Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.lfs Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.lfs.server Fix typo in email address in copyright headers 2016-07-28 16:05:22 +09:00
org.eclipse.jgit.lfs.server.test Fix typo in email address in copyright headers 2016-07-28 16:05:22 +09:00
org.eclipse.jgit.lfs.test Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.packaging Remove duplicate LFS feature from P2 repository 2016-07-05 00:00:22 +02:00
org.eclipse.jgit.pgm LfsProtocolServlet: Pass request and path to getLargeFileRepository 2016-07-27 15:54:03 +09:00
org.eclipse.jgit.pgm.test Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
org.eclipse.jgit.test Shallow fetch: Respect "shallow" lines 2016-08-05 18:37:36 -04:00
org.eclipse.jgit.ui Ignore 'The value of exception parameter is not used' warning 2016-07-26 10:16:49 +09:00
tools Run Maven build in release.sh concurrently to speedup release 2016-05-04 17:34:35 +02:00
.buckconfig Buck: Simplify root build file 2016-02-14 11:45:30 +01:00
.buckversion Update buck to e64a2e2ada022f81e42be750b774024469551398 2016-04-21 16:31:24 +09:00
.gitattributes Initial JGit contribution to eclipse.org 2009-09-29 16:47:03 -07:00
.gitignore Add .buckd to .gitignore 2016-04-19 09:40:55 -04:00
.mailmap Update .mailmap 2016-02-12 12:39:06 +09:00
BUCK Buck: Simplify root build file 2016-02-14 11:45:30 +01:00
CONTRIBUTING.md Update SUBMITTING_PATCHES 2014-07-20 17:44:53 -04:00
LICENSE Clean up LICENSE file 2010-07-02 14:52:49 -07:00
README.md FS: Remove the gitprefix logic 2015-05-22 09:37:35 +02:00
pom.xml JGit v4.4.1.201607150455-r 2016-07-15 10:54:36 +02:00

README.md

Java Git

An implementation of the Git version control system in pure Java.

This package is licensed under the EDL (Eclipse Distribution License).

JGit can be imported straight into Eclipse, built and tested from there, but the automated builds use Maven.

  • org.eclipse.jgit

    A pure Java library capable of being run standalone, with no additional support libraries. It provides classes to read and write a Git repository and operate on a working directory.

    All portions of JGit are covered by the EDL. Absolutely no GPL, LGPL or EPL contributions are accepted within this package.

  • org.eclipse.jgit.java7

    Extensions for users of Java 7.

  • org.eclipse.jgit.ant

    Ant tasks based on JGit.

  • org.eclipse.jgit.archive

    Support for exporting to various archive formats (zip etc).

  • org.eclipse.jgit.http.apache

    Apache httpclient support

  • org.eclipse.jgit.http.server

    Server for the smart and dumb Git HTTP protocol.

  • org.eclipse.jgit.pgm

    Command-line interface Git commands implemented using JGit ("pgm" stands for program).

  • org.eclipse.jgit.packaging

    Production of Eclipse features and p2 repository for JGit. See the JGit Wiki on why and how to use this module.

Tests

  • org.eclipse.jgit.junit

    Helpers for unit testing

  • org.eclipse.jgit.test

    Unit tests for org.eclipse.jgit

  • org.eclipse.jgit.java7.test

    Unit tests for Java 7 specific features

  • org.eclipse.jgit.ant.test

  • org.eclipse.jgit.pgm.test

  • org.eclipse.jgit.http.test

  • org.eclipse.jgit.junit.test

    No further description needed

Warnings/Caveats

  • Native symbolic links are supported, but only if you are using Java 7 or newer and include the org.eclipse.jgit.java7 jar/bundle in the classpath, provided the file system supports them. For Windows you must have Windows Vista/Windows 2008 or newer, use a non-administrator account and have the SeCreateSymbolicLinkPrivilege.

  • Only the timestamp of the index is used by jgit if the index is dirty.

  • JGit requires at least a Java 7 JDK.

  • CRLF conversion is performed depending on the core.autocrlf setting. However, Git for Windows by default stores that setting during installation in the "system wide" configuration file. If Git is not installed, use the global or repository configuration for the core.autocrlf setting.

  • The system wide configuration file is located relative to where C Git is installed. Make sure Git can be found via the PATH environment variable. When installing Git for Windows check the "Run Git from the Windows Command Prompt" option. There are other options like Eclipse settings that can be used for pointing out where C Git is installed. Modifying PATH is the recommended option if C Git is installed.

  • We try to use the same notion of $HOME as C Git does. On Windows this is often not the same value as the user.home system property.

Package Features

  • org.eclipse.jgit/

    • Read loose and packed commits, trees, blobs, including deltafied objects.

    • Read objects from shared repositories

    • Write loose commits, trees, blobs.

    • Write blobs from local files or Java InputStreams.

    • Read blobs as Java InputStreams.

    • Copy trees to local directory, or local directory to a tree.

    • Lazily loads objects as necessary.

    • Read and write .git/config files.

    • Create a new repository.

    • Read and write refs, including walking through symrefs.

    • Read, update and write the Git index.

    • Checkout in dirty working directory if trivial.

    • Walk the history from a given set of commits looking for commits introducing changes in files under a specified path.

    • Object transport: Fetch via ssh, git, http, Amazon S3 and bundles. Push via ssh, git and Amazon S3. JGit does not yet deltify the pushed packs so they may be a lot larger than C Git packs.

    • Garbage collection

    • Merge

    • Rebase

    • And much more

  • org.eclipse.jgit.pgm/

    • Assorted set of command line utilities. Mostly for ad-hoc testing of jgit log, glog, fetch etc.
  • org.eclipse.jgit.java7/

    • Support for symbolic links.

    • Optimizations for reading file system attributes

  • org.eclipse.jgit.ant/

    • Ant tasks
  • org.eclipse.jgit.archive/

    • Support for Zip/Tar and other formats
  • org.eclipse.jgit.http.*/

    • HTTP client and server support

Missing Features

There are some missing features:

  • gitattributes support

Support

Post questions, comments, or patches to the jgit-dev@eclipse.org mailing list. You need to be subscribed to post; see here:

https://dev.eclipse.org/mailman/listinfo/jgit-dev

Contributing

See the EGit Contributor Guide:

http://wiki.eclipse.org/EGit/Contributor_Guide

About Git

More information about Git, its repository format, and the canonical C based implementation can be obtained from the Git website:

http://git-scm.com/