Shawn O. Pearce f5fe2dca3c Teach PackWriter how to reuse an existing object list
Counting the objects needed for packing is the most expensive part of
an UploadPack request that has no uninteresting objects (otherwise
known as an initial clone).  During this phase the PackWriter is
enumerating the entire set of objects in this repository, so they can
be sent to the client for their new clone.

Allow the ObjectReader (and therefore the underlying storage system)
to keep a cached list of all reachable objects from a small number of
points in the project's history.  If one of those points is reached
during enumeration of the commit graph, most objects are obtained from
the cached list instead of direct traversal.

PackWriter uses the list by discarding the current object lists and
restarting a traversal from all refs but marking the object list name
as uninteresting.  This allows PackWriter to enumerate all objects
that are more recent than the list creation, or that were on side
branches that the list does not include.
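
As a rough illustration of the restart described above, the sketch below
models the traversal on commit names only.  It is not PackWriter's code: the
class CachedListSketch, the method enumerate() and the plain string names
are hypothetical, and trees and blobs are left out.  The cached list seeds
the result, so any commit it covers behaves as uninteresting and nothing
behind it is walked again.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of list-based enumeration; not JGit's PackWriter. */
final class CachedListSketch {
    /**
     * Collects every commit reachable from the given refs.
     *
     * @param refs       names of the commits all refs point at
     * @param parents    parent commits of each commit name
     * @param cachedList everything reachable from the list name commit,
     *                   including that commit itself
     */
    static Set<String> enumerate(List<String> refs,
            Map<String, List<String>> parents, Set<String> cachedList) {
        // Seed the result with the cached list: those objects are included
        // without being traversed.
        Set<String> out = new LinkedHashSet<>(cachedList);
        Deque<String> todo = new ArrayDeque<>(refs);
        while (!todo.isEmpty()) {
            String commit = todo.pop();
            // add() returns false for commits covered by the cached list
            // or already visited, which stops the walk at that point.
            if (!out.add(commit))
                continue;
            for (String p : parents.getOrDefault(commit, List.of()))
                todo.push(p);
        }
        return out;
    }

    public static void main(String[] args) {
        // A -> B -> C -> D is the main line, S -> D is a side branch,
        // and the cached list was built at C.
        Map<String, List<String>> parents = Map.of(
                "A", List.of("B"), "B", List.of("C"),
                "C", List.of("D"), "S", List.of("D"));
        System.out.println(
                enumerate(List.of("A", "S"), parents, Set.of("C", "D")));
    }
}
```

In this model only A, B and S are visited by the walk; C and D come from
the cached list.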

However, ObjectWalk tags all of the trees and commits within the list
commit as UNINTERESTING, which would normally cause PackWriter to
construct a thin pack that excludes these objects.  To avoid that,
addObject() was refactored to allow this list-based enumeration to
always include an object, even if it has been tagged UNINTERESTING by
the ObjectWalk.  This implies the list-based enumeration may only be
used for initial clones, where all objects are being sent.

The UNINTERESTING labeling occurs because StartGenerator always
enables the BoundaryGenerator if the walker is an ObjectWalk and a
commit was marked UNINTERESTING, even if RevSort.BOUNDARY was not
enabled.  This is a reasonable default behavior for an ObjectWalk,
but isn't desired here in PackWriter with the list-based enumeration.
Rather than trying to change all of this behavior, PackWriter works
around it.

Because the list name commit's immediate files and trees were all
enumerated before the list enumeration itself starts (and are also
within the list itself) PackWriter runs the risk of adding the same
objects to its ObjectIdSubclassMap twice.  Since this breaks the
internal map data structure (and also may cause the object to transmit
twice), PackWriter needs to use a new "added" RevFlag to track whether
or not an object has been put into the outgoing list yet.
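
The two safeguards described in this message (forced inclusion of objects
tagged UNINTERESTING, and the "added" flag) can be sketched roughly as
follows.  This is a simplified stand-alone model, not the actual change:
the class AddedFlagSketch, its Obj type, the integer flag bits and the
boolean "force" parameter are all hypothetical stand-ins for RevObject,
RevFlag and the refactored addObject().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the "added" flag; not JGit's PackWriter. */
final class AddedFlagSketch {
    static final int UNINTERESTING = 1 << 0;
    static final int ADDED = 1 << 1;

    /** Minimal stand-in for a RevObject carrying a set of flag bits. */
    static final class Obj {
        final String name;
        int flags;

        Obj(String name, int flags) {
            this.name = name;
            this.flags = flags;
        }
    }

    /** Objects queued for the outgoing pack, each at most once. */
    final List<Obj> outgoing = new ArrayList<>();

    /** Ordinary enumeration: objects tagged UNINTERESTING are skipped. */
    boolean addObject(Obj o) {
        return addObject(o, false);
    }

    /**
     * @param force true for list-based enumeration, which must include the
     *              object even if the walk tagged it UNINTERESTING
     * @return true if the object was newly queued
     */
    boolean addObject(Obj o, boolean force) {
        if (!force && (o.flags & UNINTERESTING) != 0)
            return false;
        if ((o.flags & ADDED) != 0)
            return false; // already queued, do not insert it twice
        o.flags |= ADDED;
        outgoing.add(o);
        return true;
    }

    public static void main(String[] args) {
        AddedFlagSketch writer = new AddedFlagSketch();
        Obj tree = new Obj("tree", UNINTERESTING);
        writer.addObject(tree);       // skipped by the ordinary path
        writer.addObject(tree, true); // queued by the list-based path
        writer.addObject(tree, true); // ignored, ADDED is already set
        System.out.println(writer.outgoing.size());
    }
}
```

Keeping the flag on the object itself means the duplicate check needs no
second lookup in the map that the duplicate would otherwise corrupt.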

Change-Id: Ie99ed4d969a6bb20cc2528ac6b8fb91043cee071
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-01-27 09:38:19 -08:00
org.eclipse.jgit              Teach PackWriter how to reuse an existing object list    2011-01-27 09:38:19 -08:00
org.eclipse.jgit.console      Qualify post 0.10 builds                                 2010-12-17 15:49:30 +01:00
org.eclipse.jgit.http.server  Build http.server source JAR                             2011-01-20 14:46:27 -08:00
org.eclipse.jgit.http.test    Require the hamcrest packaging that comes with Eclipse   2011-01-01 19:05:00 +01:00
org.eclipse.jgit.iplog        Qualify post 0.10 builds                                 2010-12-17 15:49:30 +01:00
org.eclipse.jgit.junit        Convert all JGit unit tests to JUnit 4                   2010-12-31 14:00:05 -08:00
org.eclipse.jgit.junit.http   Convert all JGit unit tests to JUnit 4                   2010-12-31 14:00:05 -08:00
org.eclipse.jgit.packaging    Qualify post 0.10 builds                                 2010-12-17 15:49:30 +01:00
org.eclipse.jgit.pgm          Fixed several NPEs in the Fetch CLI                      2011-01-26 11:38:59 -06:00
org.eclipse.jgit.test         Hard reset should not report conflict on untracked file  2011-01-27 17:20:04 +01:00
org.eclipse.jgit.ui           Qualify post 0.10 builds                                 2010-12-17 15:49:30 +01:00
tools                         Clean up LICENSE file                                    2010-07-02 14:52:49 -07:00
.eclipse_iplog                Update .eclipse_iplog for 0.9                            2010-09-08 23:17:54 +02:00
.gitattributes                Initial JGit contribution to eclipse.org                 2009-09-29 16:47:03 -07:00
LICENSE                       Clean up LICENSE file                                    2010-07-02 14:52:49 -07:00
README                        Initial JGit contribution to eclipse.org                 2009-09-29 16:47:03 -07:00
SUBMITTING_PATCHES            Correcting explanation of EDL                            2009-10-28 14:12:07 +01:00
pom.xml                       Cleanup configuration of Maven JUnit runner              2011-01-02 14:35:04 -08:00

README

            == Java GIT ==

This package is licensed under the BSD license.

  org.eclipse.jgit/

    A pure Java library capable of being run standalone, with no
    additional support libraries.  Some JUnit tests are provided
    to exercise the library.  The library provides functions to
    read and write a GIT formatted repository.

    All portions of jgit are covered by the BSD license.  Absolutely no
    GPL, LGPL or EPL contributions are accepted within this package.

  org.eclipse.jgit.test/
    Unit tests for org.eclipse.jgit and the same licensing rules.

            == WARNINGS / CAVEATS              ==

- Symbolic links are not supported because Java does not support them.
  Such links could be damaged.

- Only the timestamp of the index is used by jgit to check whether the
  index is dirty.

- Don't try the library with a JDK other than 1.6 (Java 6) unless you
  are prepared to investigate problems yourself. JDK 1.5.0_11 and later
  Java 5 versions *may* work. Earlier versions do not. JDK 1.4 is *not*
  supported. Apple's Java 1.5.0_07 is reported to work acceptably. We
  have no information about other vendors. Please report your findings
  if you try.

- CRLF conversion is never performed. On Windows you should therefore
  make sure your projects and workspaces are configured to save files
  with Unix (LF) line endings.
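
  Because no CRLF conversion is performed, it can help to verify that files
  contain no CRLF pairs before adding them.  The following is a minimal
  sketch using only the standard Java library; the class name CrlfCheck and
  the method hasCrlf() are hypothetical and are not part of jgit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper, not part of jgit: reports files that contain CRLF. */
final class CrlfCheck {
    /** Returns true if the data contains a CR immediately followed by LF. */
    static boolean hasCrlf(byte[] data) {
        for (int i = 0; i + 1 < data.length; i++) {
            if (data[i] == '\r' && data[i + 1] == '\n')
                return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a file to check.
        for (String path : args) {
            if (hasCrlf(Files.readAllBytes(Paths.get(path))))
                System.out.println(path + ": contains CRLF line endings");
        }
    }
}
```

  The check works on raw bytes, which matches how files are treated here:
  as byte sequences.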

            == Package Features                ==

  org.eclipse.jgit/

    * Read loose and packed commits, trees, blobs, including
      deltafied objects.

    * Read objects from shared repositories

    * Write loose commits, trees, blobs.

    * Write blobs from local files or Java InputStreams.

    * Read blobs as Java InputStreams.

    * Copy trees to local directory, or local directory to a tree.

    * Lazily load objects as necessary.

    * Read and write .git/config files.

    * Create a new repository.

    * Read and write refs, including walking through symrefs.

    * Read, update and write the Git index.

    * Checkout in dirty working directory if trivial.

    * Walk the history from a given set of commits looking for commits
      introducing changes in files under a specified path.

    * Object transport
      Fetch via ssh, git, http, Amazon S3 and bundles.
      Push via ssh, git and Amazon S3. JGit does not yet deltify
      the pushed packs so they may be a lot larger than C Git packs.

  org.eclipse.jgit.pgm/

    * An assorted set of command line utilities, mostly for ad-hoc
      testing of jgit: log, glog, fetch, etc.

            == Missing Features                ==

Many features are missing; for these you need the real Git. For some
operations it may simply be the preferred solution anyway. It offers
more than a command line: git-gui, for example, makes committing
partial files simple.

- Merging.

- Repacking.

- Generate a GIT format patch.

- Apply a GIT format patch.

- Documentation. :-)

- gitattributes support
  In particular CRLF conversion is not implemented. Files are treated
  as byte sequences.

- submodule support
  Submodules are not supported or even recognized.

            == Support                         ==

  Post questions, comments or patches to the git@vger.kernel.org mailing list.


            == Contributing                    ==

  See SUBMITTING_PATCHES in this directory. However, feedback and bug reports
  are also contributions.


            == About GIT                       ==

More information about GIT, its repository format, and the canonical
C based implementation can be obtained from the GIT websites:

  http://git.or.cz/
  http://www.kernel.org/pub/software/scm/git/
  http://www.kernel.org/pub/software/scm/git/docs/