Go to file
Shawn O. Pearce 7ba31474a3 Increase core.streamFileThreshold default to 50 MiB
Projects like org.eclipse.mdt contain large XML files about 6 MiB
in size.  So does the Android project platform/frameworks/base.
Doing a clone of either project with JGit takes forever to checkout
the files into the working directory, because delta decompression
tends to be very expensive as we need to constantly reposition the
base stream for each copy instruction.  This can be made worse by
a very bad ordering of offsets, possibly due to an XML editor that
doesn't preserve the order of elements in the file very well.

Increasing the threshold to the same limit PackWriter uses when
doing delta compression (50 MiB) permits a default configured
JGit to decompress these XML file objects using the faster
random-access arrays, rather than re-seeking through an inflate
stream, significantly reducing checkout time after a clone.

Since this new limit may be dangerously close to the JVM maximum
heap size, every allocation attempt is now wrapped in a try/catch
so that JGit can degrade by switching to the large object stream
mode when the allocation is refused.  It will run slower, but the
operation will still complete.

The large stream mode will run very well for big objects that aren't
delta compressed, and is acceptable for delta compressed objects that
are using only forward referencing copy instructions.  Copies using
prior offsets are still going to be horrible, and there is nothing
we can do about it except increase core.streamFileThreshold.

We might in the future want to consider changing the way the delta
generators work in JGit and native C Git to avoid prior offsets once
an object reaches a certain size, even if that causes the delta
instruction stream to be slightly larger.  Unfortunately native
C Git won't want to do that until its also able to stream objects
rather than malloc them as contiguous blocks.

Change-Id: Ief7a3896afce15073e80d3691bed90c6a3897307
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2010-10-04 14:04:47 -05:00
org.eclipse.jgit Increase core.streamFileThreshold default to 50 MiB 2010-10-04 14:04:47 -05:00
org.eclipse.jgit.console Qualify builds as 0.10.0 2010-09-16 17:26:53 -07:00
org.eclipse.jgit.http.server Qualify builds as 0.10.0 2010-09-16 17:26:53 -07:00
org.eclipse.jgit.http.test Qualify builds as 0.10.0 2010-09-16 17:26:53 -07:00
org.eclipse.jgit.iplog Merge "Define DiffAlgorithm as an abstract function" 2010-09-17 15:08:27 -04:00
org.eclipse.jgit.junit Comment the use of System.gc in LocalDiskRepositoryTestCase 2010-09-28 23:27:20 +02:00
org.eclipse.jgit.packaging Fix dependency to jgit source bundle 2010-09-17 22:01:40 +02:00
org.eclipse.jgit.pgm Update Fetch to use FetchCommand API 2010-09-27 10:16:22 -05:00
org.eclipse.jgit.test Increase core.streamFileThreshold default to 50 MiB 2010-10-04 14:04:47 -05:00
org.eclipse.jgit.ui Qualify builds as 0.10.0 2010-09-16 17:26:53 -07:00
tools Clean up LICENSE file 2010-07-02 14:52:49 -07:00
.eclipse_iplog Update .eclipse_iplog for 0.9 2010-09-08 23:17:54 +02:00
.gitattributes Initial JGit contribution to eclipse.org 2009-09-29 16:47:03 -07:00
LICENSE Clean up LICENSE file 2010-07-02 14:52:49 -07:00
README Initial JGit contribution to eclipse.org 2009-09-29 16:47:03 -07:00
SUBMITTING_PATCHES Correcting explanation of EDL 2009-10-28 14:12:07 +01:00
pom.xml Qualify builds as 0.10.0 2010-09-16 17:26:53 -07:00

README

            == Java GIT ==

This package is licensed under the BSD.

  org.eclipse.jgit/

    A pure Java library capable of being run standalone, with no
    additional support libraries.  Some JUnit tests are provided
    to exercise the library.  The library provides functions to
    read and write a GIT formatted repository.

    All portions of jgit are covered by the BSD.  Absolutely no GPL,
    LGPL or EPL contributions are accepted within this package.

  org.eclipse.jgit.test/
    Unit tests for org.eclipse.jgit and the same licensing rules.

            == WARNINGS / CAVEATS              ==

- Symbolic links are not supported because java does not support it.
  Such links could be damaged.

- Only the timestamp of the index is used by jgit check if  the index
  is dirty.

- Don't try the library with a JDK other than 1.6 (Java 6) unless you
  are prepared to investigate problems yourself. JDK 1.5.0_11 and later
  Java 5 versions *may* work. Earlier versions do not. JDK 1.4 is *not*
  supported. Apple's Java 1.5.0_07 is reported to work acceptably. We
  have no information about other vendors. Please report your findings
  if you try.

- CRLF conversion is never performed. On Windows you should thereforc
  make sure your projects and workspaces are configured to save files
  with Unix (LF) line endings.

            == Package Features                ==

  org.eclipse.jgit/

    * Read loose and packed commits, trees, blobs, including
      deltafied objects.

    * Read objects from shared repositories

    * Write loose commits, trees, blobs.

    * Write blobs from local files or Java InputStreams.

    * Read blobs as Java InputStreams.

    * Copy trees to local directory, or local directory to a tree.

    * Lazily loads objects as necessary.

    * Read and write .git/config files.

    * Create a new repository.

    * Read and write refs, including walking through symrefs.

    * Read, update and write the Git index.

    * Checkout in dirty working directory if trivial.

    * Walk the history from a given set of commits looking for commits
      introducing changes in files under a specified path.

    * Object transport
      Fetch via ssh, git, http, Amazon S3 and bundles.
      Push via ssh, git and Amazon S3. JGit does not yet deltify
      the pushed packs so they may be a lot larger than C Git packs.

  org.eclipse.jgit.pgm/

    * Assorted set of command line utilities. Mostly for ad-hoc testing of jgit
      log, glog, fetch etc.

            == Missing Features                ==

There are a lot of missing features. You need the real Git for this.
For some operations it may just be the preferred solution also. There
are not just a command line, there is e.g. git-gui that makes committing
partial files simple.

- Merging. 

- Repacking.

- Generate a GIT format patch.

- Apply a GIT format patch.

- Documentation. :-)

- gitattributes support
  In particular CRLF conversion is not implemented. Files are treated
  as byte sequences.

- submodule support
  Submodules are not supported or even recognized.

            == Support                         ==

  Post question, comments or patches to the git@vger.kernel.org mailing list.


            == Contributing                    ==

  See SUBMITTING_PATCHES in this directory. However, feedback and bug reports
  are also contributions.


            == About GIT                       ==

More information about GIT, its repository format, and the canonical
C based implementation can be obtained from the GIT websites:

  http://git.or.cz/
  http://www.kernel.org/pub/software/scm/git/
  http://www.kernel.org/pub/software/scm/git/docs/