Shawn O. Pearce b24f907e3e Buffer very large delta streams to reduce explosion of CPU work
Large delta streams are unpacked incrementally, but because a delta
can seek to a random position in the base to perform a copy we may
need to inflate the base repeatedly just to complete one delta.
So work around it by copying the base to a temporary file, and then
we can read from that temporary file using random seeks instead.
It's far more efficient because we now only need to inflate the
base once.
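
As a rough illustration of the idea above (not the code from this commit), the sketch below inflates a compressed base once into a temporary file and then serves out-of-order copy requests with random seeks. The class name BufferedBaseSketch, its methods and the file name prefix are hypothetical; it assumes a zlib-compressed base and a modern JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch; class and method names are not taken from JGit.
final class BufferedBaseSketch {
    // Inflate the compressed base exactly once, streaming it into a temp file.
    static File bufferBase(InputStream compressedBase, File dir) throws IOException {
        File tmp = File.createTempFile("base_", ".buf", dir);
        try (InputStream in = new InflaterInputStream(compressedBase);
                OutputStream out = new FileOutputStream(tmp)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1)
                out.write(buf, 0, n);
        }
        return tmp;
    }

    // Serve one delta "copy" instruction by seeking in the buffered base.
    // Data moves through a small fixed buffer, so heap use stays bounded.
    static void copy(RandomAccessFile base, long offset, long length,
            OutputStream result) throws IOException {
        byte[] buf = new byte[8192];
        base.seek(offset);
        while (length > 0) {
            int n = (int) Math.min(buf.length, length);
            base.readFully(buf, 0, n);
            result.write(buf, 0, n);
            length -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small zlib-compressed "base" to stand in for a packed object.
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        try (OutputStream z = new DeflaterOutputStream(packed)) {
            z.write("0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
        }
        File dir = Files.createTempDirectory("sketch").toFile();
        File tmp = bufferBase(new ByteArrayInputStream(packed.toByteArray()), dir);
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (RandomAccessFile base = new RandomAccessFile(tmp, "r")) {
            copy(base, 10, 6, result); // a copy from near the end of the base
            copy(base, 0, 4, result);  // then one that seeks backwards
        } finally {
            tmp.delete();
            dir.delete();
        }
        // prints "abcdef0123"
        System.out.println(new String(result.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

The backwards seek in the second copy is the case that would otherwise force a streaming inflater to restart from the beginning of the base.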

This is still really ugly because we have to dump to a temporary
file, but at least the code can successfully process a large
file without throwing OutOfMemoryError.  If speed is an
issue, the user will need to increase the JVM heap and ensure
core.streamFileThreshold is set to a higher value, so we don't use
this code path as often.

Unfortunately we lose the "optimization" of skipping over portions
of a delta base that we don't actually need in the final result.
This is going to cause us to inflate and write to disk useless
regions that were deleted and do not appear in the final result.
We could later improve on our code by trying to flatten delta
instruction streams before we touch the bottom base object, and
then only store the portions of the base we really need for the
final result and that appear out-of-order.  Since that is some
pretty complex code I'm punting on it for now and just doing this
simple whole-object buffering.

Because the process umask might be permitting other users to read
files we create, we put the temporary buffers into $GIT_DIR/objects.
We can reasonably assume that if a reader can read our temporary
buffer file in that directory, they can also read the base pack
file we are pulling it from and therefore it's not a security breach
to expose the inflated content in a file.  This requires a reader
to have write access to the repository, but only if the file is
really big.  I'd rather err on the side of caution here and refuse
to read a very big file into /tmp than to possibly expose secured
content because the Java 5 JVM won't let us create a protected
temporary file that only the current user can access.
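
A minimal sketch of that placement choice, assuming a repository directory passed in as gitDir: the standard File.createTempFile call accepts a target directory, so the buffer can be created under objects/ instead of the system temporary directory. The class name, method name and file name prefix here are hypothetical and are not taken from the commit.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical sketch; names are illustrative, not JGit's actual API.
final class TempBufferLocation {
    // Create the buffer file inside <gitDir>/objects so that it sits next to
    // the pack files and is covered by the same directory permissions.
    static File createBuffer(File gitDir) throws IOException {
        File objects = new File(gitDir, "objects");
        File tmp = File.createTempFile("delta_base_", ".tmp", objects);
        tmp.deleteOnExit(); // best-effort cleanup if the caller forgets
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real $GIT_DIR, created only for this demonstration.
        File gitDir = Files.createTempDirectory("fake_git_dir").toFile();
        File objects = new File(gitDir, "objects");
        if (!objects.mkdir())
            throw new IOException("cannot create " + objects);
        File tmp = createBuffer(gitDir);
        System.out.println(tmp.getParentFile().equals(objects)); // prints "true"
        tmp.delete();
        objects.delete();
        gitDir.delete();
    }
}
```

Creating the file fails with an IOException when the process cannot write to the objects directory, which matches the write-access requirement described above.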

Change-Id: I66fb80b08cbcaf0f65f2db0462c546a495a160dd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2010-08-27 13:28:35 -07:00
org.eclipse.jgit Buffer very large delta streams to reduce explosion of CPU work 2010-08-27 13:28:35 -07:00
org.eclipse.jgit.console Run formatter on edited lines via save action 2010-08-26 12:33:09 -05:00
org.eclipse.jgit.http.server Run formatter on edited lines via save action 2010-08-26 12:33:09 -05:00
org.eclipse.jgit.http.test Run formatter on edited lines via save action 2010-08-26 12:33:09 -05:00
org.eclipse.jgit.iplog Run formatter on edited lines via save action 2010-08-26 12:33:09 -05:00
org.eclipse.jgit.junit Merge "Use JUnit4 for tests" 2010-08-26 14:50:05 -04:00
org.eclipse.jgit.packaging Hide Maven target directories from Eclipse 2010-08-08 13:16:53 +02:00
org.eclipse.jgit.pgm Remove unused import 2010-08-26 23:53:41 +02:00
org.eclipse.jgit.test Add TagCommand 2010-08-27 21:11:31 +02:00
org.eclipse.jgit.ui Run formatter on edited lines via save action 2010-08-26 12:33:09 -05:00
tools Clean up LICENSE file 2010-07-02 14:52:49 -07:00
.eclipse_iplog eclipse-iplog: Use contribution rather than bug element 2010-05-28 15:09:29 -07:00
.gitattributes Initial JGit contribution to eclipse.org 2009-09-29 16:47:03 -07:00
LICENSE Clean up LICENSE file 2010-07-02 14:52:49 -07:00
README Initial JGit contribution to eclipse.org 2009-09-29 16:47:03 -07:00
SUBMITTING_PATCHES Correcting explanation of EDL 2009-10-28 14:12:07 +01:00
pom.xml Start 0.9 development 2010-06-14 08:11:27 -07:00

README

            == Java GIT ==

This package is licensed under the BSD.

  org.eclipse.jgit/

    A pure Java library capable of being run standalone, with no
    additional support libraries.  Some JUnit tests are provided
    to exercise the library.  The library provides functions to
    read and write a GIT formatted repository.

    All portions of jgit are covered by the BSD.  Absolutely no GPL,
    LGPL or EPL contributions are accepted within this package.

  org.eclipse.jgit.test/
    Unit tests for org.eclipse.jgit and the same licensing rules.

            == WARNINGS / CAVEATS              ==

- Symbolic links are not supported because Java does not support them.
  Such links could be damaged.

- Only the timestamp of the index is used by jgit to check if the index
  is dirty.

- Don't try the library with a JDK other than 1.6 (Java 6) unless you
  are prepared to investigate problems yourself. JDK 1.5.0_11 and later
  Java 5 versions *may* work. Earlier versions do not. JDK 1.4 is *not*
  supported. Apple's Java 1.5.0_07 is reported to work acceptably. We
  have no information about other vendors. Please report your findings
  if you try.

- CRLF conversion is never performed. On Windows you should therefore
  make sure your projects and workspaces are configured to save files
  with Unix (LF) line endings.

            == Package Features                ==

  org.eclipse.jgit/

    * Read loose and packed commits, trees, blobs, including
      deltafied objects.

    * Read objects from shared repositories

    * Write loose commits, trees, blobs.

    * Write blobs from local files or Java InputStreams.

    * Read blobs as Java InputStreams.

    * Copy trees to local directory, or local directory to a tree.

    * Lazily loads objects as necessary.

    * Read and write .git/config files.

    * Create a new repository.

    * Read and write refs, including walking through symrefs.

    * Read, update and write the Git index.

    * Checkout in dirty working directory if trivial.

    * Walk the history from a given set of commits looking for commits
      introducing changes in files under a specified path.

    * Object transport
      Fetch via ssh, git, http, Amazon S3 and bundles.
      Push via ssh, git and Amazon S3. JGit does not yet deltify
      the pushed packs so they may be a lot larger than C Git packs.

  org.eclipse.jgit.pgm/

    * An assorted set of command line utilities, mostly for ad-hoc testing
      of jgit: log, glog, fetch etc.

            == Missing Features                ==

There are a lot of missing features. You need the real Git for these.
For some operations it may simply be the preferred solution anyway.
It offers more than just a command line; there is e.g. git-gui, which
makes committing partial files simple.

- Merging.

- Repacking.

- Generate a GIT format patch.

- Apply a GIT format patch.

- Documentation. :-)

- gitattributes support
  In particular CRLF conversion is not implemented. Files are treated
  as byte sequences.

- submodule support
  Submodules are not supported or even recognized.

            == Support                         ==

  Post questions, comments or patches to the git@vger.kernel.org mailing list.


            == Contributing                    ==

  See SUBMITTING_PATCHES in this directory. However, feedback and bug reports
  are also contributions.


            == About GIT                       ==

More information about GIT, its repository format, and the canonical
C based implementation can be obtained from the GIT websites:

  http://git.or.cz/
  http://www.kernel.org/pub/software/scm/git/
  http://www.kernel.org/pub/software/scm/git/docs/