1678 Commits

Author SHA1 Message Date
Shawn O. Pearce 1eecc82cec Improve performance when writing trees and small blobs
ObjectDirectoryInserter was always creating a temporary file,
writing the complete compressed contents of a tree, fsync()'ing
that to stable storage, and only then checking to see if there
was already an object with the same SHA-1 in the repository.

For commits this strategy makes some sense: the commit is very
unlikely to exist in the repository, as there are embedded times
and these change with each commit.

However for trees coming out of DirCache, it is more common for the
tree to already exist in the repository. Most subdirectories are
not modified in any given commit.  Doing all of this local file IO
for things that already exist is very slow.

Try to detect cases where the object is "small enough" that it can
be processed entirely in memory, and avoid doing disk IO entirely
if the object already exists.

Also increase the size of the output buffer for the deflation.
This should boost the average write(2) syscall size from 512 bytes
to 8192 bytes, making streaming of large compressed contents to
disk slightly more efficient.
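
The "check before writing" idea can be sketched as follows. This is only an illustration, not JGit's ObjectDirectoryInserter: the class name, the IN_MEMORY_LIMIT value, and the in-memory set standing in for the object database are all assumptions. The one part taken from Git itself is the object id, which is the SHA-1 of the header "<type> <length>\0" followed by the content.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an object inserter; not JGit's actual class.
class SmallObjectInserter {
    // Assumed threshold for "small enough"; the commit does not state a value.
    static final int IN_MEMORY_LIMIT = 16 * 1024;

    // Stands in for the repository's object database.
    private final Set<String> existingIds = new HashSet<>();

    // Counts simulated "deflate + temp file + fsync" operations.
    int diskWrites = 0;

    // Git object id: SHA-1 over "<type> <length>\0" followed by the content.
    static String objectId(String type, byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update((type + " " + data.length + "\0")
                    .getBytes(StandardCharsets.US_ASCII));
            md.update(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    String insert(String type, byte[] data) {
        // Simplification: the id is always hashed in memory here; a real
        // inserter would stream objects above the limit instead.
        String id = objectId(type, data);
        if (data.length <= IN_MEMORY_LIMIT && existingIds.contains(id)) {
            return id; // already stored: no compression, temp file or fsync
        }
        diskWrites++; // placeholder for the expensive write path
        existingIds.add(id);
        return id;
    }
}
```

Inserting the same tree twice through this sketch performs the simulated disk write only once; the second call returns after the in-memory lookup.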

Change-Id: I1d40364e8725468522435814631916d73174c92b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-24 12:55:19 -07:00
Shawn O. Pearce 826fb260a3 TemporaryBuffer: Fix reading from in-memory InputStream
I had the conditions wrong here, causing the in-memory InputStream
to always appear to be at EOF.

Change-Id: I6811d6187a34eaf1fd6c5002550d631decdfc391
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-24 12:37:58 -07:00
Chris Aniszczyk 2cebb7dbc7 Add ReflogCommand
Adds a git-reflog command and associated tests.

Bug: 347859
Change-Id: Iba146ac842cc9ca0be43d3381b4082c9e92bf56f
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-06-22 18:32:50 -05:00
Chris Aniszczyk 65606dc086 Refactor out ReflogEntry
It's useful to have ReflogEntry refactored out so it can be
used by clients via the JGit API.

Change-Id: I03044df9af9f9547777545b7c9b93bdf5f8b7cb5
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-06-20 10:25:50 -05:00
Robin Rosenberg 529a348961 RFC: Ugly fix for i18n of metaVar CLI arguments
This patch possibly ties to a specific version of args4j.

Bug: 318286
Change-Id: I05d4ecf6bd25deec7fb2efbfa61913f4ec4e04e5
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-06-20 00:55:13 +02:00
Robin Rosenberg b19924f150 Merge changes Ie393fb8b,Ib11a077a
* changes:
  Push errors back over sideband when possible
  Report progress while updating references
2011-06-17 11:26:38 -04:00
Marc Strapetz 929862f322 Fix IndexOutOfBoundsException when parsing PersonIdent
IndexOutOfBoundsException could occur when parsing
PersonIdent for which no name is present, as part of a
RevCommit (nameB > 0).
2011-06-14 16:56:48 +02:00
Shawn O. Pearce d34ec12019 DHT: Change DhtReader caches to be dynamic by workload
Instead of fixing the prefetch queue and recent chunk queue as
different sizes, allow these to share the same limit but be scaled
based on the work being performed.

During walks about 20% of the space will be given to the prefetcher,
and the other 80% will be used by the recent chunks cache. This
should improve cases where there is bad locality between chunks.

During writing of a pack stream, 90-100% of the space should be
made available to the prefetcher, as the prefetch plan is usually
very accurate about the order chunks will be needed in.
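
The split described above amounts to simple arithmetic over one shared limit. The sketch below is hypothetical: CacheBudget and Workload are invented names, and 90% is picked from the stated 90-100% range purely for illustration.

```java
// Hypothetical helper showing one shared limit divided by workload.
class CacheBudget {
    enum Workload { WALK, PACK_WRITE }

    final int prefetchSlots; // share given to the prefetcher
    final int recentSlots;   // share left for the recent chunks cache

    CacheBudget(int totalSlots, Workload workload) {
        // About 20% to the prefetcher during walks; 90% (assumed, from the
        // 90-100% range) while writing a pack stream.
        int prefetchPercent = (workload == Workload.WALK) ? 20 : 90;
        this.prefetchSlots = totalSlots * prefetchPercent / 100;
        this.recentSlots = totalSlots - this.prefetchSlots;
    }
}
```

With a shared limit of 10 slots, a walk gets 2 prefetch slots and 8 recent slots, while a pack write gets 9 and 1.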

Change-Id: I1ca7acb4518e66eb9d4138fb753df38e7254704d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 19:10:15 -07:00
Shawn O. Pearce 1e6b02643c DHT: Use a proper HashMap for RecentChunk lookups
A linear search is somewhat acceptable for only 4 recent chunks, but
a HashMap based lookup would be better. The table will have 16 slots
by default and given the hashCode() of ChunkKey is derived from the
SHA-1 of the chunk, each chunk will fall into its own bucket within
the table and thus evaluate only 1 entry during lookup instead of 4.

Some users may also want to devote more memory to the recent chunks,
in which case expanding this list to a longer length will help to
reduce chunk faults, but would increase search time. Using a HashMap
will help this code to scale to larger sizes better.
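
A hash-based cache of recent chunks could look roughly like the sketch below. It is not the RecentChunks class from this change: it uses java.util.LinkedHashMap in access order to get least-recently-used eviction, and the String key stands in for ChunkKey. The check that the size is at least 1 is an assumption added for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache of recent chunks, keyed by a chunk key string.
class RecentChunkCache<V> {
    private final Map<String, V> map;

    RecentChunkCache(final int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        // 16 initial slots; accessOrder=true moves an entry to the tail
        // whenever it is read, so the head is the least recently used.
        this.map = new LinkedHashMap<String, V>(16, 0.75f, true) {
            private static final long serialVersionUID = 1L;

            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxSize; // evict once the limit is exceeded
            }
        };
    }

    V get(String key) {
        return map.get(key); // hash lookup instead of a linear scan
    }

    void put(String key, V chunk) {
        map.put(key, chunk);
    }

    int size() {
        return map.size();
    }
}
```

Lookup cost stays roughly constant as the limit grows, which is the scaling property the message is after.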

Change-Id: Ia41b7a1cc69ad27b85749e3b74cbf8d0aa338044
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 17:59:22 -07:00
Shawn O. Pearce 57853e4949 DHT: Always have at least one recent chunk in DhtReader
The RecentChunks cache assumes there is always at least one recent
chunk in the maxSize that it receives from the DhtReaderOptions.
Ensure that is true by requiring the size to be at least 1.

Running with 0 recent chunk cache is a very bad idea. Often
during commit walking the parents of a commit will be found
on the same chunk as the commit that was just accessed. In
these cases it's a good idea to keep that last chunk around
so the parents can be quickly accessed.

Change-Id: I33b65286e8a4cbf6ef4ced28c547837f173e065d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 17:55:52 -07:00
Shawn O. Pearce 7ff6eb584c Push errors back over sideband when possible
If an internal exception occurs while packing and the request
needs to abort, the HTTP response might already be committed due
to progress messages having already been delivered to the client.
This prevents UploadPackServlet from resetting the response and
sending back an HTTP 500 response.

Try to catch all exceptions and report internal errors over the
sideband stream or as an ERR command during the initial ACK/NAK
negotiation phase. This allows JGit to transmit an error message
that the user will receive on their console without needing to
worry about resetting the (already gone) HTTP response.
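
For readers unfamiliar with the wire format, the two error paths can be sketched as below. The framing follows the Git protocol: a pkt-line starts with a 4-digit hex length that includes those 4 bytes, an error before side-band is active is sent as "ERR <message>", and inside side-band the first payload byte is the channel number, with channel 3 carrying errors. The class and method names are hypothetical and are not taken from JGit.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds error packets in pkt-line framing.
class PktLineError {
    static final int CH_ERROR = 3; // side-band channel for fatal errors

    // Prefix the payload with its 4-digit hex length (header included).
    static byte[] frame(byte[] payload) {
        byte[] hdr = String.format("%04x", payload.length + 4)
                .getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[hdr.length + payload.length];
        System.arraycopy(hdr, 0, out, 0, hdr.length);
        System.arraycopy(payload, 0, out, hdr.length, payload.length);
        return out;
    }

    // Used during the initial negotiation, before side-band data flows.
    static byte[] errPacket(String message) {
        return frame(("ERR " + message).getBytes(StandardCharsets.UTF_8));
    }

    // Used once the response is already streaming over side-band.
    static byte[] sidebandError(String message) {
        byte[] text = message.getBytes(StandardCharsets.UTF_8);
        byte[] payload = new byte[1 + text.length];
        payload[0] = (byte) CH_ERROR;
        System.arraycopy(text, 0, payload, 1, text.length);
        return frame(payload);
    }
}
```

Either packet can be written to the response stream after the HTTP status line has gone out, which is why it works even when the response is already committed.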

Change-Id: Ie393fb8bb55d2b79ab1276adf71c781c1807f9fe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 17:29:46 -07:00
Shawn O. Pearce d00f527d65 DHT: Fix NPE during prefetch
The Prefetcher may have loaded a chunk that is a fragment. If the
DhtReader is scanning the Prefetcher's chunks for a particular
object, fragment chunks will be missing the index and will throw an
NPE during the findOffset() call into the index itself.

Change-Id: Ie2823724c289f745655076c5209acec32361a1ea
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 17:29:46 -07:00
Shawn O. Pearce 1a87a725be Report progress while updating references
If a fetch or push needs to apply more than a few references
to the local repository it may take more than 0.25 seconds to
process all of the updates.  This is especially true in the DHT
storage system during an initial push of a project with many tags.
The backend database may need to use a transaction to ensure each
tag reference creation is unique, and there may be large delays
caused by these transactions.

Change-Id: Ib11a077adfbd525253e425d327f2e2c2380804c7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 17:29:46 -07:00
Shawn O. Pearce 0e1d5ad8f8 DHT: Drop leading hash digits from row keys
Originally I put the first two digits of the object SHA-1 into the
start of a row key to try and spread the load of objects around a DHT
service. Unfortunately this tends to not work as well as I had hoped.

Servers reading a repository need to contact every node in a DHT
cluster if the cluster tries to evenly distribute the object rows.
This is a lot of connections, especially if the cluster has many
backend storage servers.  If the library has an open connection
limit (possibly due to JVM file descriptor limitations) it may need
to open and close a lot of connections to access a repository,
rather than being able to reuse the same connection to a handful
of backend servers.  This results in a lot of connection thrashing
for some DHT type databases, and is inefficient.

Some DHTs are able to operate even if part of the database space
is currently unavailable.  For example, a DHT service might assign
some section of the key space to a node, and then fail that section
over to another node when the primary is noticed as being offline.
During that failover period that section of the key space is not
available, but other sections hosted by other backends are still
ready for service. Spreading keys all over the cluster makes it
likely that any single backend being temporarily down means the
entire cluster is down, rather than only some.

This is a massive schema change, but it should improve reliability
and performance for any DHT system.

Change-Id: I6b65bfb4c14b6f7bd323c2bd0638b49d429245be
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-06-09 17:29:46 -07:00
Matthias Sohn 0ab7be9681 Merge branch 'stable-1.0'
* stable-1.0:
  Prepare post JGit v1.0.0.201106090707-r builds
  JGit v1.0.0.201106090707-r
  Include about.html files in maven build
  Prepare post v1.0.0.201106081625-r builds
  JGit v1.0.0.201106081625-r
  Add missing about.html files to all shipped bundles
  Prepare post v1.0.0.201106071701-r builds
  JGit v1.0.0.201106071701-r
2011-06-09 17:41:16 +02:00
Matthias Sohn 6646c72d17 Prepare post JGit v1.0.0.201106090707-r builds
Change-Id: I35292f9f6fb5ebc591308fdd2d069203413e189d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-09 14:11:23 +02:00
Matthias Sohn b26ff6ebd6 JGit v1.0.0.201106090707-r
Change-Id: Iba44e71b6441a0e39122ca8666b51989e605f25f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-09 13:11:58 +02:00
Matthias Sohn e1af16ad99 Include about.html files in maven build
Change-Id: Ifa96090eb0fc336ee8080385f48212b5158dd9f7
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-09 11:08:07 +02:00
Matthias Sohn 22df55c8b3 Prepare post v1.0.0.201106081625-r builds
Change-Id: I5e6994844405f7839ad3b3439f98bcadb59d329b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-09 11:08:07 +02:00
Matthias Sohn eacd7104a2 JGit v1.0.0.201106081625-r
Change-Id: I629990189083bab4737938ad712080fba7917582
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-08 22:42:20 +02:00
Matthias Sohn 8c5f403c0c Add missing about.html files to all shipped bundles
Change-Id: I5a4ad9493da3816f21d9fdd0b5b977388d074500
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-08 21:51:51 +02:00
Matthias Sohn 9c67a391f1 Prepare post v1.0.0.201106071701-r builds
Change-Id: I67ee2912ef54462cf860dc4ec0a6334e9c619384
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-08 16:32:01 +02:00
Matthias Sohn ac71f9045a JGit v1.0.0.201106071701-r
Change-Id: Ic8f49336ba96c8dcf4bab2f74c0f1efc1ab55131
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-07 23:04:55 +02:00
Matthias Sohn f1713abcdc Prepare 1.1.0 builds
Change-Id: I4cf017cd567543846839612ab3ace6d26233e01d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-06 01:24:32 +02:00
Matthias Sohn 2a1e20ead5 Merge branch 'stable-1.0'
* stable-1.0:
  Prepare post v1.0.0.201106051725-r builds
  JGit v1.0.0.201106051725-r
  Update to eclipse.org's latest SUA
2011-06-06 01:19:52 +02:00
Matthias Sohn 4a4e1f764c Prepare post v1.0.0.201106051725-r builds
Change-Id: I4839877e1a6fa7782f37423213af8d579727a494
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-06 01:17:16 +02:00
Matthias Sohn f65513f753 JGit v1.0.0.201106051725-r
Change-Id: I39f4a23cf284505395d511dfedf02b7f5608df95
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-05 23:26:56 +02:00
Matthias Sohn 0636f9cdda Update to eclipse.org's latest SUA
Change-Id: I0d016ddaed85656c2e680d0bc99829c6ea13b968
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-02 22:53:11 +02:00
Matthias Sohn f0d62fc609 Merge branch 'stable-1.0'
* stable-1.0:
  Prepare post v1.0.0.201106011211-rc3 builds
  JGit v1.0.0.201106011211-rc3
  Remove incubation marker
  blame: Compute the origin of lines in a result file
2011-06-02 01:45:50 +02:00
Matthias Sohn ada903085d Prepare post v1.0.0.201106011211-rc3 builds
Change-Id: I4dec8eba7e35858aef65fcc10f91fad3fe5b52b9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-01 18:55:11 +02:00
Matthias Sohn 81371d385b JGit v1.0.0.201106011211-rc3
Change-Id: I574a05200471c431b3a02ac6ff208dc6aa90f539
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-06-01 18:22:44 +02:00
Matthias Sohn f5f1536f3f Remove incubation marker
Change-Id: I6018ce0cd3b7c8137e137848fe1f04551b257538
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-05-31 22:53:53 +02:00
Shawn O. Pearce a390456047 blame: Compute the origin of lines in a result file
BlameGenerator digs through history and discovers the origin of each
line of some result file.  BlameResult consumes the stream of regions
created by the generator and lays them out in a table for applications
to display alongside of source lines.

Applications may optionally push in the working tree copy of a file
using the push(String, byte[]) method, allowing the application to
receive accurate line annotations for the working tree version.  Lines
that are uncommitted (difference between HEAD and working tree) will
show up with the description given by the application as the author,
or "Not Committed Yet" as a default string.

Applications may also run the BlameGenerator in reverse mode using the
reverse(AnyObjectId, AnyObjectId) method instead of push().  When
running in the reverse mode the generator annotates lines by the
commit they are removed in, rather than the commit they were added in.
This allows a user to discover where a line disappeared from when they
are looking at an older revision in the repository.  For example:

  blame --reverse 16e810b2..master -L 1080, org.eclipse.jgit.test/tst/org/eclipse/jgit/storage/file/RefDirectoryTest.java
           (                                              1080)   }
  2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1081)
  2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1082)   /**
  2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1083)    * Kick the timestamp of a local file.

Above we learn that line 1080 (a closing curly brace of the prior
method) still exists in branch master, but the Javadoc comment below
it has been removed by Christian Halstrick on May 20th as part of
commit 2302a6d3.  This result differs considerably from that of C
Git's blame --reverse feature.  JGit tells the reader which commit
performed the delete, while C Git tells the reader the last commit
that still contained the line, leaving it an exercise to the reader
to discover the descendant that performed the removal.

This is still only a basic implementation.  Quite notably it is
missing support for the smart block copy/move detection that the C
implementation of `git blame` is well known for.  Despite being
incremental, the BlameGenerator can only be run once.  After the
generator runs it cannot be reused.  A better implementation would
support applications browsing through history efficiently.

With regard to CQ 5110, only a little of the original code survives.

CQ: 5110
Bug: 306161
Change-Id: I84b8ea4838bb7d25f4fcdd540547884704661b8f
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-05-31 14:09:30 -05:00
Ketan Padegaonkar 8b8ad75ada Fix a complicated multi level nested if block structure to use a single level with multiple returns.
Change-Id: I3f116f37045e83aba5c80d45b987ab075502dcc6
2011-05-31 09:15:28 -07:00
Shawn O. Pearce 690c268c79 Merge branch 'stable-1.0'
* stable-1.0:
  DHT: Support removing a repository name
  DHT: Fix thread-safety issue in AbstractWriteBuffer
  jgit.sh: Implement pager support
  Change EditList to extend ArrayList
  Ensure the HTTP request is fully consumed
  Make sure test repositories are closed
  Fix CloneCommand not to fetch into remote tracking branches when bare
  Update Eclipse IP log for 1.0

Change-Id: I6340d551482e1dda01f82496296d2038b07fa68b
2011-05-31 09:15:11 -07:00
Shawn O. Pearce 50f236aff8 DHT: Support removing a repository name
The first step to deleting a repository from the DHT storage is to
remove the name binding in the RepositoryIndexTable, making the
repository unavailable for lookup.

Change-Id: I469bf92f4bf2f555a15949569b21937c14cb142b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-05-31 08:58:45 -07:00
Shawn O. Pearce 042a66fe8c DHT: Fix thread-safety issue in AbstractWriteBuffer
There is a data corruption issue with the 'running' list if a
background thread schedules something onto the buffer while the
application thread is also using it.

Change-Id: I5ba78b98b6632965d677a9c8f209f0cf8320cc3d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-05-31 08:58:45 -07:00
Shawn O. Pearce 0a39fb2ab6 jgit.sh: Implement pager support
If the command is either `diff` or `log`, there are often many lines
of output. Run these commands through $GIT_PAGER, $PAGER, or
`less` in order to make it easier to browse the output on a terminal.
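
The lookup order can be expressed as a small function. The change itself lives in the jgit.sh shell script; the Java sketch below only mirrors the described order, and the class name as well as the choice to treat an empty variable as unset are assumptions.

```java
import java.util.Map;

// Hypothetical illustration of the pager lookup order described above.
class PagerChoice {
    // Only these commands are paged, per the commit message.
    static boolean usesPager(String command) {
        return "diff".equals(command) || "log".equals(command);
    }

    // GIT_PAGER wins over PAGER; fall back to less if neither is set.
    static String choose(Map<String, String> env) {
        for (String name : new String[] {"GIT_PAGER", "PAGER"}) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return "less";
    }
}
```

Calling `PagerChoice.choose(System.getenv())` would apply the same order to the real environment.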

Change-Id: I18b87ea4acf404b94788f2ac2101812bd13e6a0f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-05-31 08:58:45 -07:00
Shawn O. Pearce 8d1ac7a769 Change EditList to extend ArrayList
There is no reason for this type to contain an ArrayList and try to
hide the implementation. It only slows down execution by adding an
extra layer of method dispatch to each invocation.

Instead subclass from ArrayList.
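
In outline, the change replaces a wrapper that forwards every call to an inner list with a direct subclass. The sketch below uses a simplified, hypothetical Edit type with four region bounds; it is not the JGit source.

```java
import java.util.ArrayList;

// Simplified stand-in for a diff edit: a region in A replaced by one in B.
class Edit {
    final int beginA, endA, beginB, endB;

    Edit(int beginA, int endA, int beginB, int endB) {
        this.beginA = beginA;
        this.endA = endA;
        this.beginB = beginB;
        this.endB = endB;
    }
}

// Subclassing exposes add(), get(), size() and iteration directly,
// with no forwarding methods in between.
class EditList extends ArrayList<Edit> {
    private static final long serialVersionUID = 1L;

    EditList() {
        super(16);
    }
}
```

The trade-off is that callers now see the full ArrayList API, which is what the message means by no longer hiding the implementation.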

Change-Id: Ifbb9c7060c2fe3d5a7397c1aa85fbade14088637
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-05-31 08:58:45 -07:00
Shawn O. Pearce 67a1a0993f Ensure the HTTP request is fully consumed
Some servlet containers require the servlet to read the EOF marker
from the input stream before a response can be output if the stream
is using "Transfer-Encoding: chunked"... which is typical for any
sort of large push to a repository over smart HTTP.

Ensure the EOF is always read by the PackParser when it is handling
the stream, and fail fast if there is more data present than expected
since this does indicate a protocol error.

Also ensure the EOF is read by UploadPack before it starts to output
a partial response using packing progress meters.
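
The "read to EOF, fail fast on extra data" rule can be sketched as below. This is a simplification under an assumed known body length; a real pack parser detects the end of the stream from the pack's own structure, and the names here are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: consume the body, then insist on seeing EOF.
class RequestDrain {
    static byte[] readExactlyThenEof(InputStream in, int expected)
            throws IOException {
        byte[] buf = new byte[expected];
        int off = 0;
        while (off < expected) {
            int n = in.read(buf, off, expected - off);
            if (n < 0) {
                throw new IOException("unexpected end of stream");
            }
            off += n;
        }
        // Reading the EOF marker lets a chunked request finish cleanly;
        // any further byte is treated as a protocol error.
        if (in.read() != -1) {
            throw new IOException("unexpected data after expected end");
        }
        return buf;
    }
}
```

Only after this call returns would the servlet begin writing its response.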

Change-Id: I131db9dea20b2324cb7c3272a814f21296bc64bd
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2011-05-31 08:58:45 -07:00
Christian Halstrick c1525e2aa5 Make sure test repositories are closed
Some repositories created during tests are not added to the 'toClose'
list in LocalDiskRepositoryTestCase. Therefore when the tests end
we may have open FileHandles and on Windows this may cause the
tests to fail because we can't delete those files.

This is fixed by adding the possibility to explicitly add
repositories to the list of repos which are closed automatically.

Change-Id: I1261baeef4c7d9aaedd7c34b546393bfa005bbcc
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2011-05-31 08:58:45 -07:00
Christian Halstrick cc2197ed9c Fix CloneCommand not to fetch into remote tracking branches when bare
When cloning into a bare repository we should not create remote
tracking branches (e.g. refs/remotes/origin/testX). Branches of the
remote repository should be fetched into branches of the same
name (e.g. refs/heads/testX). Also add the noCheckout option which
would prevent checkout after fetch.
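
The difference comes down to the destination side of the fetch refspec. A minimal sketch, assuming a helper of our own naming rather than anything in CloneCommand:

```java
// Hypothetical helper that picks the fetch refspec for a clone.
class CloneRefSpec {
    static String fetchRefSpec(boolean bare, String remoteName) {
        // Bare: branches keep their names under refs/heads/.
        // Non-bare: branches land under refs/remotes/<remote>/.
        String dst = bare
                ? "refs/heads/"
                : "refs/remotes/" + remoteName + "/";
        return "+refs/heads/*:" + dst + "*";
    }
}
```

For a remote named origin this yields `+refs/heads/*:refs/heads/*` when bare and `+refs/heads/*:refs/remotes/origin/*` otherwise.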

Change-Id: I5d4cc0389f3f30c53aa0065f38119af2a1430909
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2011-05-31 08:58:45 -07:00
Shawn O. Pearce cc319fff0d Merge 'Fix usage of FileSnapshot in RefDirectory' into stable-1.0
* commit '475461d05266fe13b05bc2c6645b9ef928521b4c':
  Fix usage of FileSnapshot in RefDirectory

Change-Id: Ie65bd8b36f4c6a91602a94e9b54a04a4fb335897
2011-05-31 08:35:35 -07:00
Abhishek Bhatnagar b04be93344 CleanCommand: add the ability to do a dry run
Change-Id: I7b81a7e34a771951e2e7b789b080b2bfb8656e5c
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-05-31 09:15:07 -05:00
Shawn Pearce a00b951323 Merge "Fix GitConstructionTest teardown" 2011-05-31 10:01:56 -04:00
Christian Halstrick 475461d052 Fix usage of FileSnapshot in RefDirectory
RefDirectory was not using FileSnapshot correctly in all places. This
is fixed with this commit. Additionally the constructors for the
different types of refs have been changed to take a FileSnapshot
instead of a modification time.

Change-Id: Ifb6a59e87e8b058a398c38cdfb9d648f0bad4bf8
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
2011-05-31 00:22:40 +02:00
Robin Rosenberg 802f84650d Fix GitConstructionTest teardown
The teardown failed on Windows because the repos were not closed.

Change-Id: I16cf5645558680029682f898386b061796948237
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
2011-05-30 22:39:04 +02:00
Matthias Sohn d203485780 Update Eclipse IP log for 1.0
CQ "4876" is jgit's CQ for usage of protobuf, CQ "5135"
is the corresponding Orbit CQ.

Change-Id: I300cf2b5758c7da9c18494325f2f38bb3744e459
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2011-05-25 20:14:35 +02:00
Matthias Sohn 03a6f572b5 Merge "Use the stored password instead of prompting for it all the time" 2011-05-25 12:08:17 -04:00
Shawn O. Pearce b8c508e54d DHT: Add sequence RefData
RefData now uses a sequence number as part of the field, ensuring
that updates always increase the sequence number by one whenever
a reference is modified.

Attaching a sequence number to RefData will help with storing
reference log entries during updates. As the sequence number should
be unique within the reference name space, log entries can be keyed
by the sequence number and remain unique.  Making this work over
reference delete-create cycles will require an additional RefTable
API to return the oldest sequence number previously used in the
reference log to seed the recreated reference.
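
To illustrate why a per-reference sequence number gives unique log keys, here is a rough sketch. It is not the DHT RefData or RefTable API: the class, the "name#sequence" key format, and the in-memory maps are assumptions, and the delete-create seeding problem mentioned above is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of references carrying a sequence number.
class SequencedRefs {
    static final class RefData {
        final String objectId;
        final long sequence;

        RefData(String objectId, long sequence) {
            this.objectId = objectId;
            this.sequence = sequence;
        }
    }

    private final Map<String, RefData> refs = new HashMap<>();
    // Log entries keyed by "<ref name>#<sequence>" (assumed format).
    private final Map<String, String> reflog = new TreeMap<>();

    synchronized RefData update(String name, String newObjectId) {
        RefData old = refs.get(name);
        // Every modification increases the sequence by exactly one.
        long next = (old == null) ? 1 : old.sequence + 1;
        RefData current = new RefData(newObjectId, next);
        refs.put(name, current);
        String from = (old == null) ? "0" : old.objectId;
        reflog.put(name + "#" + next, from + " -> " + newObjectId);
        return current;
    }

    synchronized int logSize() {
        return reflog.size();
    }
}
```

Because each update to a name produces a new sequence value, no two log entries for that name share a key.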

Change-Id: I11cfff2a96ef962e57f29925a3eef41bdbf9f9bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
2011-05-25 09:08:33 -05:00