4916 Commits

Author SHA1 Message Date
Jonathan Nieder 96941550de StreamCopyThread: flush cannot interrupt a write
Because flush calls interrupt with writeLock held, it cannot interrupt
a write.  Simplify by no longer defending against that.

Change-Id: Ib0b39b425335ff7b0ea1b1733562da5392576a15
2016-11-13 13:35:16 -08:00
Jonathan Nieder 97f3baa0d3 StreamCopyThread: Remove unnecessary flushCount
StreamCopyThread#run consistently interrupts itself whenever it
discovers it has been interrupted by StreamCopyThread#flush while not
reading.  The flushCount is not needed to avoid lost flushes.

No in-tree user of StreamCopyThread ever flushes.  As a nice side
benefit, this avoids the expense of atomic operations that have no
purpose for those users.

Change-Id: I1afe415cd09a67f1891c3baf712a9003ad553062
2016-11-13 13:32:08 -08:00
Shawn Pearce 6aa126ec42 Merge "Switch JSchSession to simple isolated OutputStream" 2016-11-13 16:13:04 -05:00
Hugo Arès dea47b9363 Get rid of SoftReference in RepositoryCache
Now that RepositoryCache has a time-based eviction strategy, get rid
of the strategy of evicting cache entries when heap memory is running
low, i.e. soft references. The main reason time-based eviction was
implemented was to offer an alternative to the unpredictable soft
references.

Relying on soft references does not work well, especially with a large
heap. The JVM GC collects soft references only as a last resort before
throwing an out-of-memory error. For example, for an application like
Gerrit configured with a 128GB heap, the GC waits until all 128GB is
filled before collecting the soft references, so by then the
application has already been suffering long GC pauses for a long time.
In other words, you will have to restart the application because it is
unusable before JVM eviction kicks in.

Keeping the SoftReference in RepositoryCache causes more harm than
good. If you use time-based eviction (which is the default strategy)
and tune the JVM to release soft references more aggressively, it
will release repositories from the cache even though they have not
expired, which defeats the purpose of the repository cache.

Gerrit uses the Lucene library, which uses soft references, and this
causes a "memory leak" unless you configure the JVM to release soft
references more aggressively, which has the nasty side effect of
evicting non-expired repositories from the cache.

Change-Id: I9940bd800464c7f007696d0ccde52ea617b2ebce
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
2016-11-13 16:03:02 -04:00
Shawn Pearce 659cd813a9 Switch JSchSession to simple isolated OutputStream
Work around issues with JSch not handling interrupts by
isolating the JSch interactions onto another thread.

Run write and flush on a single threaded Executor using
simple Callable operations wrapping the method calls,
waiting on the future to determine the outcome before
allowing the caller to continue.

If any operation was interrupted the state of the stream
becomes fuzzy at close time. The implementation tries to
interrupt the pending write or flush, but this is very
likely to corrupt the stream object, so exceptions are
ignored during such a dirty close.

Change-Id: I42e3ba3d8c35a2e40aad340580037ebefbb99b53
2016-11-13 11:02:29 -08:00
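The isolation approach described in the commit above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the JGit implementation: the class name IsolatedOutputStreamSketch and the helper await are hypothetical, and the "dirty close" handling from the commit message (ignoring exceptions after an interrupted operation) is left out for brevity.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: every call on the wrapped stream runs on one
// dedicated thread, so interrupting the caller never interrupts the
// thread that is inside dst.write() or dst.flush().
class IsolatedOutputStreamSketch extends OutputStream {
	private final OutputStream dst;
	private final ExecutorService executor = Executors.newSingleThreadExecutor();

	IsolatedOutputStreamSketch(OutputStream dst) {
		this.dst = dst;
	}

	@Override
	public void write(int b) throws IOException {
		write(new byte[] { (byte) b }, 0, 1);
	}

	@Override
	public void write(byte[] buf, int off, int len) throws IOException {
		await(() -> {
			dst.write(buf, off, len);
			return null;
		});
	}

	@Override
	public void flush() throws IOException {
		await(() -> {
			dst.flush();
			return null;
		});
	}

	@Override
	public void close() throws IOException {
		try {
			await(() -> {
				dst.close();
				return null;
			});
		} finally {
			executor.shutdownNow();
		}
	}

	// Submit the operation and block until it finishes, translating
	// failures back into IOException for the caller.
	private void await(Callable<Void> task) throws IOException {
		Future<Void> future = executor.submit(task);
		try {
			future.get();
		} catch (InterruptedException e) {
			future.cancel(true);
			throw new InterruptedIOException();
		} catch (ExecutionException e) {
			Throwable cause = e.getCause();
			if (cause instanceof IOException)
				throw (IOException) cause;
			throw new IOException(cause);
		}
	}
}
```

Because the caller waits on the future before returning, the buffer passed to write() does not need to be copied in this sketch.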
Shawn Pearce 92eab1867d WalkEncryption: Cleanup Java 8 support
Java 8 is now the minimum for JGit, so Java 7-only
code paths are no longer necessary.

Change-Id: I0151625fed4d0da95321ebed5cca648b8c29d5f1
2016-11-13 12:17:20 -04:00
Philipp Marx df6f2d6860 Reduce synchronized scope around ConcurrentHashMap
Change-Id: I982a78070efb6bc2d3395330456d62e0d5ce6da7
Signed-off-by: Philipp Marx <smigfu@googlemail.com>
2016-11-12 11:11:19 +01:00
Philipp Marx 8adbfe4da6 Check that DfsBlockCache#blockSize is a power of 2
If a value that isn’t a power of 2 is used, there is a high
chance of java.lang.ArrayIndexOutOfBoundsException and
org.eclipse.jgit.errors.CorruptObjectException due to a mismatching
assumption for the DfsBlockCache#blockSizeShift parameter.

Change-Id: Ib348b3704edf10b5f93a3ffab4fa6f09cbbae231
Signed-off-by: Philipp Marx <smigfu@googlemail.com>
2016-11-11 10:43:09 +01:00
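A power-of-2 check like the one the commit above describes is commonly written with a bit trick. The sketch below is an assumption about the general technique, not the code from the change; the class BlockSizeCheck and the method blockSizeShift are hypothetical names.

```java
// Hypothetical sketch of validating a block size before deriving a shift.
class BlockSizeCheck {
	// Returns log2(blockSize), or throws if blockSize is not a
	// positive power of 2.
	static int blockSizeShift(int blockSize) {
		// A positive power of 2 has exactly one bit set, so clearing
		// the lowest set bit with (x & (x - 1)) must leave zero.
		if (blockSize <= 0 || (blockSize & (blockSize - 1)) != 0)
			throw new IllegalArgumentException(
					"blockSize must be a power of 2: " + blockSize);
		return Integer.numberOfTrailingZeros(blockSize);
	}
}
```

Only when the size is a power of 2 do `offset >>> shift` and `offset / blockSize` agree, which is presumably why a mismatched value leads to out-of-range indexes.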
Matthias Sohn f8ac03459a Fix loop in auto gc
* GC.tooManyLooseObjects() always returned true because the loop failed
to advance the iterator, so the counter kept incrementing until the
threshold was exceeded.
* Also fix the loop exit criterion, which was off by 1.
* Add some tests.

Change-Id: I70976dfaa026efbcf3c46bd45941f37277a18e04
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-11-07 22:31:10 +01:00
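The class of bug fixed above (a counting loop that never advances its iterator) can be illustrated with a small sketch. The names below are hypothetical, and the choice of `>` as the exit comparison is an assumption made for the example, not a statement about what GC.tooManyLooseObjects() does.

```java
import java.util.Iterator;

// Hypothetical sketch of counting entries against a threshold.
class LooseObjectCountSketch {
	// Returns true if more than 'threshold' entries are present.
	static boolean tooMany(Iterable<String> entries, int threshold) {
		int n = 0;
		Iterator<String> it = entries.iterator();
		while (it.hasNext()) {
			// Advance the iterator. Without this call hasNext() stays
			// true for any non-empty input, so n grows until the
			// threshold is exceeded and the method always says true.
			it.next();
			if (++n > threshold)
				return true;
		}
		return false;
	}
}
```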
David Pursehouse 23135e3280 Update buck to latest version
Update to the same version used on Gerrit's master branch.

Change-Id: I20e4edd099a095c42f23df8cc57241efad2de2ce
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-11-06 23:51:46 +01:00
Jonathan Nieder 881e6b2cbb StreamCopyThread: Do not drop data when flush is observed before writing
StreamCopyThread.flush was introduced in
61645b938bc934fda3b0624c5bac1e3495634750 (Add timeouts to smart
transport clients, 2009-06-19) to support timeouts on write in JSch.
The commit message from that change explains:

   JSch made a timeout on write difficult because they explicitly do
   a catch for InterruptedException inside of their OutputStream.  We
   have to work around that by creating an additional thread that just
   shuttles data between our own OutputStream and the real JSch stream.

The code that runs on that thread is structured as follows:

	while (!done) {
		int n = src.read(buf);
		dst.write(buf, 0, n);
	}

with src being a PipedInputStream representing the data to be written
to JSch.  To add flush support, that change wanted to add an extra step

		if (wantFlush)
			dst.flush();

but to handle the case where the thread is blocked in the read() call
waiting for new input, it needs to interrupt the read. So that is how
it works: the caller runs

	pipeOut.write(some data);
	pipeOut.flush();
	copyThread.flush();

to write some data and force it to flush by interrupting the read.

After the pipeOut.flush(), the StreamCopyThread reads the data that was
written and prepares to copy it out.  If the copyThread.flush() call
interrupts the copyThread before it acquires writeLock and starts
writing, we throw away the data we just read to fulfill the flush.
Oops.

Noticed during the review of e67d59df3f
(StreamCopyThread: Do not let flush interrupt a write, 2016-11-04),
which introduced this bug.

Change-Id: I4aceb5610e1bfb251046097adf46bca54bc1d998
2016-11-04 19:33:02 -04:00
Jonathan Nieder e67d59df3f StreamCopyThread: Do not let flush interrupt a write
flush calls interrupt() to interrupt a pending read and trigger a
flush.  Unfortunately that interrupt() call can also interrupt a
pending write, putting JSch in a bad state and triggering "Short read
of block" errors.  Add locking to ensure the flush only interrupts
reads as intended.

Change-Id: Ib105d9e107ae43549ced7e6da29c22ee41cde9d8
2016-11-04 13:00:08 -07:00
Zhen Chen feefcb02b0 Fix flush call race condition in StreamCopyThread
If a new flush() call arrives while the previous bytes are being
flushed, we need to catch it so that the new bytes written between the
two flush() calls are processed, instead of falling through to the last
catch IOException clause and ending the thread.

Change-Id: Ibc58a1fa97559238c13590aedbb85e482d85e465
Signed-off-by: Zhen Chen <czhen@google.com>
2016-10-31 14:31:48 -07:00
Thomas Wolf d0023c3c8f Don't serialize internal hash collision chain link
ObjectId is serializable, and so are its subtypes. Ensure that
serialization does not follow the hash collision chain internal to the
ObjectIdOwnerMap, otherwise completely unrelated objects may get
serialized when a RevObject is serialized.

Note that serializing a RevCommit or RevTag may serialize quite a few
objects due to the parent/object links they contain. A user has no real
control over how many objects will be written when a RevCommit is
serialized. Cf. [1]. This change does not resolve that, but in any case
this internal hash collision chain link should not participate in
serialization.

[1] https://github.com/gitblit/gitblit/pull/1141

Change-Id: Ice331a9dc80a59ca360fcc04adaff8b5e750d847
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
2016-10-29 11:39:36 +02:00
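One common way to keep an internal link out of Java serialization is to mark the field transient. The sketch below assumes that approach purely for illustration; the commit message does not say how the change was implemented, and ChainedIdSketch with its fields is a hypothetical stand-in for the real classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch: an id object that also carries a hash-chain link.
class ChainedIdSketch implements Serializable {
	private static final long serialVersionUID = 1L;

	final String id;

	// Link to the next entry in the same hash bucket. Marked transient
	// so that serializing one object does not drag along unrelated
	// objects that merely share a bucket.
	transient ChainedIdSketch next;

	ChainedIdSketch(String id) {
		this.id = id;
	}

	// Serialize to memory and read the copy back.
	static ChainedIdSketch roundTrip(ChainedIdSketch in)
			throws IOException, ClassNotFoundException {
		ByteArrayOutputStream bytes = new ByteArrayOutputStream();
		try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
			out.writeObject(in);
		}
		try (ObjectInputStream oin = new ObjectInputStream(
				new ByteArrayInputStream(bytes.toByteArray()))) {
			return (ChainedIdSketch) oin.readObject();
		}
	}
}
```

After a round trip the copy keeps its id, while its next field is null because transient fields are skipped on write and reset to their default on read.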
Matthias Sohn 83555e7e30 Use AtomicObjectOutputStream in CleanFilter
Enhance and use AtomicObjectOutputStream to write temporary files in
CleanFilter.

Change-Id: I28987dad18255a9067344f94b4e836cbd183e4b1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-26 23:19:49 +02:00
Matthias Sohn 0e947da72f CleanFilter: use atomic move to move temporary file to media file
Change-Id: I227a0ed6e4e15ac3d96f96a6cefcaf55680ad8bb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-26 23:15:08 +02:00
Matthias Sohn 999106bb84 Fix temporary file leak in CleanFilter
The CleanFilter leaked temporary files when a media file already existed
before running clean filter.

Change-Id: Ie20fce3f40d34095ce58e596d25d8d64fe0cde99
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-26 23:10:03 +02:00
Matthias Sohn 4b7747ccff Use AnyLongObjectId instead of LongObjectId in LFS API
Change-Id: I083ad1ea3e8d3685df7c306854c2498c92b05ffb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-26 23:03:39 +02:00
Matthias Sohn 6dea5ec823 Speedup CleanFilter by transferring data in chunks of 8k
Transferring data byte by byte is slow: running add with CleanFilter on
a 2.9MB file takes 20 seconds. Using a buffer of 8k shrinks this time to
70ms.

Change-Id: I3bc2d8c11fe6cfaffcc99dc2a00643e01ac4e9cc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-26 22:54:48 +02:00
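The buffered transfer described above follows a familiar pattern. This is a generic sketch of chunked copying with an 8k buffer, not the CleanFilter code itself; ChunkedCopySketch and copy are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of copying a stream in 8k chunks.
class ChunkedCopySketch {
	// Copies everything from in to out and returns the byte count.
	static long copy(InputStream in, OutputStream out) throws IOException {
		byte[] buf = new byte[8192];
		long total = 0;
		int n;
		// read(buf) may return fewer bytes than the buffer holds, so
		// write only the n bytes that were actually read.
		while ((n = in.read(buf)) != -1) {
			out.write(buf, 0, n);
			total += n;
		}
		return total;
	}
}
```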
Matthias Sohn d1bc809cce Add missing @since tag for new protected field in ObjectReader
Change-Id: I93d67d7fd2fde55be39480944d9d7072dbb6c600
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-24 16:05:24 +02:00
Kevin Corcoran fa0a93119c Make streamFileThreshold configurable
Previously, the streamFileThreshold, the threshold at which a file
would be streamed rather than loaded entirely into memory, was only
configurable on a global basis.

This commit makes this threshold configurable on a per-loader basis.

Bug: 490404
Change-Id: I492c18c3155dbf56eedda9044a61d76120fd75f9
Signed-off-by: Kevin Corcoran <kevin.corcoran@puppetlabs.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-24 14:00:02 +09:00
David Pursehouse 88f433be84 Merge "Preserve backslashes within double quotes in CLIGitCommand::split()" 2016-10-23 19:18:38 -04:00
Rüdiger Herrmann a5eccf4a4d Preserve backslashes within double quotes in CLIGitCommand::split()
Change-Id: Ia6a56512baa6a0f27e2eef1b19ebb60291ba377f
Signed-off-by: Rüdiger Herrmann <ruediger.herrmann@gmx.de>
2016-10-23 15:29:17 +09:00
Christian Halstrick f30c05fc74 Move constants used for config-files to ConfigConstants
Change-Id: I7d8db4bfa1a851afd599bb8eaa8f8273204d2e1d
2016-10-23 01:39:32 +02:00
Matthias Sohn 64a404803e Implement auto gc
With the auto option, gc checks whether any housekeeping is required; if
not, it exits without performing any work. Some JGit commands run gc
--auto after performing operations that could create many loose objects.
Housekeeping is required if there are too many loose objects or too many
packs in the repository.

If the number of loose objects exceeds the value of the gc.auto option,
jgit's GC consolidates all existing packs into a single pack (equivalent
to the -A option), whereas git-core would combine all loose objects into
a single pack using repack -d -l. Setting the value of gc.auto to 0
disables automatic packing of loose objects.

If the number of packs exceeds the value of gc.autoPackLimit, then
existing packs (except those marked with a .keep file) are consolidated
into a single pack by using the -A option of repack. Setting
gc.autoPackLimit to 0 disables automatic consolidation of packs.

Like git, the following jgit commands run auto gc:
- fetch
- merge
- rebase
- receive-pack

The auto gc for receive-pack can be suppressed by setting the config
option receive.autogc = false

Change-Id: I68a2a051b39ec2c53cb7c4b8f6c596ba65eeba5d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-23 01:34:31 +02:00
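The housekeeping decision described in the commit above can be summarized in a few lines. This sketch only restates the rules from the message (a threshold is "exceeded" when the count is strictly greater, and 0 disables a check); the class and method names are hypothetical, and treating negative values as disabled is an assumption.

```java
// Hypothetical sketch of the "is housekeeping required?" decision.
class AutoGcPolicySketch {
	// gcAuto and gcAutoPackLimit stand for the values of the gc.auto
	// and gc.autoPackLimit options.
	static boolean needsGc(int looseObjects, int packs,
			int gcAuto, int gcAutoPackLimit) {
		// 0 disables the loose object check (negative values are
		// treated the same way here, which is an assumption).
		boolean tooManyLoose = gcAuto > 0 && looseObjects > gcAuto;
		// 0 likewise disables the pack count check.
		boolean tooManyPacks = gcAutoPackLimit > 0 && packs > gcAutoPackLimit;
		return tooManyLoose || tooManyPacks;
	}
}
```

For example, with the illustrative thresholds 6700 and 50, needsGc(7000, 1, 6700, 50) is true, while needsGc(7000, 1, 0, 50) is false because the loose object check is disabled.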
David Pursehouse 03046d0f60 CheckoutCommand: Add method to add multiple paths
The new method addPaths(List<String>) allows callers to add multiple
paths without having to iterate over several calls to addPath(String).

Change-Id: I2c3746a97ead7118fb0ed5543a2c843224719031
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-22 10:11:54 +09:00
Ned Twigg e49025386e Checkout: Add the ability to checkout all paths.
Change-Id: Ie1e59c566b63d0dfac231e44e7ebd7f3f08f3e9f
Signed-off-by: Ned Twigg <ned.twigg@diffplug.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-22 10:08:07 +09:00
Marc Strapetz c6459a6167 Fix possible SIOOBE in RefDirectory.parsePackedRefs
This StringIndexOutOfBoundsException (SIOOBE) happens reproducibly when
trying to access a repository containing Cygwin symlinks.

Change-Id: I25f103fcc723bac7bfaaeee333a86f11627a92c7
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-21 18:13:04 +09:00
Thomas Meyer 4ab06388ad TransportBundleFile: Resolve remote repository locally
Remove the assumption that the local repository is a file based one.

Change-Id: I8f10fe7a54e9fc07f2a23d7901e52b65aa570d45
Signed-off-by: Thomas Meyer <thomas.mey@web.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-21 00:24:52 +02:00
David Turner e346873511 TreeFormatter: disallow empty filenames in trees
Git barfs on these (and they don't make any sense), so we certainly
shouldn't write them.

Change-Id: I3faf8554a05f0fd147be2e63fbe55987d3f88099
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 22:24:24 +09:00
Philipp Marx ccc899773e Add "concurrencyLevel" option to DfsBlockCache
Allow for higher concurrency on DfsBlockCache by adding a configuration
option for the estimated number of concurrent requests.

Change-Id: Ia65e58ecb2c459b6d9c9697a2f715d933270f7e6
Signed-off-by: Philipp Marx <smigfu@googlemail.com>
2016-10-19 21:45:30 +09:00
David Pursehouse a5dde985a0 DiffCommandTest: Don't call toString on String instances
Change-Id: Ib308b3498593d595b3d8741a9b2d241bbc7441c3
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 15:09:37 +09:00
David Pursehouse 9a7d28019a FileNameMatcherTest: Use Character.valueOf rather than new Character
Change-Id: I9d6e20a258d34ae1d2700fbe8e6c6e3b0ba94424
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 15:09:37 +09:00
David Pursehouse 8f9e157cd5 ArchiveTest: Don't use string concatenation in loop
According to FindBugs:

  In each iteration, the String is converted to a StringBuffer/
  StringBuilder, appended to, and converted back to a String. This
  can lead to a cost quadratic in the number of iterations, as the
  growing string is recopied in each iteration.

Replace string concatenation with StringBuffer.

Change-Id: I60e09f274bed6722f4e0e4d096b0f2b1b31ec1b4
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 15:09:31 +09:00
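The FindBugs advice quoted above amounts to appending into one growable buffer instead of rebuilding the String on every iteration. The sketch below uses StringBuilder, the unsynchronized sibling of the StringBuffer named in the commit; the class and method are hypothetical and unrelated to ArchiveTest.

```java
// Hypothetical sketch of building a string inside a loop.
class LoopConcatSketch {
	static String joinLines(int count) {
		// One buffer for the whole loop: each append adds to the end
		// rather than copying the entire string built so far.
		StringBuilder sb = new StringBuilder();
		for (int i = 0; i < count; i++)
			sb.append("line ").append(i).append('\n');
		return sb.toString();
	}
}
```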
David Pursehouse 9ed2d949bb CLIRepositoryTestCase: Remove unused 'trash' member
Change-Id: I813f3de5f059e6e5cd34af20fce1e117bfe55b55
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 12:51:14 +09:00
David Pursehouse a963273d85 Merge branch 'stable-4.5'
* stable-4.5:
  Config: do not add spaces before units

Change-Id: I54185f54e6d78d7aac873ee5f990f09582318857
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 11:45:11 +09:00
David Turner a66b4c29a8 Config: do not add spaces before units
Adding a space before the unit ('g', 'm', 'k') causes git to fail with
the error:

  fatal: bad numeric config value

Change-Id: I57f11d3a1cdcca4549858e773af1a2a80fc0369f
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-19 10:58:52 +09:00
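A formatter that writes numeric values with a unit suffix and no separating space might look like the sketch below. It is an illustration of the output format only, under the assumption that exact multiples are abbreviated; UnitFormatSketch and format are hypothetical names, not JGit's Config API.

```java
// Hypothetical sketch of writing a size with a 'k', 'm' or 'g' suffix.
class UnitFormatSketch {
	private static final long KIB = 1024L;
	private static final long MIB = 1024L * KIB;
	private static final long GIB = 1024L * MIB;

	static String format(long value) {
		// Use a unit only for exact positive multiples, and put the
		// suffix directly after the digits: "1m", never "1 m".
		if (value >= GIB && value % GIB == 0)
			return (value / GIB) + "g";
		if (value >= MIB && value % MIB == 0)
			return (value / MIB) + "m";
		if (value >= KIB && value % KIB == 0)
			return (value / KIB) + "k";
		return Long.toString(value);
	}
}
```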
David Pursehouse c0433f4fb7 Use valueOf rather than constructor for Integer and Boolean
Change-Id: I1c65b2e40ba6ec5860903b11b4631e014f3dc5ce
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 14:15:53 +09:00
David Pursehouse bdf3e43d76 FS: Fix lazy initialization of non-volatile static field
The 'factory' field is lazy initialized in the detect() method.

According to FindBugs:

   Because the compiler or processor may reorder instructions, threads
   are not guaranteed to see a completely initialized object, if the
   method can be called by multiple threads.

Fix this by declaring the member as 'volatile'.

Change-Id: Ib32663bb28c9564584256e01f625b4e7875e6223
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 14:15:53 +09:00
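The fix described above, declaring a lazily initialized static field volatile, can be sketched as follows. LazyFactorySketch is a hypothetical stand-in for the real class; only the pattern is the point.

```java
// Hypothetical sketch of lazy initialization through a volatile field.
class LazyFactorySketch {
	// volatile ensures that a thread which sees a non-null reference
	// also sees the fully constructed object behind it.
	private static volatile LazyFactorySketch factory;

	static LazyFactorySketch detect() {
		LazyFactorySketch f = factory; // single volatile read
		if (f == null) {
			f = new LazyFactorySketch();
			factory = f;
		}
		return f;
	}
}
```

Without further locking, two threads racing through detect() may each create an instance in this sketch; volatile only guarantees safe publication of whichever instance is stored.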
David Pursehouse e9107e853f PackOutputStream: Add comment for intentional use of non-short-circuit logic
To keep people from trying to "fix" it.

Change-Id: Ib4b35e357e4c068a17243ebd2d57b058c54d5834
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 14:15:53 +09:00
David Pursehouse a3c0a7f9c4 Git{Add|Clone}Task: Catch specific exceptions rather than Exception
Change-Id: If3db5a1375485e97f9811546e310e441475db1a6
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 14:15:53 +09:00
David Pursehouse 4e3c5e1f13 Main: Add missing $NON-NLS tag
Change-Id: I030910b88a8f60ca174e38f0a213959f9b0a776f
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 13:18:29 +09:00
David Pursehouse 7e542cbe19 Status: remove unused lineFormat member
Change-Id: I3c4d83583edb1a6e1fbee1ea496dcf93302831b3
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 01:06:30 +02:00
David Pursehouse 08649a9fd0 LfsStore: Don't invoke toString on String variable
Change-Id: I15d234e5d907d0bbb22a95cf781e915798bead30
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 01:06:29 +02:00
David Pursehouse e0d1cfb5ad Upgrade buck to 7b7817c48f30687781040b2b82ac9218d5c4eaa4
Upgrade to match the version used on Gerrit's master branch.

Requires a couple of modifications to make the tests work:

- Remove source_under_test parameters from java_test calls.

- Add vm_args with explicit setting of tmpdir location for http
  tests. This is needed due to upstream changes in temporary
  directory handling [1].

[1] https://github.com/facebook/buck/issues/946

Change-Id: I5d5dd5edc335d44b118e8587f69ba89b83fc7fbb
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
2016-10-18 01:06:28 +02:00
Christian Halstrick fb4abc7f86 Merge "Fix JGit CLI to follow native git's interpretation of http_proxy..." 2016-10-17 02:41:19 -04:00
Christian Halstrick 293e8beacc Fix JGit CLI to follow native git's interpretation of http_proxy...
Native git (like many other tools) interprets the environment variables
http_proxy, HTTP_PROXY, ... in a specific way. "http_proxy" has to be
lowercase, while "https_proxy" can be lowercase or uppercase (that is,
"HTTPS_PROXY"). Lowercase takes precedence. This can be looked up in
the "ENVIRONMENT" section of [1]. Teach JGit CLI to behave similarly.

Additionally teach JGit not to interpret the environment variables if
the java process was explicitly started with the system properties
telling JVM which proxy to use. A call like "http_proxy=proxy1 java
-Dhttp.proxyHost=proxy2 ..." should use proxy2 as proxy.

[1] https://curl.haxx.se/docs/manpage.html

Change-Id: I2ad78f209792bf8f1285cf2f8ada8ae0c28f8e5a
2016-10-14 15:14:21 +02:00
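The lookup order described in the commit above can be sketched as a small function. This is a simplified illustration, not the JGit CLI code: ProxyEnvSketch and resolve are hypothetical, the environment and system properties are passed in as maps so the logic can be tried in isolation, and the method returns the raw configured string without parsing it.

```java
import java.util.Map;

// Hypothetical sketch of choosing a proxy setting for a scheme.
class ProxyEnvSketch {
	// scheme is "http" or "https". Returns the configured value, or
	// null if nothing applies.
	static String resolve(String scheme, Map<String, String> env,
			Map<String, String> sysProps) {
		// Explicit JVM settings such as -Dhttp.proxyHost=... win, and
		// the environment is then not consulted at all.
		String explicit = sysProps.get(scheme + ".proxyHost");
		if (explicit != null)
			return explicit;
		// The lowercase variable takes precedence.
		String value = env.get(scheme + "_proxy");
		// Only https also accepts the uppercase spelling; an uppercase
		// HTTP_PROXY is deliberately ignored.
		if (value == null && "https".equals(scheme))
			value = env.get("HTTPS_PROXY");
		return value;
	}
}
```

With http_proxy=proxy1 in the environment and http.proxyHost=proxy2 among the system properties, resolve("http", env, props) returns "proxy2", matching the example in the commit message.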
Matthias Sohn ba7ba7a816 Merge branch 'stable-4.5'
* stable-4.5:
  Unconditionally close repositories in RepositoryCache.clear()
  Fix eviction of repositories with negative usage count

Adapt to parameter removed from
RepositoryCache.unregisterAndCloseRepository().

Change-Id: I7087667056ced401a3b3a027977f2715cd77a1c5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-14 00:24:00 +02:00
Matthias Sohn 535f0afd13 Unconditionally close repositories in RepositoryCache.clear()
Earlier we tried to close the repository before removing it from the
cache, so close only reduced the refcount but didn't actually close it.

Now that we no longer leak usage count on purpose and the usage count is
ignored anyway, there is no longer a need to run the removal twice.

Change-Id: I8b62cec6d8a3e88c096d1f37a1f7f5a5066c90a0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-13 23:39:12 +02:00
Matthias Sohn b8e3e194e3 HttpClientConnection: Register connection socket factory for http
It is necessary to register a socket connection factory to prevent the
"http protocol is not supported" error when connecting over a proxy.

Change-Id: Iedf554acef841f52c1f2e3401ef0a0583ac5253b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-10-13 12:29:45 +02:00