jgit/org.eclipse.jgit/.settings
Matthias Sohn 96d9e3eb19 Prevent infinite loop rescanning the pack list on PackMismatchException
While analysing an incident in which Gerrit's gc runner thread got
stuck, we found that ObjectDirectory#openPackedObject can end up in an
infinite loop: whenever it catches a PackMismatchException it rescans
the pack list and starts over trying to open the packed object, with no
bound on the number of retries.

Here is the relevant part of a thread dump we created while the gc
runner was stuck:

"WorkQueue-2[java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@350812a3[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@5425d7ee]]" #72 tid=0x00007f73cee1c800 nid=0x584 runnable  [0x00007f7392d57000]
   java.lang.Thread.State: RUNNABLE
	at org.eclipse.jgit.internal.storage.file.WindowCache.removeAll(WindowCache.java:716)
	at org.eclipse.jgit.internal.storage.file.WindowCache.purge(WindowCache.java:399)
	at org.eclipse.jgit.internal.storage.file.PackFile.close(PackFile.java:296)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.reuseMap(ObjectDirectory.java:973)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.scanPacksImpl(ObjectDirectory.java:904)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.scanPacks(ObjectDirectory.java:895)
	- locked <0x000000050a498f60> (a java.util.concurrent.atomic.AtomicReference)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.searchPacksAgain(ObjectDirectory.java:794)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedObject(ObjectDirectory.java:465)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedFromSelfOrAlternate(ObjectDirectory.java:417)
	at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject(ObjectDirectory.java:408)
	at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:132)
	at org.eclipse.jgit.lib.ObjectReader$1.open(ObjectReader.java:279)
	at org.eclipse.jgit.revwalk.RevWalk$2.next(RevWalk.java:1031)
	at org.eclipse.jgit.internal.storage.pack.PackWriter.findObjectsToPack(PackWriter.java:1911)
	at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:960)
	at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:876)
	at org.eclipse.jgit.internal.storage.file.GC.writePack(GC.java:1168)
	at org.eclipse.jgit.internal.storage.file.GC.repack(GC.java:852)
	at org.eclipse.jgit.internal.storage.file.GC.doGc(GC.java:269)
	at org.eclipse.jgit.internal.storage.file.GC.gc(GC.java:220)
	at org.eclipse.jgit.api.GarbageCollectCommand.call(GarbageCollectCommand.java:179)
	at com.google.gerrit.server.git.GarbageCollection.run(GarbageCollection.java:112)
	at com.google.gerrit.server.git.GarbageCollection.run(GarbageCollection.java:75)
	at com.google.gerrit.server.git.GarbageCollection.run(GarbageCollection.java:71)
	at com.google.gerrit.server.git.GarbageCollectionRunner.run(GarbageCollectionRunner.java:76)
	at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:103)
	at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.18/Executors.java:515)
	at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.18/FutureTask.java:305)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.18/ScheduledThreadPoolExecutor.java:305)
	at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:612)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.18/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.18/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.18/Thread.java:829)

The code in ObjectDirectory#openPackedObject [1] apparently assumes that
a PackMismatchException is caused by a transient problem which it can
recover from by retrying. We use `core.trustFolderStat = false` on this
server since it uses NFS. The incident showed that we can enter an
infinite loop here if there is a permanent mismatch between a pack file
and its corresponding pack index. I am not yet sure how this can happen.

Break the infinite loop by limiting the number of attempts to rescan
the pack list to 5. When this threshold is exceeded, mark the
PackMismatchException as permanent and rethrow it, which ends the loop.

Also apply the same limit in #getPackedObjectSize
and #selectObjectRepresentation where we use similar retry loops.
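
The bounded retry described above could look roughly like the following
sketch. It is not the actual JGit change: MismatchException, PackReader,
BoundedRescan and MAX_RESCAN_ATTEMPTS are hypothetical names introduced
only for illustration, and the real ObjectDirectory code is structured
differently.

```java
import java.io.IOException;

/** Hypothetical stand-in for PackMismatchException; not the JGit class. */
class MismatchException extends IOException {
    private boolean permanent;

    MismatchException(String message) {
        super(message);
    }

    boolean isPermanent() {
        return permanent;
    }

    void setPermanent(boolean permanent) {
        this.permanent = permanent;
    }
}

/** Hypothetical read operation that may hit a pack/index mismatch. */
interface PackReader<T> {
    T read() throws MismatchException;
}

class BoundedRescan {
    /** Assumed limit, matching the 5 retries named in the commit message. */
    static final int MAX_RESCAN_ATTEMPTS = 5;

    /**
     * Tries to read; on a mismatch rescans and retries, but at most
     * MAX_RESCAN_ATTEMPTS times. After that the exception is marked
     * permanent and rethrown instead of looping forever.
     */
    static <T> T openWithBoundedRescan(PackReader<T> reader, Runnable rescan)
            throws MismatchException {
        int retries = 0;
        while (true) {
            try {
                return reader.read();
            } catch (MismatchException e) {
                if (e.isPermanent() || retries >= MAX_RESCAN_ATTEMPTS) {
                    // Give up: flag the mismatch as permanent and propagate it.
                    e.setPermanent(true);
                    throw e;
                }
                retries++;
                rescan.run(); // e.g. rescan the pack list, then start over
            }
        }
    }
}
```

With a reader that always fails, this sketch performs the initial read
plus one read after each of the 5 rescans, and then surfaces the
exception with the permanent flag set.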

[1] 011c26ff36/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/ObjectDirectory.java (465)

Change-Id: I20fb63bcc1fdc3a03d39b963f06a90e6f0ba73dc
2023-04-19 16:29:44 +02:00
..
.api_filters Prevent infinite loop rescanning the pack list on PackMismatchException 2023-04-19 16:29:44 +02:00
org.eclipse.core.resources.prefs Initial JGit contribution to eclipse.org 2009-09-29 16:47:03 -07:00
org.eclipse.core.runtime.prefs Fix line endings 2010-06-18 23:36:18 +02:00
org.eclipse.jdt.core.prefs Enable and fix "Statement unnecessarily nested within else clause" warnings 2019-10-17 10:20:14 +09:00
org.eclipse.jdt.ui.prefs Partially revert c0ad77d8 "Enhance Eclipse save actions" 2017-08-30 03:07:18 +02:00
org.eclipse.mylyn.tasks.ui.prefs Use commit message best practices for Mylyn Commit template 2011-09-05 23:57:21 +02:00
org.eclipse.mylyn.team.ui.prefs Fix Mylyn commit message template 2018-09-23 04:11:58 -04:00
org.eclipse.pde.api.tools.prefs Ignore warning for minor version change without API change 2017-11-24 01:12:14 +01:00
org.eclipse.pde.core.prefs Adding PDE API Tools nature to JGit 2010-01-16 10:00:30 -06:00