Commit Graph

6430 Commits

Author SHA1 Message Date
Matthias Sohn 68c08265aa Add missing @since tags for new API methods
This was missed in d3b40e72ac.

Change-Id: I6e90157c6be34ae6618e246b02cf77631c8e9732
2023-07-25 23:22:46 +02:00
Matthias Sohn 8879eec460 Add missing package import needed to use MurmurHash3
This was missed in 49beb5ae51 and broke the OSGi classpath.

Change-Id: I08a307e9e3aade4ed8a5b5e2cc5e5d03c57dfa56
2023-07-25 22:06:27 +02:00
Jonathan Tan c77fb93478 Merge "Identify a commit that generates a diffEntry on a rename Event." 2023-07-25 12:09:40 -04:00
Ronald Bhuleskar ec3d919aa5 Identify a commit that generates a diffEntry on a rename Event.
When using FollowFilter's rename callback, a callback is generated with the diff. A caller interested in the renames knows what the diffs are but has no idea which commit generated each diff.

This will allow FollowFilter's rename callback to track diffEntry for a given commit.

Change-Id: If1e63ccd19fdcb9c58c59137110fe24e0ce023d2
2023-07-24 19:42:51 -04:00
Jonathan Tan 0f4af2bc36 Merge changes I60a92463,Ic3b68220
* changes:
  PackReverseIndexV1: reverse index parsed from version 1 file
  ComputedPackReverseIndex: Clarify custom bucket sort algorithm
2023-07-21 14:05:38 -04:00
Anna Papitto 000e7caf5e PackReverseIndexV1: reverse index parsed from version 1 file
The reverse index for a pack is used to quickly find an object's
position in the pack's forward index based on that object's pack offset.
It is currently computed from the forward index by sorting the index
entries by the corresponding pack offset. This computation uses
insertion sort, which has an average runtime of O(n^2).

C Git persists a pack reverse index file
to avoid recomputing the reverse index ordering. It writes a
file with the format
https://git-scm.com/docs/pack-format#_pack_rev_files_have_the_format
which can later be read and parsed into the in-memory reverse index
each time it is needed.

PackReverseIndexV1 parses a reverse index file with the official
version 1 format into an in-memory representation of the reverse index
which implements methods to find an object's forward index position
from its offset in logarithmic time.

Change-Id: I60a92463fbd6a8cc9c1c7451df1c14d0a21a0f64
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-07-18 15:19:26 -07:00
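The logarithmic-time lookup described in this commit can be illustrated with a small, self-contained sketch. It is not JGit's PackReverseIndexV1: the class name, field names, and constructor are hypothetical, and it assumes the file has already been parsed into an array of forward-index positions sorted by pack offset.

```java
/** Illustrative sketch only; this is not JGit's PackReverseIndexV1. */
final class ReverseIndexSketch {
    // Forward-index positions ordered by ascending pack offset
    // (assumed to be what a parsed reverse index file provides).
    private final int[] positionsSortedByOffset;
    // Pack offset of the object at each forward-index position.
    private final long[] offsetAtPosition;

    ReverseIndexSketch(int[] positionsSortedByOffset, long[] offsetAtPosition) {
        this.positionsSortedByOffset = positionsSortedByOffset.clone();
        this.offsetAtPosition = offsetAtPosition.clone();
    }

    /** Binary search by offset; returns the forward-index position or -1. */
    int findPosition(long offset) {
        int low = 0;
        int high = positionsSortedByOffset.length;
        while (low < high) {
            int mid = (low + high) >>> 1;
            long candidate = offsetAtPosition[positionsSortedByOffset[mid]];
            if (offset < candidate) {
                high = mid;
            } else if (offset > candidate) {
                low = mid + 1;
            } else {
                return positionsSortedByOffset[mid];
            }
        }
        return -1;
    }
}
```

Because the ordering is read rather than computed, the only per-lookup cost in this sketch is the binary search.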
Anna Papitto 7d2669587f ComputedPackReverseIndex: Clarify custom bucket sort algorithm
The ComputedPackReverseIndex uses a custom sorting algorithm, based on
bucket sort with insertion sort but with the data managed as a linked
list across two int arrays. This custom algorithm relies on the set of
values being sorted being exactly 0, ..., n-1; so that they can serve a
second purpose of being indexes into a second equally sized list.

This custom algorithm was introduced ~10 years ago in
6cc532a43c.
The original author is no longer an active contributor, so it is
valuable for the code to be readable, especially as there is currently
active work on reverse indexes.

Rename variables and add comments to clarify the algorithm and improve
readability. There are no functional changes to the algorithm.

Change-Id: Ic3b682203f20e06f9f865f81259e034230f9720a
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-07-18 15:19:20 -07:00
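The idea behind the algorithm described in this commit can be sketched as follows. This is a simplified illustration written for this log, not the code in ComputedPackReverseIndex: the class, method, and variable names are hypothetical, and offsets are assumed to be non-negative. The values being sorted are the positions 0, ..., n-1, so each position doubles as an index into the `next` array, which lets two int arrays hold one linked list per bucket.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the ComputedPackReverseIndex implementation. */
final class BucketSortSketch {
    /** Returns the positions 0..n-1 ordered by ascending offsets[position]. */
    static int[] sortPositionsByOffset(long[] offsets) {
        int n = offsets.length;
        int[] result = new int[n];
        if (n == 0) {
            return result;
        }
        long max = 0;
        for (long o : offsets) {
            max = Math.max(max, o);
        }
        long bucketSize = max / n + 1; // guarantees offset / bucketSize < n

        int[] bucketHead = new int[n]; // first position in each bucket, -1 if empty
        int[] next = new int[n];       // next[pos] = following position in the same bucket
        Arrays.fill(bucketHead, -1);
        for (int pos = 0; pos < n; pos++) {
            int bucket = (int) (offsets[pos] / bucketSize);
            next[pos] = bucketHead[bucket]; // push pos onto the bucket's list
            bucketHead[bucket] = pos;
        }

        int out = 0;
        for (int bucket = 0; bucket < n; bucket++) {
            int start = out;
            for (int pos = bucketHead[bucket]; pos != -1; pos = next[pos]) {
                // Insertion sort within the bucket, which is expected to be small.
                int i = out++;
                while (i > start && offsets[result[i - 1]] > offsets[pos]) {
                    result[i] = result[i - 1];
                    i--;
                }
                result[i] = pos;
            }
        }
        return result;
    }
}
```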
Ronald Bhuleskar 3b77e33ad8 CommitGraphWriter: add option for writing/using bloom filters
Currently, bloom filters are written and used without any way to turn
them off. Add a per-repo config variable to control whether bloom
filters are written. As for reading, add a JGit option to control this.
(A JGit option is used instead of a per-repo config variable as there is
usually no reason not to use the bloom filters if they are present, but
a global control to disable them is useful if there turns out to be an
issue with the implementation of bloom filters.)

The config that controls reading is the same as C Git, but the config
for writing is not: C Git has no config to control writing, but whether
bloom filters are written depends on whether bloom filters are already
present and what arguments are passed to "git commit-graph write". See
the manpage of "git commit-graph" for more information.

Change-Id: I1b7b25340387673506252b9260b22bfe147bde58
2023-07-18 14:21:48 -07:00
Jonathan Tan 77aec62141 CommitGraphWriter: reuse changed path filters
Teach CommitGraphWriter to reuse changed path filters that have been
read from the commit graph file whenever possible.

Change-Id: I1acbfa1613ca7198386a49209028886af360ddb6
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
2023-07-18 14:21:48 -07:00
Jonathan Tan d3b40e72ac RevWalk: use changed path filters
Teach RevWalk, TreeRevFilter, PathFilter, and FollowFilter to use
changed path filters, whenever available, to speed revision walks by
skipping commits that fail the changed path filter.

This work is based on earlier work by Kyle Zhao
(I441be984b609669cff77617ecfc838b080ce0816).

Change-Id: I7396f70241e571c63aabe337f6de1b8b9800f7ed
Signed-off-by: kylezhao <kylezhao@tencent.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
2023-07-18 14:21:48 -07:00
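How a changed path filter lets a walk skip commits can be shown with a minimal Bloom filter sketch. Everything here is an assumption made for illustration: the class name, the sizes, and the hash functions (the real commit-graph format defines its own hashing and on-disk layout, which this sketch does not reproduce). The property that matters is that a negative answer is definitive, so a commit whose filter rejects the path cannot have changed it.

```java
import java.util.BitSet;

/** Illustrative Bloom filter; not the commit-graph BIDX/BDAT format. */
final class ChangedPathFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    ChangedPathFilterSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Double hashing built on String.hashCode(); a stand-in chosen for brevity.
    private int bitIndex(String path, int i) {
        int h1 = path.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) * 0x9E3779B1 + 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Records that the commit changed this path. */
    void add(String path) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(bitIndex(path, i));
        }
    }

    /** false means "definitely not changed"; true means "possibly changed". */
    boolean mightContain(String path) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(bitIndex(path, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A walk restricted to one path would only compute the tree diff for commits where `mightContain` returns true.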
Jonathan Tan ff0f7c174f CommitGraphLoader: read changed-path filters
As described in the parent commit, add support for reading the BIDX and
BDAT chunks of the commit graph file, as described in man
gitformat-commit-graph(5).

This work is based on earlier work by Kyle Zhao
(I160f6b022afaa842c331fb9a086974e49dced7b2).

Change-Id: I82e02e6a3a3b758e6bf9d7bbd2198f0ffe3a331b
Signed-off-by: kylezhao <kylezhao@tencent.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
2023-07-18 14:21:48 -07:00
Jonathan Tan 49beb5ae51 CommitGraphWriter: write changed-path filters
Add support for writing the BIDX and BDAT chunks of the commit graph
file, as described in man gitformat-commit-graph(5). The ability to read
such chunks will be added in a subsequent commit.

This work is based on earlier work by Kyle Zhao
(Ib863782af209f26381e3ca0a2c119b99e84b679c).

Change-Id: Ic18e6f0eeec7da1e1ff31751aabda5e6952dbe6e
Signed-off-by: kylezhao <kylezhao@tencent.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
2023-07-18 14:21:48 -07:00
Matthias Sohn 5dc63514d0 Merge "ssh: PKCS#11 support" 2023-07-17 18:13:06 -04:00
Thomas Wolf 23758d7a61 ssh: PKCS#11 support
Support PKCS#11 HSMs (like YubiKey PIV) for SSH authentication.

Use the SunPKCS11 provider as described at [1]. This provider
dynamically loads the library from the PKCS11Provider SSH configuration
and creates a Java KeyStore with that provider. A Java CallbackHandler
is needed to feed PIN prompts from the KeyStore into the JGit
CredentialsProvider framework. Because the JGit CredentialsProvider may
be specific to an SSH session but the PKCS11Provider may be used by
several sessions, the CallbackHandler needs to be configurable per
session.

PIN prompts respect the NumberOfPasswordPrompts SSH configuration. As
long as the library asks only for a PIN, we use the KeyPasswordProvider
to prompt for it. This gives automatic integration in Eclipse with the
Eclipse secure storage, so a user has even the option to store the PIN
there. (Eclipse will then ask for the secure storage master password on
first access, so the usefulness of this is debatable.)

By default the provider uses the first PKCS#11 token (slot list index
zero). This can be overridden by a non-standard PKCS11SlotListIndex
ssh configuration entry. (For OpenSSH interoperability, also set
"IgnoreUnknown PKCS11SlotListIndex" in the SSH config file then.)

Once loaded, the provider and its shared library and the keys
contained remain available until the application exits.

Manually tested using SoftHSM. See file manual_tests.txt. Kudos to
Christopher Lamb for additional manual testing with a real YubiKey,
also on Windows.[2]

[1] https://docs.oracle.com/en/java/javase/11/security/pkcs11-reference-guide1.html
[2] https://www.eclipse.org/forums/index.php/t/1113295/

Change-Id: I544c97e1e24d05e28a9f0e803fd4b9151a76ed11
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-07-17 04:52:30 -04:00
Matthias Sohn db08835c6c GC: Remove handling of extra pack for RefTree
RefTree was packed in its own packfile, see
Icbb735be8fa91ccbf0708ca3a219b364e11a6b83.

RefTree was deleted in Ia3da7f2b82d9e365cec2ccf9397cbc47439cd150, since
it was experimental and never used in production. That change did not
remove the extra pack handling for RefTree.

Change-Id: I8c0d0a66440c331c3d03d0e07d5629682af2a7a9
2023-07-17 00:57:21 +02:00
Matthias Sohn 010a14f24d Remove unused API problem filters
Change-Id: Iea5fb0bf7b2c6a14d7d8b55558f6e78d3fd523f1
2023-07-16 15:13:05 +02:00
Matthias Sohn b2f7dc189a Remove redundant specification of type arguments
Change-Id: I8289e0a6ca9154d6411993d250176a35df7cb905
2023-07-16 15:11:17 +02:00
Ivan Frade 760bdd09b1 DfsPackParser: Create object indices if config says so
The DfsInserter writes the pack and its indices in the flush() method,
but when the writing happens via DfsPackParser, it is the parser which
writes the pack and indices. When combined with a parser, flushing the
inserter is a noop.

Add the writing of the object size index to the packparser#parse
method, mirroring how the primary index is written.

Change-Id: I52c5db153fea7e4a8ecd8b3d5de7ad21f7f81a60
2023-07-14 10:51:18 -07:00
Ivan Frade cb99ff5bbb DfsInserter: generate object size index if config says so
DfsInserter receives objects and on flush() writes a pack and its
primary index.

Teach the DfsInserter to write also the object size index if the
config says so.

Change-Id: I89308312f8fd898d4c714a9b68ff948d3663800b
2023-07-14 10:34:46 -07:00
Ivan Frade 4d2a003b91 DfsInserter: populate full size on object insertion
We need the full size of the object to populate the object size index
later.

Save the size in the PackedObjectInfo while adding objects to the
pack. Then we don't need to re-read it from the pack at indexing time.

Change-Id: I5bd7ad402df60b4637038def8ef7be2ab45faf87
2023-07-14 10:25:20 -07:00
Ivan Frade 12a4a4ccaa DfsGarbageCollector: Write object size indices
PackWriter knows how to add an object size index to the pack, but the
garbage collector is not using it yet.

Teach DfsGarbageCollector to write the object size index on
writePack(). Disable by default in the unreachable-garbage pack.

Callers control the content/presence of the index through the
PackConfig option (minBytesForObjSizeIndex) for all other packs, so
there is no need of a specific flag in DfsGarbageCollector.

Change-Id: I86f5f17310e6913381125bec4caab32dc45b7c9d
2023-07-14 10:25:06 -07:00
Ivan Frade 9dace2e6d6 DfsReader/PackFile: Implement isNotLargerThan using the obj size idx
isNotLargerThan() can avoid reading the size of a blob from disk using
the object size idx if available.

Load the object size index in the DfsPackfile following the same
pattern as the other indices. Override isNotLargerThan in DfsReader
to use the index when available.

The following CL introduces the writing of the object size index and the
tests cover this code.

Change-Id: I15c95b84c1424707c487a7d29c5c46b1a9d0ceba
2023-07-13 11:24:17 -07:00
Anna Papitto 91b23cc552 DfsPackFile: make #getReverseIdx public
The DfsPackFile#getReverseIdx method, which wraps creating a
PackReverseIndex in caching, was package-private. This caused
implementations on top of DfsPackFile to directly instantiate a
PackReverseIndex in cases where it would benefit from caching.

Instead, make #getReverseIdx public so that the caching logic can be
reused by implementations where appropriate.

Change-Id: I4553e514a4ac320bfe2455c00023343ad97f9d15
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-06-27 13:25:29 -07:00
Anna Papitto 8e61971620 PackReverseIndex: separate out the computed implementation
PackReverseIndex is a concrete class whose implementation is computed
from a pack's forward index. Callers which have a reverse index file may
want to use an implementation that is file-based instead.

Generalize PackReverseIndex into an interface without
implementation-specific logic and separate out the logic for the
computed implementation into a new concrete class.

Change-Id: I98d9835363c5e1c8c3c11a81b0761af3cdeaa41a
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-06-21 14:04:12 -07:00
Thomas Wolf faefa90f99 Default for global (user) git ignore file
C git has a default for git config core.excludesfile: "Its default
value is $XDG_CONFIG_HOME/git/ignore. If $XDG_CONFIG_HOME is either
not set or empty, $HOME/.config/git/ignore is used instead." [1]

Implement this in the WorkingTreeIterator$RootIgnoreNode.

To make this testable, mock the "user.home" directory for all JGit
tests, otherwise tests might pick up a real user's git ignore file.
Also ensure that JGit code always reads "user.home" via the
SystemReader.

Add tests for both locations.

[1] https://git-scm.com/docs/gitignore#_description

Bug: 436127
Change-Id: Ie510259320286c3c13a6464a37da1bd9ca1e373a
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-06-19 08:19:29 +02:00
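The lookup rule quoted in this commit can be sketched in a few lines. The class and method names are hypothetical and the environment values are passed in as parameters so the rule can be exercised in isolation; this is not the code in WorkingTreeIterator$RootIgnoreNode.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch of the default location rule for the global ignore file. */
final class GlobalIgnoreSketch {
    /**
     * @param xdgConfigHome value of $XDG_CONFIG_HOME, may be null or empty
     * @param userHome      the user's home directory
     */
    static Path defaultIgnoreFile(String xdgConfigHome, String userHome) {
        if (xdgConfigHome != null && !xdgConfigHome.isEmpty()) {
            return Paths.get(xdgConfigHome, "git", "ignore");
        }
        // Fallback when $XDG_CONFIG_HOME is unset or empty.
        return Paths.get(userHome, ".config", "git", "ignore");
    }
}
```

Taking both values as arguments mirrors the testing concern raised in the commit message: nothing in the sketch reads the real user's environment.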
Antoine Musso 7b955048eb Fix all Javadoc warnings and fail on them
This fixes all the javadoc warnings, stops ignoring doclint 'missing'
category and fails the build on javadoc warnings for public and
protected classes and class members.

Since javadoc doesn't allow access specifiers when specifying doclint
configuration we cannot set `-Xdoclint:all,-missing/private`
hence there is no simple way to skip private elements from doclint.
Therefore we check javadoc using the Eclipse Java compiler
(which is used by default) and javadoc configuration in
`.settings/org.eclipse.jdt.core.prefs` files.
This allows more fine grained configuration.

We can reconsider this when javadoc starts supporting access specifiers
in the doclint configuration.

Below are detailed explanations for most modifications.

@inheritDoc
===========
doclint complains about explicit `{@inheritDoc}` when the parent does
not have any documentation. As far as I can tell, javadoc inherits
comments by default, and `{@inheritDoc}` should only be used when one
wants to add documentation to what is inherited from the parent. Given
the parent has no documentation, remove those usages which doclint
complains about.

In some cases I have moved the documentation from the concrete class
up to the abstract class.

Remove `{@inheritDoc}` on overridden methods which don't add additional
documentation since javadoc defaults to inherit javadoc of overridden
methods.

@value to @link
===============
In PackConfig, DEFAULT_SEARCH_FOR_REUSE_TIMEOUT and similar are derived
from Integer.MAX_VALUE and are thus not considered constants (I guess
because the value would depend on the platform). Replace @value with a
link to `Integer.MAX_VALUE`.

In `StringUtils.toBoolean`, @value was used to refer to the
`stringValue` parameter. I have replaced it with `{@code stringValue}`.

{@link <url>} to <a>
====================
@link does not support being given an external URL. Replace such links
with HTML `<a>`.

@since: being invalid
=====================

org.eclipse.jgit/src/org/eclipse/jgit/util/Equality.java has an invalid
tag `@since: ` due to the extra `:`. Javadoc does not complain about it
with version 11.0.18+10 but does with 11.0.19.7. It is invalid
regardless.

invalid HTML syntax
===================

- javadoc doesn't allow <br/>, <p/> and </p> anymore, use <br> and <p>
instead
- replace <tt>code</tt> by {@code code}
- <table> tags don't allow summary attribute, specify caption as
<caption>caption</caption> to fix this

doclint visibility issue
========================

In the private abstract classes `BaseDirCacheEditor` and
`BasePackConnection` links to other methods in the abstract class are
inherited in the public subclasses but doclint gets confused and
considers them unreachable. The HTML documentation for the sub classes
shows the relative links in the sub classes, so it is all correct. It
must be a bug somewhere in javadoc.
Mute those warnings with: @SuppressWarnings("doclint:missing")

Misc
====
Replace `<` and `>` with HTML encoded entities (`&lt;` and `&gt;`).
In `SshConstants` I enclosed a series of -> arrows in @literal.

Additional tags
===============
Configure maven-javadoc-plugin to allow the following additional tags
defined in https://openjdk.org/jeps/8068562:
- apiNote
- implSpec
- implNote

Missing javadoc
===============
Add missing @params and descriptions

Change-Id: I840056389aa59135cfb360da0d5e40463ce35bd0
Also-By: Matthias Sohn <matthias.sohn@sap.com>
2023-06-16 01:08:13 +02:00
Antoine Musso c7960910f0 Mark COMMIT_GENERATION_* constants final
In org.eclipse.jgit.lib.Constants the constants are all marked final
with the exception of:

- COMMIT_GENERATION_UNKNOWN
- COMMIT_GENERATION_NOT_COMPUTED

They were introduced by cf70e7cbe4 without the `final` keyword, which
was most likely forgotten since the other constants have it.

The javadoc `{@value}` tag raises a warning about the fields not
being constants, which is how I discovered the omission.

Change-Id: I0ad87f42355440c7d50158e773a280a0526e9671
2023-06-09 16:40:35 +02:00
Anna Papitto 74547f4a68 PackReverseIndex: use static builder instead of constructor
PackReverseIndex instances are created using the constructor directly,
which limits control over the construction logic and refactoring
opportunities for the class itself. These will be needed for a
file-based implementation of the reverse index.

Use a static builder method to create a PackReverseIndex instance using
a pack's forward index.

Change-Id: I4421d907cd61d9ac932df5377e5e28a81679b63f
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-05-31 10:09:50 +02:00
Anna Papitto 181b629f7d Gc#writePack: write the reverse index file to disk
The reverse index is currently created in-memory when needed. A writer
for reverse index files was already implemented.

Make garbage collection write the reverse index file when the PackConfig
enables it. Write it during #writePack, which mirrors how the primary
index is written.

Change-Id: I50131af6622c41a7b24534aaaf2a423ab4178981
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-05-31 10:09:50 +02:00
Matthias Sohn 9afff3e808 Prepare 6.7.0-SNAPSHOT builds
Change-Id: I50ff7ee31046cfc29a087c8963be3deae24b1c9c
2023-05-24 17:31:26 +02:00
Jonathan Tan 44461b215e Merge "GraphObjectIndex: fix search in findGraphPosition" 2023-05-23 18:26:47 -04:00
Jonathan Tan 6b3b2b33a5 GraphObjectIndex: fix search in findGraphPosition
In findGraphPosition, when there is no object whose OID starts with
the first byte of the sought OID, low equals high. This violates an
invariant of the loop, and when the sought OID is lexicographically
greater than every other OID in the repository, causes an
ArrayIndexOutOfBoundsException (because we're trying to read outside the
list of OIDs).

Therefore, check the "low < high" condition at the start of the loop,
not only after the first iteration.

Change-Id: Ic8ac198c151bd161c4996b9e7cb6e6660f151733
Helped-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
2023-05-23 13:57:32 -07:00
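The failure mode and the fix can be reproduced in a reduced form. The sketch below is not GraphObjectIndex: it uses hypothetical names and 32-bit integers in place of object IDs, with a fanout table keyed on the top byte. When no stored ID shares the first byte of the sought ID, `low` equals `high`, and testing `low < high` before the first iteration keeps the search from reading past the end of the array.

```java
/** Illustrative sketch with 32-bit ids standing in for object IDs. */
final class FanoutSearchSketch {
    private final int[] sortedIds;             // sorted as unsigned values
    private final int[] fanout = new int[256]; // fanout[b] = count of ids with first byte <= b

    FanoutSearchSketch(int[] sortedIds) {
        this.sortedIds = sortedIds.clone();
        for (int id : sortedIds) {
            fanout[id >>> 24]++;
        }
        for (int b = 1; b < 256; b++) {
            fanout[b] += fanout[b - 1];
        }
    }

    /** Returns the position of id, or -1 if it is not present. */
    int findPosition(int id) {
        int firstByte = id >>> 24;
        int low = firstByte == 0 ? 0 : fanout[firstByte - 1];
        int high = fanout[firstByte];
        // Checked before the first iteration: low == high means the bucket is empty.
        while (low < high) {
            int mid = (low + high) >>> 1;
            int cmp = Integer.compareUnsigned(id, sortedIds[mid]);
            if (cmp < 0) {
                high = mid;
            } else if (cmp > 0) {
                low = mid + 1;
            } else {
                return mid;
            }
        }
        return -1;
    }
}
```

With a do-while loop instead, looking up an ID greater than every stored ID would index `sortedIds` at its length, which is the exception described in the commit message.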
Matthias Sohn 0eedb1affb Merge "Also add suppressed exception if unchecked exception occurs in finally" 2023-05-19 13:49:42 -04:00
Matthias Sohn 2cbf0c1774 Also add suppressed exception if unchecked exception occurs in finally
If a method called in a finally block throws an exception, we should add
exceptions caught earlier to the exception we throw in the finally block,
regardless of whether it is a checked or unchecked exception.

Change-Id: I4c6be9a3a08482b07659ca31d6987ce719d81ca5
2023-05-18 21:10:33 +02:00
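The pattern described in this commit can be sketched independently of JGit. The helper and its `Action` interface are hypothetical; the point is that an exception thrown from the finally block replaces the earlier one, so the earlier one is attached with `Throwable.addSuppressed` whether the new exception is checked or unchecked.

```java
/** Illustrative sketch; the Action interface and helper method are hypothetical. */
final class SuppressedSketch {
    interface Action {
        void run() throws Exception;
    }

    static void runThenCleanup(Action body, Action cleanup) throws Exception {
        Throwable earlier = null;
        try {
            body.run();
        } catch (Exception e) {
            earlier = e; // remember it in case cleanup fails as well
            throw e;
        } finally {
            try {
                cleanup.run();
            } catch (Exception e) {
                // Covers checked and unchecked exceptions alike.
                if (earlier != null) {
                    e.addSuppressed(earlier);
                }
                throw e;
            }
        }
    }
}
```

A caller that catches the cleanup failure can still inspect the original problem through `getSuppressed()`.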
Fabio Ponciroli 334852c52f Candidate: use "Objects.equals" instead of "=="
Errorprone raises the following warning:
"[ReferenceEquality] Comparison using reference equality
instead of value equality".

Change-Id: Iacb207ef0625bb987a08406d4e7461e48fade97f
2023-05-18 05:37:11 -04:00
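The difference that the warning points at can be seen in a short sketch. The class and method names are hypothetical and unrelated to the Candidate class; the example only contrasts reference equality with `Objects.equals`, which compares by value and tolerates null on either side.

```java
import java.util.Objects;

/** Illustrative sketch contrasting reference equality and value equality. */
final class EqualitySketch {
    /** True only if both references point to the same object. */
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    /** True if both are null or a.equals(b). */
    static boolean sameValue(Object a, Object b) {
        return Objects.equals(a, b);
    }
}
```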
Matthias Sohn 6f2d93fb8d Remove unused $NON-NLS-1$
Change-Id: I3314e5106d873c03903562f9798de6af2ae588a7
2023-05-17 17:01:33 +02:00
Matthias Sohn 0f7d485bc9 Remove unused API filters
Change-Id: I1971b31753fd4c3568016e7db955cce8e391a1e0
2023-05-17 17:01:33 +02:00
Thomas Wolf 43954ea62a [releng] API filter for PackIndex.DEFAULT_WRITE_REVERSE_INDEX
New static final constant is a (very minor) API break that needs to be
suppressed explicitly despite @since 6.6.

Remove a number of no longer needed API filters, and fix a broken
$NON-NLS-1$.

Change-Id: Ie4b0c45e8bd1f3067b6ff81c07d4b21b50bb8685
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-05-15 20:45:22 +02:00
Ivan Frade c40683929c Merge "UploadPack: Record negotiation stats on fetchV2 call" 2023-05-11 16:53:12 -04:00
Anna Papitto 2c89a3ec74 PackExt: add a #getTmpExtension method
During garbage collection, extensions for temporary files for indices
are formatted manually.

Add a method to PackExt to generate the temporary file extensions for
each type of index file programmatically.

Change-Id: I210bc2702e750bf0aea643b1a9a8536adebef179
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-05-11 16:49:55 -04:00
Ronald Bhuleskar d0564cf8ae UploadPack: Record negotiation stats on fetchV2 call
ServiceV2 is not collecting wants/have in PackStatistics. This records
the stats for fetch and push-negotiation.

Change-Id: Iefd79f36b3d7837195e8bd9fc7007de352089e66
2023-05-11 11:55:38 -07:00
Ivan Frade 73f9f55e3b Merge "PackWriter: write the PackReverseIndex file" 2023-05-08 15:00:46 -04:00
Anna Papitto ce88e62edc PackWriter: write the PackReverseIndex file
PackWriter offers the ability to write out the pack file and its various
index files, except for the newly introduced file-based reverse index.

Now that PackReverseIndexWriter can write reverse index files,
PackWriter#writeReverseIndex will write one for a pack if the
corresponding config flag PackConfig#writeReverseIndex is on.

Change-Id: Ib75dd2bbfb9ee9366d5aacb46700d8cf8af4823a
Signed-off-by: Anna Papitto <annapapitto@google.com>
2023-05-08 11:23:30 -07:00
Matthias Sohn 74fa245b3c Merge "Fix inProcessPackedRefsLock not shared with copies of the instance" 2023-05-03 11:10:14 -04:00
Matthias Sohn 3d90c4a433 Add TransportHttp#getAdditionalHeaders
to enable inspecting which additional HTTP headers have been set on the
transport.

Change-Id: I0771be9cb7c837de7c203b7f044109b9b2a7d7ad
2023-05-03 02:40:41 +02:00
Nasser Grainawi 06cfebd066 Fix inProcessPackedRefsLock not shared with copies of the instance
The in process lock is intended to manage contention on locking the
packed-refs file within a single process without acquiring the file
system lock. Not sharing it across RefDirectory instances of the same
repository undermines that intent and results in more contention at the
file system level.

Change-Id: I68f11856aa0b4b1524f43554d7391a322a0a6897
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
2023-05-02 17:14:52 -06:00
Matthias Sohn 076b8e7636 Add missing @since tag to IntComparator
Change-Id: Ic190ab404ccb3af675cdd90cac231ce6e856ea68
2023-05-01 15:34:47 +02:00
Thomas Wolf 8c0c96e0a7 Support rebasing independent branches
With completely independent branches, there is no merge base. In this
case, the list of commits must include the root commit of the branch to
be rebased.

Bug: 581832
Change-Id: I0f5bdf179d5b07ff09f1a274d61c7a0b1c0011c6
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-04-29 13:24:58 +02:00
Thomas Wolf 8bc13fb79d Support cherry-picking a root commit
Handle the case of the commit to be picked not having any parents.

Since JGit implements cherry-pick as a 3-way-merge between the commit
to be picked and the target commit, using the parent of the picked
commit as merge base, this is super simple: just don't set a base tree.
The merger will not find any merge base and will supply an empty tree
iterator for the base.

Bug: 581832
Change-Id: I88985f1b1723db5b35ce58bf228bc48d23d6fca3
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-04-29 13:24:32 +02:00
Thomas Wolf 3ed4cdda6b AddCommand: ability to switch off renormalization
JGit's AddCommand always renormalizes tracked files. C git does so only
on git add --renormalize. Especially for git add . and the JGit
equivalent git.add().addFilepattern(".").call() this can make a big
difference if there are many files, or large files.

Add a "renormalize" option to AddCommand. To maintain compatibility with
existing uses, this option is "true" by default, and the behavior of
AddCommand is as it has always been in JGit.

If set to "false", use an IndexDiffFilter (in addition to a path filter,
if any). This skips any unchanged files (that are not racily clean) from
content checks. Note that changes in CRLF settings or in filters will be
ignored for such files if renormalize == false.

Add the "--renormalize" option to the Add command in the JGit command
line program. For the command-line program, the default is as in C git:
renormalize is off by default and enabled only if the option is given.
Note that --renormalize implies --update in the command line program, as
in C git. In AddCommand, the two settings are independent.

Additionally, avoid opening input streams unnecessarily in
WorkingTreeIterator.getEntryContentLength() and fix some bogus
indentation.

Add a simple test that adds 1000 files of 10kB in 10 directories twice
and that fails if the second invocation (without any changes) with
renormalize=false is not significantly faster.

Locally, I observe for that second invocation

* git.add().addFilepattern(".").call()                        ~660ms
* git.add().addFilepattern(".").setRenormalize(false).call()   ~16ms

Bug: 494323
Change-Id: I30f9d518563fa55d7058a48c27c425f3b60aeb4c
Signed-off-by: Thomas Wolf <twolf@apache.org>
2023-04-28 17:04:47 -04:00