If UploadPack or ReceivePack has an exception record an identifier
associated with the repository as part of the log message. This can
help the HTTP admin track down the offending repository and take
action to repair the root cause.
Change-Id: I58f22b33cdb40994f044a26fba9fe965b45be51d
Hooks are now obtained via a convenient API like git commands, and
callers don't have to check for their existence.
The pre-commit hook has been updated accordingly.
Change-Id: I3383ffb10e2f3b588d7367b9139b606ec7f62758
Signed-off-by: Laurent Delaigue <laurent.delaigue@obeo.fr>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* stable-3.7:
Add log4j and slf4j-log4j bridge to jgit feature
Use slf4j to log instead of printing to System.err
Use Target Platform Definition DSL to generate target platforms
Change-Id: Ic8779868150c910fa55fd20348e35723e6add0f1
Core classes to parse and process .gitattributes files including
support for reading attributes in WorkingTreeIterator and the
dirCacheIterator.
The implementation follows the git ignore implementation. It supports
lazy reading attributes while walking the working tree.
Bug: 342372
CQ: 9078
Change-Id: I05f3ce1861fbf9896b1bcb7816ba78af35f3ad3d
Also-by: Marc Strapetz <marc.strapetz@syntevo.com>
Also-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Also-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The current IgnoreRule/FileNameMatcher implementation scales not well
with huge repositories - it is both slow and memory expensive while
parsing glob expressions (bug 440732). Addtitionally, the "double star"
pattern (/**/) is not understood by the old parser (bug 416348).
The proposed implementation is a complete clean room rewrite of the
gitignore parser, aiming to add missing double star pattern support and
improve the performance and memory consumption.
The glob expressions from .gitignore rules are converted to Java regular
expressions (java.util.regex.Pattern). java.util.regex.Pattern code can
evaluate expression from gitignore rules considerable faster (and with
less memory consumption) as the old FileNameMatcher implementation.
CQ: 8828
Bug: 416348
Bug: 440732
Change-Id: Ibefb930381f2f16eddb9947e592752f8ae2b76e1
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
It's an internal package which isn't part of the API. Mark it x-internal
to silence @since tag warnings which are only raised for new API.
Bug: 440757
Change-Id: Id05deaca43f135cd1bfe83cf1f29787cbbdbecac
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
- don't mark them as singleton to allow coexistence of multiple versions
in the same installation
- add missing version qualifier to Eclipse-SourceBundle header
see
https://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/msg10524.html
Change-Id: Ie4e028038f5a1d3e18b0be06c3d2ea82e7f9068d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This move avoids that all consumers of org.eclipse.jgit depend on Apache
httpclient. Also add another feature to make this optional for OSGi
consumers as well.
Change-Id: I5ef5e00c53678b9e1d7cfd54bbca3ff6f1c1c967
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This change implements the http connection abstraction with the help of
org.apache.http.client.HttpClient. The default implementation used by
JGit is still the JDK HttpURLConnection. But now JGit users have the
possibility to switch completely to org.apache.httpclient. The reason
for this is that in certain (e.g. cloud) environments you are forced to
use the org.apache classes.
Change-Id: I0b357f23243ed13a014c79ba179fa327dfe318b2
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Previously all HTTP communication was done with the help of
java.net.HttpUrlConnection. In order to make JGit usable in environments
where the direct usage of such connections is not allowed but where the
environment provides other means to get network connections an
abstraction for connections is introduced. The idea is that new
implementations of this interface will be introduced which will not use
java.net.HttpUrlConnection but use e.g.
org.apache.client.http.HttpClient to provide network connections.
One example: certain cloud infrastructures don't allow that components
in the cloud communicate directly with HttpUrlConnection. Instead they
provide services where a component can ask for a connection (given a
symbolic name for the destination) and where the infrastructure returns
a preconfigured org.apache.http.client.HttpClient. In order to allow
JGit to be running in such environments we need the abstraction
introduced in this commit.
Change-Id: I3b06629f90a118bd284e55bb3f6465fe7d10463d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Package was renamed, so I had to update the imports. Also, I verified
bitmap serialization was still compatible.
Change-Id: I161ad3875b963b56001beab477ef8d072accee4f
This silences some discouraged access warnings issued since
TestRepository uses PackWriter which is in an internal package.
Change-Id: Ic9c4631e237c2fe1996c518328ecc2a9ab5c348b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This should prevent class cast problems caused by jgit and egit bundles
wiring to different versions of com.jcraft.jsch.
Bug: 420903
Change-Id: Icabe40209ea07369e2b7eee31952d131aef3fbf1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The dash signing plugin has been retired hence we need to update our
build to use the CBI jarsigner plugin for signing build results.
Pack test classes to enable signing them.
Also re-enable pack200 for bundle org.eclipse.jgit.
WORKAROUND: there is no easy way to run tests with maven-surefire-plugin
from signed test-jar so for a quick workaround we will have to add a
build step on Hudson so that we can run tests before signing:
- first step will do "clean, verify" to compile and run tests
- second step will do "install, deploy" with profile "eclipse-sign" and
use -DskipTests=true to skip tests since they would hit a
SecurityException when unsigned test classes are in same package as
signed classes under test
- third step will do "clean, install, deploy" on packaging reactor to
build features and p2 repository with profile "eclipse-sign" to sign
and pack200 all bundles.
TODO: Tycho doesn't suport picking up pack200 artifacts via
pomDependencies hence we need to find a way to copy them manually and
use tycho-extra's tycho-p2-extras-plugin:publish-features-and-bundles
to generate the missing p2 metadata.
Change-Id: Iec2c5ab3027a3e3f9ecc0d2f99193385177d9025
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
DirCacheCheckout had a bug when the parentdirectory of a worktree was a
symlink. DirCacheCheckout was deleting those symlinks under certain
conditions. This was fixed in I81735ba0394ef6794e9b2b8bdd8bd7e8b9c6460f
without a test because previously it was hard to setup tests containing
symlinks.
BUG: 412489
Change-Id: I2513166af519d6fc01d1eae3976ad6cff6f98530
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This breaks all existing callers once. Applications are not supposed
to build against the internal storage API unless they can accept API
churn and make necessary updates as versions change.
Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
A pack bitmap index is an additional index of compressed
bitmaps of the object graph. Furthermore, a logical API of the index
functionality is included, as it is expected to be used by the
PackWriter.
Compressed bitmaps are created using the javaewah library, which is a
word-aligned compressed variant of the Java bitset class based on
run-length encoding. The library only works with positive integer
values. Thus, the maximum number of ObjectIds in a pack file that
this index can currently support is limited to Integer.MAX_VALUE.
Every ObjectId is given an integer mapping. The integer is the
position of the ObjectId in the complete ObjectId list, sorted
by offset, for the pack file. That integer is what the bitmaps
use to reference the ObjectId. Currently, the new index format can
only be used with pack files that contain a complete closure of the
object graph e.g. the result of a garbage collection.
The index file includes four bitmaps for the Git object types i.e.
commits, trees, blobs, and tags. In addition, a collection of
bitmaps keyed by an ObjectId is also included. The bitmap for each entry
in the collection represents the full closure of ObjectIds reachable
from the keyed ObjectId (including the keyed ObjectId itself). The
bitmaps are further compressed by XORing the current bitmaps against
prior bitmaps in the index, and selecting the smallest representation.
The XOR'd bitmap and offset from the current entry to the position
of the bitmap to XOR against is the actual representation of the entry
in the index file. Each entry contains one byte, which is currently
used to note whether the bitmap should be blindly reused.
Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
Otherwise loading javax.net.ssl.TrustManager fails if
osgi.compatibility.bootdelegation=false which became the Equinox default
since bug 344850 was fixed.
Bug: 392056
Change-Id: I464871723649095942dbf77da93890ac8ec39075
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
This prevents a lot of unnecessary warnings about disouraged usage of
the org.eclipse.jgit.internal package within JGit itself.
Change-Id: Ia6683902809425fd7245e7d5d344c2ff8f317ebb
The package was removed in I763590a45d75f00a09097ab6f89581a3bbd3c797
Change-Id: Ifa9e75714f85d17609f9bf61581aaed0631a6fa7
Signed-off-by: Kevin Sawicki <kevin@github.com>
It seems pack200 became unable to correctly pack the bundle
org.eclipse.jgit (see bug 372845). Hence mark it to be excluded from
this packing step following the workaround which worked for
org.eclipse.jst.jsf.core (bug 335806).
Bug: 372845
Change-Id: I2e3d20645ac49125472ddc235afbe9f3c7480caf
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Adds the following commands:
- Add
- Init
- Status
- Sync
- Update
This also updates AddCommand so that file patterns added that
are submodules can be staged in the index.
Change-Id: Ie5112aa26430e5a2a3acd65a7b0e1d76067dc545
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
In practice the DHT storage layer has not been performing as well as
large scale server environments want to see from a Git server.
The performance of the DHT schema degrades rapidly as small changes
are pushed into the repository due to the chunk size being less than
1/3 of the pushed pack size. Small chunks cause poor prefetch
performance during reading, and require significantly longer prefetch
lists inside of the chunk meta field to work around the small size.
The DHT code is very complex (>17,000 lines of code) and is very
sensitive to the underlying database round-trip time, as well as the
way objects were written into the pack stream that was chunked and
stored on the database. A poor pack layout (from any version of C Git
prior to Junio reworking it) can cause the DHT code to be unable to
enumerate the objects of the linux-2.6 repository in a completable
time scale.
Performing a clone from a DHT stored repository of 2 million objects
takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row
for each object being cloned. This is very difficult for some DHTs to
scale, even at 5000 rows/second the lookup stage alone takes 6 minutes
(on local filesystem, this is almost too fast to bother measuring).
Some servers like Apache Cassandra just fall over and cannot complete
the 2 million lookups in rapid fire.
On a ~400 MiB repository, the DHT schema has an extra 25 MiB of
redundant data that gets downloaded to the JGit process, and that is
before you consider the cost of the OBJECT_INDEX table also being
fully loaded, which is at least 223 MiB of data for the linux kernel
repository. In the DHT schema answering a `git clone` of the ~400 MiB
linux kernel needs to load 248 MiB of "index" data from the DHT, in
addition to the ~400 MiB of pack data that gets sent to the client.
This is 193 MiB more data to be accessed than the native filesystem
format, but it needs to come over a much smaller pipe (local Ethernet
typically) than the local SATA disk drive.
I also never got around to writing the "repack" support for the DHT
schema, as it turns out to be fairly complex to safely repack data in
the repository while also trying to minimize the amount of changes
made to the database, due to very common limitations on database
mutation rates..
This new DFS storage layer fixes a lot of those issues by taking the
simple approach for storing relatively standard Git pack and index
files on an abstract filesystem. Packs are accessed by an in-process
buffer cache, similar to the WindowCache used by the local filesystem
storage layer. Unlike the local file IO, there are some assumptions
that the storage system has relatively high latency and no concept of
"file handles". Instead it looks at the file more like HTTP byte range
requests, where a read channel is a simply a thunk to trigger a read
request over the network.
The DFS code in this change is still abstract, it does not store on
any particular filesystem, but is fairly well suited to the Amazon S3
or Apache Hadoop HDFS. Storing packs directly on HDFS rather than
HBase removes a layer of abstraction, as most HBase row reads turn
into an HDFS read.
Most of the DFS code in this change was blatently copied from the
local filesystem code. Most parts should be refactored to be shared
between the two storage systems, but right now I am hesistent to do
this due to how well tuned the local filesystem code currently is.
Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
BlameGenerator digs through history and discovers the origin of each
line of some result file. BlameResult consumes the stream of regions
created by the generator and lays them out in a table for applications
to display alongside of source lines.
Applications may optionally push in the working tree copy of a file
using the push(String, byte[]) method, allowing the application to
receive accurate line annotations for the working tree version. Lines
that are uncommitted (difference between HEAD and working tree) will
show up with the description given by the application as the author,
or "Not Committed Yet" as a default string.
Applications may also run the BlameGenerator in reverse mode using the
reverse(AnyObjectId, AnyObjectId) method instead of push(). When
running in the reverse mode the generator annotates lines by the
commit they are removed in, rather than the commit they were added in.
This allows a user to discover where a line disappeared from when they
are looking at an older revision in the repository. For example:
blame --reverse 16e810b2..master -L 1080, org.eclipse.jgit.test/tst/org/eclipse/jgit/storage/file/RefDirectoryTest.java
( 1080) }
2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1081)
2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1082) /**
2302a6d3 (Christian Halstrick 2011-05-20 11:18:20 +0200 1083) * Kick the timestamp of a local file.
Above we learn that line 1080 (a closing curly brace of the prior
method) still exists in branch master, but the Javadoc comment below
it has been removed by Christian Halstrick on May 20th as part of
commit 2302a6d3. This result differs considerably from that of C
Git's blame --reverse feature. JGit tells the reader which commit
performed the delete, while C Git tells the reader the last commit
that still contained the line, leaving it an exercise to the reader
to discover the descendant that performed the removal.
This is still only a basic implementation. Quite notably it is
missing support for the smart block copy/move detection that the C
implementation of `git blame` is well known for. Despite being
incremental, the BlameGenerator can only be run once. After the
generator runs it cannot be reused. A better implementation would
support applications browsing through history efficiently.
In regards to CQ 5110, only a little of the original code survives.
CQ: 5110
Bug: 306161
Change-Id: I84b8ea4838bb7d25f4fcdd540547884704661b8f
Signed-off-by: Kevin Sawicki <kevin@github.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Using a resolver and factory pattern for the anonymous git:// Daemon
class makes transport.Daemon more useful on non-file storage systems,
or in embedded applications where the caller wants more precise
control over the work tasks constructed within the daemon.
Rather than defining new interfaces, move the existing HTTP ones
into transport.resolver and make them generic on the connection
handle type. For HTTP, continue to use HttpServletRequest, and
for transport.Daemon use DaemonClient.
To remain compatible with transport.Daemon, FileResolver needs to
learn how to use multiple base directories, and how to export any
Repository instance at a fixed name.
Change-Id: I1efa6b2bd7c6567e983fbbf346947238ea2e847e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>