Commit Graph

53 Commits

Author SHA1 Message Date
Saša Živkov b72fc2b494 Make the FileLfsRepository thread safe
The FileLfsRepository.out member could have been accessed from multiple
threads which would corrupt the content.

Don't store the AtomicObjectOutputStream in the FileLfsRepository.out but
move it to the ObjectUploadListener which is instantiated per-request.

Add a parallel upload test.

Change-Id: I62298630e99c46b500d376843ffcde934436215b
Signed-off-by: Saša Živkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
2016-03-22 17:26:46 +01:00
Matthias Sohn f496177a37 Support Amazon S3 based storage for LFS
Add a storage implementation storing large objects in Amazon S3.
The AmazonS3Repository pre-signs download and upload requests.

AWS access and secret key are expected to be in the
$HOME/.aws/credentials file in the following format:

[default]
  accessKey = ...
  secretKey = ...

Use AWS version 4 request signing [1] because it is more secure and
supported by all regions. The version 3 signing is not supported in
newer regions.

In follow up changes we should:

- implement getVerifyAction() and do actual verification. Subclasses of
S3Repository can implement caching for object meta data (size) in order
to avoid extra roundtrips to S3. Verification should ensure that meta
data store and content of S3 storage are in sync

- HEAD request used in S3Repository.getSize() seems to always return
Content-length 0 in contrast to the documentation [2]. So getSize() does
detect if the object exists in S3 or not but in case the object exists
it always returns size 0

[1] http://docs.aws.amazon.com/general/latest/gr/signature-version-4.html
[2] https://forums.aws.amazon.com/thread.jspa?threadID=223616

Change-Id: Ic47f094928a259e5264c92b3aacf6d90210907a8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
2016-02-17 11:36:58 -05:00
Matthias Sohn 3bae524f6f Support LFS protocol and a file system based LFS storage
Implement LfsProtocolServlet handling the "Git LFS v1 Batch API"
protocol [1]. Add a simple file system based LFS content store and the
debug-lfs-store command to simplify testing.

Introduce a LargeFileRepository interface to enable additional storage
implementation while reusing the same protocol implementation.

At the client side we have to configure the lfs.url, specify that
we use the batch API and we don't use authentication:

  [lfs]
	  url = http://host:port/lfs
	  batch = true
  [lfs "http://host:port/lfs"]
	  access = none

the git-lfs client appends the "objects/batch" to the lfs.url.

Hard code an Authorization header in the FileLfsRepository.getAction
because then git-lfs client will skip asking for credentials. It will
just forward the Authorization header from the response to the
download/upload request.

The FileLfsServlet supports file content storage for "Large File
Storage" (LFS) server as defined by the Github LFS API [2].

- upload and download of large files is probably network bound hence use
  an asynchronous servlet for good scalability
- simple object storage in file system with 2 level fan-out
- use LockFile to protect writing large objects against multiple
  concurrent uploads of the same object
- to prevent corrupt uploads the uploaded file is rejected if its hash
  doesn't match id given in URL

The debug-lfs-store command is used to run the LfsProtocolServlet and,
optionally, the FileLfsServlet which makes it easier to setup a
local test server.

[1]
https://github.com/github/git-lfs/blob/master/docs/api/http-v1-batch.md
[2] https://github.com/github/git-lfs/tree/master/docs/api

Bug: 472961
Change-Id: I7378da5575159d2195138d799704880c5c82d5f3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
2016-02-04 17:49:43 +01:00