AWS S3 lets us store and retrieve files in the cloud, with cheap and virtually unlimited space.
S3 stores files in buckets, each of which has a globally unique name. We can create a bucket using CloudFormation.
1. Creating the bucket
The simplest CloudFormation template defines just one resource, of type AWS::S3::Bucket. No property is strictly required: if you leave out BucketName, CloudFormation generates a unique name for you. Here we set BucketName explicitly, so it has to be globally unique (S3 is one of the oldest AWS services, and it uses global names rather than per-account ones).
So, a simple CF template would look like this (remember that the bucket name has to be unique, so replace it with a name of your own if you run this):
AWSTemplateFormatVersion: '2010-09-09'
Description: Simple S3 Bucket
Resources:
  FirstS3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: okaram-simple-s3-bucket
2. Downloading a file
AWS provides a REST API for most of its services, and the Java SDK tends to wrap that API with a common pattern: you have a client interface and class, and methods that take request objects.
For S3, the client interface is AmazonS3 and the client class is AmazonS3Client. We create a GetObjectRequest, specifying the bucket and the file name, and then call getObject on the client.
For our Groovy code, we use @Grab to download the SDK parts we need. We could download the full Java SDK, but for this example I’m just using 'com.amazonaws:aws-java-sdk-s3'.
So, a program to download a file would be as follows (we take the bucket and file as program arguments):
@Grab('com.amazonaws:aws-java-sdk-s3:+')
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.GetObjectRequest;
String bucketName=args[0]
String remoteFileName=args[1]
AmazonS3 s3Client = new AmazonS3Client();
GetObjectRequest request= new GetObjectRequest(bucketName, remoteFileName);
S3Object object = s3Client.getObject(request);
// s3Client also has a getObject that directly takes the bucket and file names, if you prefer
InputStream objectData = object.getObjectContent();
objectData.eachLine { line -> println(line)}
objectData.close();
3. Uploading a file
Uploading a large file through the S3 client directly means uploading it in several parts and then assembling the parts. It is much more convenient to use a TransferManager, which takes care of breaking the file into parts and uploading all of them (you can also use it to download files). A simple file uploader would look as follows:
@Grab('com.amazonaws:aws-java-sdk-s3:+')
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import java.io.File;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.Upload;
String bucketName=args[0]
String remoteFileName=args[1]
String localFileName=args[2]
TransferManager tm = new TransferManager();
// TransferManager processes all transfers asynchronously,
// so this call will return immediately.
Upload upload = tm.upload(bucketName, remoteFileName, new File(localFileName));
// this blocks until the transfer is done
upload.waitForCompletion();
System.out.println("Upload complete.");
// the TransferManager creates threads, so if we don't shut it down the program will never exit
tm.shutdownNow();
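To make "breaking the file into parts" a bit more concrete, here is a minimal sketch in plain Java of how a file could be cut into byte ranges for a multipart upload. It does not call S3 or the SDK; PartPlanner and plan are hypothetical names, and the way TransferManager actually picks part sizes depends on its configuration. The only S3 rule it relies on is that every part except the last must be at least 5 MiB.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of cutting a file into byte ranges for a multipart upload.
// PartPlanner and plan are hypothetical names, not part of the AWS SDK.
class PartPlanner {
    // S3 requires every part except the last one to be at least 5 MiB.
    static final long MIN_PART_SIZE = 5L * 1024 * 1024;

    // Returns {offset, length} pairs that cover the whole file, in order.
    static List<long[]> plan(long fileSize, long partSize) {
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("part size is below the 5 MiB minimum");
        }
        List<long[]> parts = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += partSize) {
            // the last part may be shorter than partSize
            long length = Math.min(partSize, fileSize - offset);
            parts.add(new long[] { offset, length });
        }
        return parts;
    }

    public static void main(String[] args) {
        long fileSize = 12L * 1024 * 1024; // a 12 MiB file, chosen for illustration
        List<long[]> parts = plan(fileSize, MIN_PART_SIZE);
        for (int i = 0; i < parts.size(); i++) {
            // part numbers start at 1 in the multipart API
            System.out.println("part " + (i + 1)
                    + ": offset=" + parts.get(i)[0]
                    + " length=" + parts.get(i)[1]);
        }
    }
}
```

Each range would then be sent as one part, and the upload is completed once all parts have arrived; TransferManager hides that bookkeeping.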
4. Listing all files in a bucket
S3 does not really have folders, but we can use a separator (usually /) within file names to simulate them.
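As a rough illustration of how a separator simulates folders, the sketch below models in plain Java what a listing with a prefix and a delimiter returns: keys directly under the prefix, plus "common prefixes" that play the role of sub-folders. It does not call S3; FolderListing and list are made-up names, and the sample keys are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Local model of how a prefix plus a delimiter turns flat keys into "folders".
// Nothing here talks to S3; the class and method names are invented for this sketch.
class FolderListing {
    final List<String> files = new ArrayList<>();            // keys directly under the prefix
    final TreeSet<String> commonPrefixes = new TreeSet<>();  // simulated sub-folders

    static FolderListing list(List<String> keys, String prefix, String delimiter) {
        FolderListing result = new FolderListing();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue; // outside the "folder" being listed
            }
            String rest = key.substring(prefix.length());
            int cut = rest.indexOf(delimiter);
            if (cut < 0) {
                result.files.add(key); // no delimiter left: a plain file
            } else {
                // everything up to and including the first delimiter is one sub-folder
                result.commonPrefixes.add(prefix + rest.substring(0, cut + delimiter.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "readme.txt", "photos/2019/a.jpg", "photos/2019/b.jpg", "photos/cat.jpg");
        FolderListing top = list(keys, "", "/");
        System.out.println("top level: " + top.files + " " + top.commonPrefixes);
        FolderListing photos = list(keys, "photos/", "/");
        System.out.println("photos/:   " + photos.files + " " + photos.commonPrefixes);
    }
}
```

Listing with an empty prefix shows readme.txt as a file and photos/ as a folder; listing with the prefix photos/ shows photos/cat.jpg as a file and photos/2019/ as a folder, even though the bucket only ever holds flat keys.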
A program to list the files in a bucket, ignoring any folders (it includes ALL files, regardless of which folder they belong to), is shown below. It also illustrates a common pattern within the AWS API and SDK: if an operation can return a list of unbounded size, the API returns just a portion of the list, together with a continuation token. If we get a continuation token, we know that we need to call again, passing that token, to get more elements from the list.
Here we use a ListObjectsV2Request, with a deliberately small page size (withMaxKeys(2)) so that the paging is easy to see:
@Grab('com.amazonaws:aws-java-sdk-s3:+')
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;
String bucketName=args[0]
AmazonS3 s3client = new AmazonS3Client();
final ListObjectsV2Request request = new ListObjectsV2Request()
    .withBucketName(bucketName)
    .withMaxKeys(2);
ListObjectsV2Result result;
while (true) {
    result = s3client.listObjectsV2(request);
    for (S3ObjectSummary objectSummary : result.getObjectSummaries()) {
        System.out.println(" - " + objectSummary.getKey() + " " +
            "(size = " + objectSummary.getSize() + ")");
    }
    request.setContinuationToken(result.getNextContinuationToken());
    if (!result.isTruncated())
        break;
}
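The continuation-token pattern can also be tried without an AWS account. The following is a small in-memory model in plain Java, offered only as a sketch: PagedStore, Page and fetchAll are invented names, and the token here is simply a stringified index, whereas real S3 tokens are opaque strings that should only be passed back as received.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory model of continuation-token paging; no AWS calls are made.
// PagedStore, Page and the token format are assumptions made for this sketch.
class PagedStore {
    static class Page {
        final List<String> items;
        final String nextToken; // null when there is nothing more to fetch

        Page(List<String> items, String nextToken) {
            this.items = items;
            this.nextToken = nextToken;
        }
    }

    private final List<String> all;

    PagedStore(List<String> all) {
        this.all = all;
    }

    // "Server" side: returns at most maxKeys items, starting where the token points.
    Page list(String token, int maxKeys) {
        int start = (token == null) ? 0 : Integer.parseInt(token);
        int end = Math.min(start + maxKeys, all.size());
        String next = (end < all.size()) ? String.valueOf(end) : null;
        return new Page(new ArrayList<>(all.subList(start, end)), next);
    }

    // "Client" side: keep calling, passing the token back, until none is returned.
    static List<String> fetchAll(PagedStore store, int maxKeys) {
        List<String> result = new ArrayList<>();
        String token = null;
        do {
            Page page = store.list(token, maxKeys);
            result.addAll(page.items);
            token = page.nextToken;
        } while (token != null);
        return result;
    }

    public static void main(String[] args) {
        PagedStore store = new PagedStore(Arrays.asList("a", "b", "c", "d", "e"));
        // a page size of 2 forces three round trips for five items
        System.out.println(fetchAll(store, 2));
    }
}
```

The client loop has the same shape as the S3 listing code: fetch a page, consume it, and repeat only while the service says there is more.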
5. Bucket with web configuration and authorization for everybody to read
AWSTemplateFormatVersion: '2010-09-09'
Description: My s3 buckets
Resources:
  FirstS3Bucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    Properties:
      VersioningConfiguration:
        Status: Enabled
      BucketName: okaram-first-s3-bucket
      CorsConfiguration:
        CorsRules:
          - AllowedOrigins:
              - '*'
            AllowedMethods:
              - 'GET'
      WebsiteConfiguration:
        # "String" is a placeholder; replace it with the key of your own error page
        ErrorDocument: String
        # "String" is a placeholder; replace it with the key of your own index page
        IndexDocument: String
  AllReadBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      PolicyDocument:
        Id: MyPolicy
        Version: '2012-10-17'
        Statement:
          - Sid: ReadAccess
            Action:
              - s3:GetObject
            Effect: Allow
            Resource: 'arn:aws:s3:::okaram-first-s3-bucket/*'
            Principal: '*'
      Bucket: !Ref FirstS3Bucket
6. TODO
- ls with folders, withDelimiter/withPrefix, getCommonPrefixes
- Permission policies with CF
- Deletion policies etc
- Temporary access URLs