Uploading files to Amazon’s S3 cloud

Uploading files to AWS S3 using Go

My current side project is Glacial, a secure cloud-based document storage & viewing solution for the long-term archiving of medical records & documents. Clinical documents are encrypted locally before leaving the hospital network and are encrypted again using Amazon’s S3 server-side AES256 encryption.

This quick tutorial demonstrates how to upload your file(s) to an AWS S3 bucket using the Go programming language.

The first step is to install the AWS software development kit (SDK) for Go. This is done by running the following go get command at a terminal or command prompt.

    
go get github.com/aws/aws-sdk-go/...
    

Once the AWS SDK has been installed, you’ll then need to import the relevant sections into your program to be able to interact with S3.

    
package main

import (
    "bytes"
    "fmt"
    "log"
    "net/http"
    "os"
       
    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)
    

You’ll need to define a couple of constants. These will hold the AWS region identifier and the name of the S3 bucket that you wish to upload files to.

    
const (
    S3_REGION = "eu-west-1"
    S3_BUCKET = "glacial-io"
)
    

Within the main part of your program, you need to create an AWS session using the NewSession function. In the example below the newly created session is assigned to the s variable. The session is created by passing in the region identifier and your AWS credentials, i.e. your access key ID and secret access key, both of which can be obtained from your AWS account. NewSession returns a pointer to the session along with a non-nil error if something went wrong, e.g. you specified an invalid region.

You should, therefore, check the status of the error variable and handle appropriately for your use case. In this example, we simply log the error to the console as a fatal message.


func main() {

    // create an AWS session which can be
    // reused if we're uploading many files
    s, err := session.NewSession(&aws.Config{
        Region: aws.String(S3_REGION),
        Credentials: credentials.NewStaticCredentials(
            "XXX",
            "YYY",
            ""),
    })
                
    if err != nil {
        log.Fatal(err)
    }
                
    err = uploadFileToS3(s, "discharge-letter-787653.pdf")

    if err != nil {
        log.Fatal(err)
    }
}

Once you’ve got a pointer to a valid AWS session you can reuse that session to upload multiple files. In the above example, we’re only using it once as we pass it to the upload function to upload a discharge letter.

The upload function

The upload function takes two parameters: the pointer to the session object and the name of the file to upload. First, we attempt to open the file. If that fails, for example because the file doesn’t exist, we return the error and exit the function. Otherwise, we calculate the size of the file and read its bytes into a buffer.
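As an aside (this isn’t part of the original listing), since Go 1.16 the open/stat/read sequence can be collapsed into a single os.ReadFile call. A minimal, self-contained sketch, using a throwaway file of my own invention:

```go
package main

import (
	"fmt"
	"log"
	"os"
)

func main() {
	// write a small temporary file so the example is self-contained
	name := "example-upload.txt"
	if err := os.WriteFile(name, []byte("hello, S3"), 0644); err != nil {
		log.Fatal(err)
	}
	defer os.Remove(name)

	// os.ReadFile opens the file, reads its entire contents and closes it
	buffer, err := os.ReadFile(name)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %d bytes\n", len(buffer)) // prints "read 9 bytes"
}
```

The resulting buffer can be passed to bytes.NewReader exactly as in the upload function below.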

We use the PutObject function of the SDK to upload the file. Here we can specify various S3 object parameters: for example, the bucket name and the access control list (ACL), which in this case flags the file as private. We also specify which S3 storage class should be used, and in this example we enable server-side encryption.


func uploadFileToS3(s *session.Session, fileName string) error {
    
    // open the file for use
    file, err := os.Open(fileName)
    if err != nil {
        return err
    }
    defer file.Close()

    // get the file size and read
    // the file content into a buffer
    fileInfo, err := file.Stat()
    if err != nil {
        return err
    }
    size := fileInfo.Size()
    buffer := make([]byte, size)
    if _, err = file.Read(buffer); err != nil {
        return err
    }

    // config settings: this is where you choose the bucket,
    // filename, content-type and storage class of the file
    // you're uploading
    _, err = s3.New(s).PutObject(&s3.PutObjectInput{
        Bucket:               aws.String(S3_BUCKET),
        Key:                  aws.String(fileName),
        ACL:                  aws.String("private"),
        Body:                 bytes.NewReader(buffer),
        ContentLength:        aws.Int64(size),
        ContentType:          aws.String(http.DetectContentType(buffer)),
        ContentDisposition:   aws.String("attachment"),
        ServerSideEncryption: aws.String("AES256"),
        StorageClass:         aws.String("INTELLIGENT_TIERING"),
    })

    return err
}

Storage Classifications

Besides the standard storage class STANDARD and the infrequent access class STANDARD_IA, there is now a new (as of Nov 2018) storage class called INTELLIGENT_TIERING. This is ideal for use cases where file access patterns aren’t known and you’re unsure which class to use. Intelligent tiering attempts to save you money by automatically moving files between the standard and infrequent access tiers based on usage patterns.

You can also now upload files directly to the Glacier storage engine using the value GLACIER.

If cost is an issue and you don’t need the level of redundancy that the other storage classes offer, you can use REDUCED_REDUNDANCY to lower your monthly fees.
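To recap, each of these classes is passed to the StorageClass field as a plain string. The snippet below gathers the values mentioned above; the one-line notes are my own summaries, not AWS documentation:

```go
package main

import (
	"fmt"
	"sort"
)

func main() {
	// S3 storage class strings usable with PutObjectInput.StorageClass;
	// the notes are my own one-line summaries
	classes := map[string]string{
		"STANDARD":            "default class for frequently accessed files",
		"STANDARD_IA":         "infrequent access: cheaper storage, retrieval fee",
		"INTELLIGENT_TIERING": "moves files between tiers based on usage patterns",
		"GLACIER":             "long-term archive, slower retrieval",
		"REDUCED_REDUNDANCY":  "lower redundancy for a lower monthly fee",
	}

	// print the classes in a stable, sorted order
	names := make([]string, 0, len(classes))
	for name := range classes {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-20s %s\n", name, classes[name])
	}
}
```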


If you’ve found this post interesting, you may also like my Go post about using AI to get meaning from an unstructured clinical document.