Building this blog: Image storage on S3

While it may be convenient to host images in a git repository, there are several reasons why it's not an ideal long-term solution. Out of the many options available, I decided, mostly for convenience, to start using AWS S3 to host my images. In this post, I want to share the reasoning behind that decision and a simple guide on how you can set up your own S3 bucket to host public images.

Why Not Git?

  1. Repository Size: Each time an image is added, the repo size grows. While Git is excellent for versioning code, it’s not designed for large binary files like images. Over time, this can lead to painfully slow clone times for anyone trying to work on the repo.
  2. Versioning Issues: If you ever update an image, Git keeps a copy of both the old and new versions, further bloating the repo.
  3. Cost: Git platforms (like GitHub, GitLab, Bitbucket) often have storage limits. Exceeding them might cost you more than an S3 bucket would.

Git LFS

One alternative to storing your images directly in Git is Git LFS (Large File Storage), an extension for Git that handles the versioning of large files. Instead of committing file contents to the repository, LFS stores lightweight pointers there and keeps the actual contents on a remote server, which keeps the repository size manageable and cloning and fetching efficient.
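As a sketch of how this looks in practice: with the LFS extension installed, running `git lfs install` once and then `git lfs track "*.png"` (the `*.png` pattern is just an example) records a tracking rule in your repo's `.gitattributes` file like the following:

```
*.png filter=lfs diff=lfs merge=lfs -text
```

Commit the `.gitattributes` file, and from then on matching files are stored as pointers rather than full binary blobs.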

Cloud Object Storage

There are a number of cloud-based object storage services that let you upload files and optionally host them publicly. I saw this as a more convenient path than setting up Git LFS. Here are just a few of the services available today:

  1. AWS S3: Amazon's scalable cloud storage service for storing and retrieving any amount of data from anywhere on the web.
  2. Google Cloud Storage (GCS): GCS is Google’s counterpart to S3. It offers similar storage capabilities, competitive pricing, and tight integration with other Google Cloud services.
  3. DigitalOcean Spaces: An object storage service that promises simplicity and speed. It integrates effortlessly with the broader DigitalOcean ecosystem and provides a CDN out of the box.
  4. Azure Blob Storage: From Microsoft’s cloud suite, it’s a versatile object storage solution with integrations across Azure’s wide range of services.

Setting Up an S3 Bucket for Public Images

  1. Sign in to AWS Management Console
  2. Navigate to S3
  3. Create a New Bucket:
    • Click the “Create Bucket” button.
    • Choose a unique name for your bucket and select a region close to your audience.
    • Uncheck “Block all public access” (but be careful, this means the bucket is public!).
    • Acknowledge the warnings and create the bucket.
  4. Set Bucket Permissions:
    • Go to the “Permissions” tab.
    • Click on “Bucket Policy”.
    • Add a policy to allow public read access:
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Sid": "PublicReadForGetBucketObjects",
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["s3:GetObject"],
        "Resource": ["arn:aws:s3:::YOUR_BUCKET_NAME/*"]
      }]
    }
    • Replace YOUR_BUCKET_NAME with the name of your bucket.
  5. Upload Images:
    • Go to the “Objects” tab.
    • Click “Upload” and select the images you wish to upload.
  6. Access Your Images:
    • After uploading, you can access each image via: https://YOUR_BUCKET_NAME.s3.REGION.amazonaws.com/IMAGE_NAME (the s3-website endpoint only works if you also enable static website hosting, which isn't needed here).
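If you'd rather script steps 4 and 6 than click through the console, a small Python sketch can generate the bucket policy and build the public object URL. The bucket name, region, and key below are placeholders, not real resources:

```python
import json

def public_read_policy(bucket: str) -> str:
    """Return a bucket policy JSON string granting public read on every object (step 4)."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadForGetBucketObjects",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{bucket}/*"],
        }],
    }
    return json.dumps(policy, indent=2)

def public_object_url(bucket: str, region: str, key: str) -> str:
    """Build the virtual-hosted-style URL for a public object (step 6)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"

print(public_read_policy("my-blog-images"))
print(public_object_url("my-blog-images", "us-east-1", "header.png"))
# → https://my-blog-images.s3.us-east-1.amazonaws.com/header.png
```

The policy string can be pasted into the console's "Bucket Policy" editor or passed to the AWS CLI's `put-bucket-policy` command.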

Conclusion

If you're blogging, especially with many images or other media, offloading them to a dedicated storage solution like Amazon S3 is a smart move. It keeps your Git repo lean and mean, while providing a scalable, reliable, and cost-effective way to serve your content.