
Digital Antiquity Backup Utility

...

This document describes Digital Antiquity’s archival and retrieval procedures for its assets, including the tDAR resource filestore, the tDAR PostgreSQL metadata database, and Digital Antiquity websites.


Note: This document is not an installation guide or a tutorial.  Installation, usage, and configuration instructions are available on Digital Antiquity's Bitbucket site.

 

Process Summary


Digital Antiquity will perform a routine full backup of assets and transfer these assets to the Amazon Glacier service (by way of Amazon’s S3-to-Glacier utility).  Digital Antiquity will augment these full backups with smaller, differential backups that occur on a more frequent schedule.  Digital Antiquity will compress & encrypt all data prior to sending it to Glacier.
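As an illustration of what a differential backup has to select, the sketch below lists every file modified since the last full backup. It is a minimal example and not the backup utility's actual implementation: the function name `files_changed_since` and the use of file modification times as the change test are assumptions made for this example.

```python
import os
import tempfile
import time


def files_changed_since(root, since_epoch):
    """Return sorted relative paths under root modified after since_epoch.

    Hypothetical helper: a differential backup contains everything that
    changed since the last FULL backup, so since_epoch is that backup's time.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.getmtime(full) > since_epoch:
                changed.append(os.path.relpath(full, root))
    return sorted(changed)


if __name__ == "__main__":
    # Build a throwaway directory with one stale file and one fresh file.
    with tempfile.TemporaryDirectory() as root:
        last_full = time.time()
        for name, offset in (("old.txt", -3600), ("new.txt", 60)):
            path = os.path.join(root, name)
            with open(path, "w") as fh:
                fh.write("data")
            os.utime(path, (last_full + offset, last_full + offset))
        print(files_changed_since(root, last_full))
```

Because each differential is measured against the last full backup rather than the previous differential, a restore needs only the full backup plus the most recent differential.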

...

We plan to use Amazon’s automated S3-to-Glacier transfer functionality.  More information can be found here: https://aws.amazon.com/blogs/aws/archive-s3-to-glacier/.
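The S3-to-Glacier transfer is driven by a lifecycle rule attached to a bucket. The sketch below builds such a rule as a plain Python dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, the prefix, the one-day delay, and the bucket name in the comment are illustrative assumptions, not Digital Antiquity's actual settings.

```python
import json


def glacier_lifecycle_config(prefix="", days=0, rule_id="archive-to-glacier"):
    """Build a lifecycle configuration that moves matching objects to Glacier.

    prefix  : only objects whose key starts with this string are affected
    days    : age (in days) at which the transition happens
    rule_id : hypothetical label for the rule
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


if __name__ == "__main__":
    config = glacier_lifecycle_config(prefix="snapshot/", days=1)
    print(json.dumps(config, indent=2))
    # With boto3 installed and credentials configured, the rule could be
    # applied like this (bucket name is a placeholder):
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="example-backup-bucket", LifecycleConfiguration=config)
```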


Amazon Glacier is Amazon’s data archival service.   Glacier provides low-cost, durable storage that is tailored for data archival and backup services.  

...

Amazon S3 is a near-realtime online storage service.  While it can serve as a backup destination, it is tailored more for low-latency & high-availability file access, and S3 pricing reflects this.  S3 has been in service longer than Glacier, and benefits from a good selection of mature 3rd-party file transfer utilities.



Amazon Primers

Amazon S3 Filesystem Layout

  • Buckets  

    • Top-level container

    • Container for objects

    • Non-hierarchical (no buckets in buckets)

  • Objects

    • Essentially files

    • Have name, permissions.

  • Folders

    • Hierarchical

    • Don’t really exist. Serve as a construct when downloading and visualizing.

    • Internally, just a prefix prepended to object name.

  • Limits

    • # of buckets: limited per account by an AWS quota (historically 100 by default)

    • Unlimited # of objects per bucket

    • Max object size: 5TB
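To make the point about folders concrete: a bucket holds a flat set of object keys, and a "folder" view is produced by grouping keys on a delimiter. The sketch below mimics that grouping locally, without contacting S3. The function name `list_folder` and the sample keys are made up for this example.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Mimic a delimiter-based listing over flat object keys.

    Returns (subfolders, objects) that sit directly under prefix.
    """
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _tail = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so this key lives in a "subfolder".
            folders.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(folders), sorted(objects)


if __name__ == "__main__":
    # Hypothetical keys; the bucket itself stores no directory entries.
    keys = [
        "snapshot/manifest.txt",
        "snapshot/backup.tar.gz",
        "diffs/manifest.txt",
        "readme.txt",
    ]
    print(list_folder(keys))
    print(list_folder(keys, prefix="snapshot/"))
```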

...

  • Vault

    • Top-level container, akin to an S3 bucket

  • Archive

    • Roughly akin to an S3 object.

  • Limits: effectively none, for a filesystem of our size (or a filesystem 1000x our size)



How Manifests Work

How S3-to-Glacier works

How to copy to S3


Suggested S3 File Layout

  • The basics

    • One bucket per “app” (e.g. “tdar filesystem”, “postgres”, “jira”, etc.)

    • Snapshot contained in “snapshot” subfolder

    • Differential backups in “diffs” subfolder

    • Each folder contains:

      • one object containing manifest file(s)

      • one object containing winterized backup + manifest file(s)
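The rules above could be turned into bucket names and object keys along the following lines. This is only a sketch: the `example-` bucket prefix and the object names `manifest.txt` and `backup.tar.gz.enc` are hypothetical placeholders, not names used by the backup utility. S3 bucket names cannot contain spaces or uppercase letters, so app labels such as “tdar filesystem” are normalized first.

```python
import re

FOLDERS = ("snapshot", "diffs")  # the two subfolders named in the layout


def bucket_name(app, prefix="example-"):
    """Turn an app label into a bucket-safe name (prefix is a placeholder)."""
    slug = re.sub(r"[^a-z0-9]+", "-", app.lower()).strip("-")
    return prefix + slug


def layout_keys(folder, manifest_name="manifest.txt",
                archive_name="backup.tar.gz.enc"):
    """Return the two object keys a folder holds: manifest and backup archive.

    Both default file names are hypothetical.
    """
    if folder not in FOLDERS:
        raise ValueError("folder must be one of %r" % (FOLDERS,))
    return [folder + "/" + manifest_name, folder + "/" + archive_name]


if __name__ == "__main__":
    for app in ("tdar filesystem", "postgres", "jira"):
        for folder in FOLDERS:
            for key in layout_keys(folder):
                print(bucket_name(app), key)
```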

Example Layout

...