In the last installment of this blog, we discussed the Hierarchical Storage Management (HSM) system and the differences between Backup and Archive. Here we will go further into those differences, specifically from the user’s perspective. Recall that an HSM:

  • automatically manages storage subsystems in a tiered virtual environment
  • continuously monitors and automatically moves files and data between different storage tiers
  • allows files and data to always appear to be online to end users and applications, regardless of the actual storage location

Thus, the day-to-day use of an HSM in HPC is, and should be, transparent. User-level access to files should be identical with or without an HSM.
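
To make the idea of transparency concrete, here is a minimal sketch of a two-tier store in Python. It is a toy model, not how any particular HSM product works: the class `ToyHSM`, its `max_idle` threshold, and the tier names "fast" and "slow" are all hypothetical names chosen for illustration. The point is only that the caller uses the same `read` call whether the file sits on the fast tier or has been migrated to the slow one.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHSM:
    """Toy two-tier store (hypothetical): 'fast' ~ disk, 'slow' ~ tape."""
    max_idle: float                            # idle time allowed on the fast tier
    fast: dict = field(default_factory=dict)   # name -> (data, last_access_time)
    slow: dict = field(default_factory=dict)   # name -> data

    def write(self, name, data, now):
        # New or updated files always land on the fast tier.
        self.slow.pop(name, None)
        self.fast[name] = (data, now)

    def read(self, name, now):
        # The caller never says which tier to use: a file on the slow
        # tier is recalled to the fast tier before it is returned.
        if name in self.slow:
            self.fast[name] = (self.slow.pop(name), now)
        data, _ = self.fast[name]              # KeyError if the file is unknown
        self.fast[name] = (data, now)          # record the access time
        return data

    def migrate(self, now):
        # Policy pass: move files idle for longer than max_idle to the slow tier.
        cold = [n for n, (_, t) in self.fast.items() if now - t > self.max_idle]
        for n in cold:
            data, _ = self.fast.pop(n)
            self.slow[n] = data
        return cold

    def tier_of(self, name):
        if name in self.fast:
            return "fast"
        if name in self.slow:
            return "slow"
        raise KeyError(name)


hsm = ToyHSM(max_idle=10)
hsm.write("results.dat", b"abc", now=0)
hsm.migrate(now=100)                    # file is now on the slow tier
data = hsm.read("results.dat", now=101) # same call as before migration
```

In a real system the recall from the slow tier may take noticeably longer than a read from disk, but the path and the call the user makes stay the same.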

[Note that even in the presence of an HSM, a Backup System is also highly recommended, if not required. With or without an HSM, I strongly recommend a good Backup System.]

In contrast, Backup Systems are generally controlled by system administrators and are not directly accessible to users. They are managed with the whole of the HPC resources in mind and run in the background, performing a “backup” independently of the user community. The capabilities of the Backup System are made available to users only in the case of data loss.

Simply put, a Backup System places a copy of all files or data in a secure place and requires a process to get them out of the Backup System. In contrast, an Archive System requires a process to place copies of files or data into the archive, but, once in, the files or data are generally accessible without a prolonged process.
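
The asymmetry described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `ToyBackup`, `ToyArchive`, and the `approved_by_admin` flag are hypothetical names, and the flag stands in for whatever restore request process a site actually uses.

```python
class ToyBackup:
    """Hypothetical backup: copies go in automatically, come out by request."""

    def __init__(self):
        self._copies = {}

    def snapshot(self, live_files):
        # Run by administrators in the background; users do nothing.
        self._copies = dict(live_files)

    def restore(self, name, approved_by_admin=False):
        # Getting data out requires a process (modeled here as a flag).
        if not approved_by_admin:
            raise PermissionError("restore requires a request to the administrators")
        return self._copies[name]


class ToyArchive:
    """Hypothetical archive: users put data in deliberately, then read it freely."""

    def __init__(self):
        self._items = {}

    def deposit(self, name, data):
        # Placing data into the archive is the explicit step.
        self._items[name] = data

    def fetch(self, name):
        # Once deposited, data is available without a prolonged process.
        return self._items[name]


backup = ToyBackup()
backup.snapshot({"run1.out": b"42"})

archive = ToyArchive()
archive.deposit("run1.out", b"42")
archived = archive.fetch("run1.out")
```

Here the effort sits on opposite sides: the backup needs no user action to fill but a request to restore, while the archive needs a deliberate deposit but no request to read.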

In the next part of this blog, we will continue to discuss HSM, Backup and Archive and their interaction. If you have comments or can contribute additional information, please feel free to do so.

