Storage - System Overview
Introduction
We have 500 to 700 GB of data to store on tape and HDDs each day, so we need a properly designed pipeline.
This section has two parts: utilizing the tape, and the structure of the pipeline.
Utilizing the Tape
Loading a Tape Cartridge
Before a tape can be read, it has to be loaded into the tape drive first. This can be easily done using MSL2024's web management interface.
Data Storage on Tape
Before storing data on tape, I had to figure out how the tape system works and how much data each tape can hold. I connected the tape drive to compression server A using an LC-LC fiber cable, and connected the library management interface via an RJ45 cable.
Following the instructions on HP's website, I downloaded HPE Library and Tape Tools, HPE LTFS Cartridge Browser, and HPE LTFS Configuration Tool. LTFS stands for Linear Tape File System, which is required to store data on the tape. Because of the physical limitations of tape, this file system does not support reclaiming space on deletion. When a file is removed, it disappears from the file system, but the space it occupied is not released. The only way to reclaim the space is to format the entire tape.
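The space accounting described above can be illustrated with a toy model. This is only a sketch of the observed behaviour, not how LTFS is implemented; the class name AppendOnlyTape and the 100 GB capacity are made up for illustration.

```python
class AppendOnlyTape:
    """Toy model of append-only space accounting (hypothetical, not the LTFS code)."""

    def __init__(self, capacity_gb: float) -> None:
        self.capacity_gb = capacity_gb
        self.used_gb = 0.0                  # space consumed on the medium
        self.index: dict[str, float] = {}   # files visible in the file system

    def write(self, name: str, size_gb: float) -> None:
        if self.used_gb + size_gb > self.capacity_gb:
            raise OSError("tape full")
        self.index[name] = size_gb
        self.used_gb += size_gb             # new data is always appended

    def delete(self, name: str) -> None:
        del self.index[name]                # the file disappears, used_gb does not shrink

    def format(self) -> None:
        self.index.clear()
        self.used_gb = 0.0                  # only a full format reclaims space

    @property
    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


tape = AppendOnlyTape(capacity_gb=100)      # illustrative capacity only
tape.write("batch1.bin", 60)
tape.delete("batch1.bin")
print(tape.index, tape.free_gb)             # no files listed, yet only 40 GB free
tape.format()
print(tape.free_gb)                         # full capacity available again
```

The practical consequence is that a tape should be treated as write-once until it is deliberately reformatted.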
Tape Capacity
The tape we are using is LTO-6, which has an advertised capacity of 6.25 TB per tape. However, LTO drives have built-in compression, and the 6.25 TB figure refers to the size of the data before compression, assuming a 2.5:1 compression ratio. The native (uncompressed) capacity is 2.5 TB. In our case, where the data is already compressed, we cannot expect to store anywhere near 6 TB per tape.
The OS reports that each tape can store 2.29 TB of data, but this figure is less reliable than the capacity reported for an HDD. After quick testing, I found that each tape holds about 2.10 to 2.25 TB of data.
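As a rough planning aid, the measured tape capacity can be combined with the daily data volume from the introduction. The sketch below assumes decimal units (1 TB = 1000 GB); the function names and the 30-day horizon are illustrative choices, not part of the original setup.

```python
# Figures from this document, expressed in GB (assuming 1 TB = 1000 GB).
DAILY_GB_MIN, DAILY_GB_MAX = 500, 700
TAPE_GB_MIN, TAPE_GB_MAX = 2100, 2250


def days_per_tape(tape_gb: int, daily_gb: int) -> float:
    """Number of days of data that fit on one tape."""
    return tape_gb / daily_gb


def tapes_needed(days: int, daily_gb: int, tape_gb: int) -> int:
    """Whole tapes needed to hold `days` days of data (integer ceiling division)."""
    total_gb = days * daily_gb
    return -(-total_gb // tape_gb)


worst = days_per_tape(TAPE_GB_MIN, DAILY_GB_MAX)  # smallest tape, busiest day
best = days_per_tape(TAPE_GB_MAX, DAILY_GB_MIN)   # largest tape, quietest day
print(f"One tape holds about {worst:.1f} to {best:.1f} days of data")
print(f"Tapes for 30 days, worst case: {tapes_needed(30, DAILY_GB_MAX, TAPE_GB_MIN)}")
```

Under these assumptions one tape covers about 3 to 4.5 days of data, so the library will need a tape change at least twice a week.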
Disabling Compression
During the test, I found the data transfer speed to be very unstable. It ranged from 160 MB/s down to 0 KB/s, which is puzzling for a linear file system without a cache. I guessed that compression could be the cause, so I turned it off. However, the problem persisted with compression disabled. The amount of data that fits on a tape did not change either, so I decided to leave compression turned off.
Copying Tools
Since compression was not the problem, I suspected Windows Explorer. Tape is a rare storage medium today, and Windows is very good at doing "smart tricks underneath", so it is plausible that Windows does not work well with tape. I therefore tried FastCopy, which is generally a better tool than Windows Explorer for large-scale data transfers.
The result, however, was even worse. FastCopy was not designed for tape, and mechanisms such as parallel transfers can severely degrade performance on a linear file system. Since the total amount of data copied to tape with Windows File Explorer each day satisfies our needs, I decided to set this problem aside for now.
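If the problem is revisited, one option is a copy script that writes strictly one file at a time in large sequential chunks and records the throughput, so that the speed drops can be observed directly. The following is a minimal sketch, not a tested replacement for Explorer: the 64 MiB chunk size is an untuned guess, the function names are hypothetical, and the per-chunk timings include any OS buffering, so they only approximate the real tape speed.

```python
import time
from pathlib import Path

CHUNK_BYTES = 64 * 1024 * 1024  # large sequential chunks; an assumption, not a tuned value


def copy_sequential(src: Path, dst: Path, chunk_bytes: int = CHUNK_BYTES) -> list[float]:
    """Copy src to dst as a single sequential stream; return MB/s for each chunk."""
    speeds = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            start = time.perf_counter()
            chunk = fin.read(chunk_bytes)
            if not chunk:
                break
            fout.write(chunk)
            elapsed = time.perf_counter() - start
            speeds.append(len(chunk) / 1e6 / elapsed if elapsed > 0 else float("inf"))
    return speeds


def copy_batch(files: list[Path], dst_dir: Path) -> None:
    """Copy files one after another, never in parallel."""
    for src in files:
        speeds = copy_sequential(src, dst_dir / src.name)
        if speeds:
            print(f"{src.name}: min {min(speeds):.1f} MB/s, max {max(speeds):.1f} MB/s")
```

Here dst_dir would be the mount point of the LTFS volume. Copying in sorted order, one file at a time, avoids the interleaved writes that parallel copy tools produce.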
Structure
Since copying files to tape takes hours to finish, we have to allow an additional day before each batch of data can be processed.
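The copy time can be estimated from the daily volume and the sustained transfer speed. In the sketch below, 160 MB/s is the peak speed observed during testing; the 80 and 40 MB/s rows are hypothetical averages chosen only to show how quickly the duration grows when the speed is unstable.

```python
def copy_hours(data_gb: float, avg_mb_per_s: float) -> float:
    """Hours needed to copy data_gb at a sustained average speed (1 GB = 1000 MB)."""
    return data_gb * 1000 / avg_mb_per_s / 3600


for data_gb in (500, 700):
    # 160 MB/s is the observed peak; 80 and 40 MB/s are hypothetical averages.
    for speed in (160, 80, 40):
        print(f"{data_gb} GB at {speed} MB/s: {copy_hours(data_gb, speed):.1f} h")
```

Even at the peak speed a full day's data takes roughly an hour to write, and at an average of 40 MB/s the larger batches approach five hours.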