Raid-to-Silo Transfer Strategy

Below is a proposal for a raid-to-silo transfer strategy for moving Hall D data files from our local raid server to the JLab tape storage facility. We will update this as our ideas develop.

Elliott Wolin

Dave Lawrence

24-Oct-2013

Notes

We will use the jmirror facility from the Computer Center to transfer the files.
jmirror deletes the link to the file when the transfer is complete. It does not delete directories, only files.
jmirror is fairly smart and reliable. It only deletes the hard link when the file is safely transferred.
CRON jobs will delete unneeded dirs after their contents are safely transferred.
jmirror is run periodically via a CRON job; it is not a transfer server system. It transfers whatever files it finds each time it is run.
jmirror will not transfer files actively being written to, nor transfer files twice if invoked twice.
Additional hard links to the data file are untouched by jmirror. These can be used to keep the file on disk after transfer.
If files are kept, they will be deleted "just-in-time" to make room for new DAQ files. This will require a cleanup strategy and cron scripts to implement it.
The DAQ creates a 10 GB file every 30 secs, about 1 TB/hour. Thus a two hour run generates 2 TB.
It is preferable to transfer files as they are ready for transfer, and not wait for the run to end before initiating transfer.
The simplest way to implement immediate transfer is for run control to run a script every time the ER closes a file.
Vardan and Carl are working out a simple scheme to allow users to specify such a script and have it run when a file is closed.
Mark I prefers to store files by "run period" with a simple naming scheme (RunPeriod001, RunPeriod002 or similar).
Run periods are just date ranges. Run numbers will NOT be reused, i.e. all run numbers are unique across all run periods.
Due to constraints in the mss, a second level of directories is needed. Mark and I propose simply organizing files by run, e.g. Run000001, Run000002, etc.
Run files will have the run number in their names, e.g. Run000001.evio.001, Run000001.evio.002, etc.
A two-hour run will generate around 250 files.
The RAID system stripes data across all disks, independent of logical partitioning.
RAID disk partitions do not seem to be needed (see below); they can be implemented later if necessary.
Hard links cannot span partitions: ln fails across partitions and mv falls back to a physical copy, so files have to be physically copied to put them on a different partition.
The RAID server must simultaneously read and write at 300 MB/s, so it is best to avoid additional file copying.
Note that we have two completely independent RAID servers, 75 TB each.
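
The hard-link behavior described in the notes above can be illustrated with a short Python sketch. It does not show how jmirror itself is implemented: the keep_on_disk helper, the "staging" and "keep" directory names, and the demo file are hypothetical, and the os.remove call merely stands in for jmirror dropping its link after a successful transfer. Both paths have to be on the same partition, since hard links cannot cross partitions.

```python
import os
import tempfile


def keep_on_disk(data_file, keep_dir):
    """Create a second hard link to data_file inside keep_dir (hypothetical helper).

    Both paths must be on the same partition; os.link raises OSError otherwise.
    """
    os.makedirs(keep_dir, exist_ok=True)
    kept = os.path.join(keep_dir, os.path.basename(data_file))
    os.link(data_file, kept)  # no data is copied, only a new directory entry
    return kept


def demo():
    """Return (link count after linking, whether the kept link still holds the data)."""
    with tempfile.TemporaryDirectory() as top:
        staging = os.path.join(top, "staging")
        os.makedirs(staging)
        original = os.path.join(staging, "Run000001.evio.001")
        with open(original, "wb") as f:
            f.write(b"event data")

        kept = keep_on_disk(original, os.path.join(top, "keep"))
        links = os.stat(kept).st_nlink

        # Stand-in for jmirror removing its link once the file is safely on tape.
        os.remove(original)
        with open(kept, "rb") as f:
            survived = f.read() == b"event data"
        return links, survived
```

Because the extra link is only a directory entry, keeping a file this way costs no additional read or write bandwidth on the RAID server.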

Notes for Dec 2013 Online Data Challenge

We plan to use a basic automated file transfer mechanism in December that deletes files on transfer. If someone has the time, we'll try just-in-time deletion.
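
For the just-in-time deletion option, one possible cleanup pass is sketched below in Python. It assumes, hypothetically, that files kept after transfer live under a single keep directory as the last remaining hard links, and that deleting the oldest files first is acceptable. The function names, parameters, and the 10 GB default (the size of one DAQ file) are illustrative and not part of jmirror or the DAQ.

```python
import os
import shutil

GB = 10**9


def oldest_first(keep_dir):
    """Return all files under keep_dir sorted by modification time, oldest first."""
    paths = []
    for root, _dirs, files in os.walk(keep_dir):
        for name in files:
            paths.append(os.path.join(root, name))
    return sorted(paths, key=os.path.getmtime)


def free_space_for_next_file(keep_dir, needed_bytes=10 * GB, free_bytes=None):
    """Delete kept files, oldest first, until needed_bytes are free.

    free_bytes is a callable returning the bytes currently free; by default it
    queries the filesystem that holds keep_dir. Space is only reclaimed if the
    deleted path was the last hard link to the file (assumed here).
    Returns the list of deleted paths.
    """
    if free_bytes is None:
        free_bytes = lambda: shutil.disk_usage(keep_dir).free
    deleted = []
    for path in oldest_first(keep_dir):
        if free_bytes() >= needed_bytes:
            break  # enough room for the next DAQ file
        os.remove(path)
        deleted.append(path)
    return deleted
```

A cron script could call free_space_for_next_file on the keep directory at an interval shorter than the time the DAQ takes to write one file, so that room is made shortly before it is needed.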

Proposal

