Using the Grid
Revision as of 21:28, 1 January 2013
A Quick Start Howto for GlueX Members
The GlueX collaboration is registered as a Virtual Organization (VO) within the Open Science Grid (OSG). One requirement for all VOs is that someone within the collaboration be the primary point of support for the members of the VO, and provide both documentation and assistance in problem-solving for users of grid resources and tools within the collaboration. I am that point of support for GlueX members, for the period leading up to the start of physics data taking. In return for the support I provide to Gluex VO users, I have direct access to experts within the OSG central support to help with issues that are beyond my control or expertise. This document is the starting point for my support to GlueX members. Feedback on the accuracy, organization, and general usefulness of this page will be appreciated. GlueX members who have read this document can use the above email link to request any follow-up assistance that they require.
Who should read this document?
This document was written for those directly involved in the production of large-scale physics simulation data sets, and the end-users who want easy access to these data for carrying out physics analyses. In the past this has primarily been graduate students. Blake Leverington has written a detailed beginner's guide based on his experience, and Jake Bennett has provided a separate instructions page addressing issues he encountered that were not covered by Blake. Those pages will be useful to those who find this document assumes too much background knowledge, or uses too much grid jargon. On the other hand, if you just want to get started with a minimum of knowledge of the inner workings of the grid, and don't need every term defined and concept explained, this document is the place to start.
Before you can start to use the grid, you need to install two objects in your login environment: your personal grid certificate, and the OSG Client Software bundle. The OSG Client is already installed for your use on the JLab ifarm1101 and ifarm1102 machines in the /apps area. Simply source /apps/osg/osg-client/setup.[c]sh to include the tools in your path. Personally, I find that there are things in the OSG Client that mess with the standard JLab CUE environment (like its own release of the python interpreter), so I set up a one-line shell script called "osg" that I put in my ~/bin directory, so that I can type the command "osg <command> <args>..." and have <command> execute with the OSG Client tools ahead of everything else in my path.
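A wrapper of the kind described above might look roughly like the following sketch. It is written here as a shell function rather than a script file, and it sources the OSG Client setup inside a subshell so that the rest of the login environment is left untouched. The OSG_SETUP override variable is an assumption added for illustration; only the /apps/osg/osg-client/setup.sh path comes from the text.

```shell
# Hypothetical "osg" wrapper: run one command with the OSG Client tools
# placed ahead of everything else in PATH.
# OSG_SETUP is an illustrative override; the default is the path quoted above.
osg() {
    (
        # The subshell keeps the OSG settings from leaking into the caller.
        . "${OSG_SETUP:-/apps/osg/osg-client/setup.sh}" || exit 1
        "$@"
    )
}

# Example (on ifarm1101 or ifarm1102):
#   osg voms-proxy-info -all
```

The same two lines (source the setup file, then run "$@") could equally be saved as an executable file named osg in ~/bin.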
Installing the OSG Client bundle on your own Linux/Mac desktop is a good idea, but it is beyond the scope of this document. I have collected a set of instructions for installing the grid tools on Mac OS/X from various sources on the web. The OSG Grid Operations Center (GOC) distributes a standard set of RPMs that make it extremely simple to install the OSG Client on a Linux desktop. Root access is required. The OSG Client is not presently available for Microsoft Windows.
If you have never had a grid certificate, go to this separate page for instructions on how to obtain one and have it registered with the Gluex VO. The certificate itself is public, but the private key that comes with it should be carefully guarded and protected with a secure password. When the two are bundled together in a single file encrypted with a password, it is called a 'PKCS12' file (file extension .p12, or sometimes .pfx on Windows). Import this file into your favorite browser(s) so that you can use it to authenticate yourself to secure web services when needed, and into your email client so you can use it to attach your own digital signature to email messages when requested. However, the most important use of your certificate will be when you use it to generate a proxy. For that to work, save a copy of your .p12 file at ~/.globus/usercred.p12 in your home directory on your Linux/Mac work machine.
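Saving the copy can be sketched as below. The helper name install_usercred and the example file name are invented for illustration. Restricting the file to its owner with chmod 600 is a precaution added here rather than something the text requires, although proxy tools may refuse a credential file that other users can read.

```shell
# Copy a PKCS12 bundle to the location where the proxy tools look for it.
# Arguments: source .p12 file, optional target directory (default ~/.globus).
install_usercred() {
    src=$1
    dir=${2:-$HOME/.globus}
    mkdir -p "$dir" &&
        cp "$src" "$dir/usercred.p12" &&
        chmod 600 "$dir/usercred.p12"   # owner read/write only
}

# Example (hypothetical file name):
#   install_usercred ~/Downloads/mycert.p12
```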
A proxy is an authorization from a grid security hub that identifies you as a valid member of the Gluex VO and authorizes you for access to grid resources. You can do nothing on the grid without your proxy. The proxy resides within your unix (or Mac OS/X) shell environment, and gets picked up by grid tools and passed around the network together with your service requests. Generate your proxy with a command like the following.
$ voms-proxy-init -voms Gluex:/Gluex -valid 24:0
The VOMS (VO Membership Service) is a grid service that knows about the Gluex VO. Note that Gluex (lowercase x) is an OSG entity, whereas GlueX is the scientific collaboration that owns it. The string Gluex:/Gluex directs the proxy builder to contact the Gluex VOMS server (Gluex:) and ask for basic rights as a Gluex member. A longer string like Gluex:/Gluex/production or Gluex:/Gluex/software/role=admin would grant you extra privileges, but you do not need these to get started. The validity lifetime string 24:0 means that the proxy will be good for 24 hours and 0 minutes, starting from when it is approved. It should take less than 5 seconds to create or renew a proxy with the above command. You can renew it at any time. For safety, when you are done with it you can destroy it with the command voms-proxy-destroy.
Accessing stored data over SRM
The Storage Resource Manager (SRM) is a standard interface for accessing data that are stored on the grid. Individual files and directories are identified in a similar fashion to web pages, by a URL. The SRM server does not actually contain the data, but acts as a redirector that instructs the grid data transport layers where any particular item of data is stored (possibly in multiple places) and what the preferred protocols are for accessing it at each location. For example, imagine that the results of a grid simulation have been made available in an SRM folder called dc1.1-12-2012 that is visible at server grinch.phys.uconn.edu. If one knows the name of a data file, it can be listed with its size using a command like the following.
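A listing command of this kind might look like the sketch below. The URL layout is borrowed from the srmcp example further down this page, and the helper name srm_url is invented for illustration; the exact form of the original command is an assumption.

```shell
# Build the SRM URL of a file on grinch.phys.uconn.edu.
# Arguments: folder name, file name.
srm_url() {
    printf 'srm://grinch.phys.uconn.edu/Gluex/%s/%s\n' "$1" "$2"
}

# Example (needs the grid tools and a valid proxy):
#   srmls "$(srm_url dc1.1-12-2012 dana_rest_1000596.hddm)"
```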
This dc1.1-12-2012 folder could contain several hundred thousand files, so do not attempt to list the entire directory using srmls. Other means exist that are much more efficient for browsing the grid file catalog. Only use srmls if you know the name of the file(s) you want to access. Having confirmed that the above file dana_rest_1000596.hddm exists on the SRM at the above location, one can then fetch a copy of it using a command like the following.
srmcp srm://grinch.phys.uconn.edu/Gluex/dc1.1-12-2012/dana_rest_1000596.hddm \
    file:///`pwd`/dana_rest_1000596.hddm
This command pulls a copy of the file from the SRM to your local working directory. This is fine for many situations, and would be preferred in cases where one plans to read the data multiple times.
An alternative that should be considered in some cases would be to leave the data on the SRM and "mount" the directory on your local machine. The grid client software provides this functionality through a feature of the Linux kernel called FUSE (filesystem in userspace). The following command creates a new mount point xxx and uses it to mount the SRM directory referenced above within the user's local work space.
mkdir xxx
gfalFS xxx srm://grinch.phys.uconn.edu/srm/managerv2?SFN=/Gluex/dc1.1-12-2012
Files that reside in that directory on the SRM can now be opened and read by local programs as if they were on the local disk. When finished, the user must unmount the SRM folder as follows.
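One way to unmount is sketched below. It assumes the gfalFS package provides a companion gfalFS_umount command and falls back to the generic FUSE tool fusermount -u when that command is not found. The helper name srm_unmount is invented for illustration.

```shell
# Unmount a directory that was mounted with gfalFS.
# Argument: the mount point (xxx in the example above).
srm_unmount() {
    if command -v gfalFS_umount >/dev/null 2>&1; then
        gfalFS_umount "$1"      # assumed companion tool of gfalFS
    else
        fusermount -u "$1"      # generic FUSE unmount on Linux
    fi
}

# Example:
#   srm_unmount xxx
```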
The FUSE access method avoids the need for allocating large work disk areas for staging files to be analyzed when all one wants to do is to read through the data once and skim off the relevant bits for a particular analysis. If one plans to read through the entire data set multiple times, it probably makes sense to download it once to a staging area and access it locally, rather than stream dozens of terabytes multiple times over the internet. Better yet, users can avoid loading their local network altogether by submitting their skim application to run on the grid using grid job submission tools (see below), and only download the results of the skim when it is finished. At this point there are no fixed policies for data access on SRM by Gluex members. Users can experiment with different methods and adopt those that make the most sense and work best for their own problem.
Nothing prevents someone from mounting the above directory and trying to do "ls -l xxx" to list all the files in the dc1.1-12-2012 directory. However, this command will hang for a very long time and probably time out because there are many thousands of files in this SRM folder, and the SRM does not handle long file lists efficiently. The grid file catalog provides tools for efficiently browsing the grid data directory in real time. Until Gluex has its own grid file catalog service up and running, you can find full directory listings in standard unix "ls -l" style in a plain text file called .ls-l (note the leading dot) located in each directory. For example, to see what would have been the output of "ls -l xxx" if the command had completed, you can type "cat xxx/.ls-l". This operation is fast, and imposes practically no load on SRM. The listings are updated on a regular basis.
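Because the .ls-l file is plain text, ordinary text tools can summarise it without touching the SRM again. The sketch below counts the regular files in a listing and adds up their sizes. It assumes the usual "ls -l" column layout, with the size in the fifth field, and the helper name listing_summary is invented for illustration.

```shell
# Summarise a saved "ls -l" listing: number of regular files and total bytes.
# Argument: path to the listing file.
listing_summary() {
    # Regular files are the lines whose permission string starts with "-".
    awk '$1 ~ /^-/ { n++; bytes += $5 }
         END { printf "%d files, %.0f bytes\n", n, bytes }' "$1"
}

# Example:
#   listing_summary xxx/.ls-l
```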
Running jobs on the grid
For some users and groups, the tools described above for accessing data on the grid will be all they want. For example, to perform a skim over a simulated data set, they will download the full set of simulated data files to a local disk and then run the skim on their local machine or cluster. However, in terms of time and efficiency, it would make more sense to send the skimmer code to the data than to bring the data to the code. The full resources of the OSG are available to you if you are willing to try the more efficient option. This section explains how this is done.
In the above steps, you have already installed your user grid certificate and the OSG Client software. Use the following commands to initialize a grid proxy for this session, and verify that the Gluex front-end to the OSG job factory is ready to accept your jobs. In this example, I request a proxy that is good for 72 hours. Whenever you do this, be sure that you create it with a sufficiently long lifetime that it will not expire before your jobs are finished, because any job that tries to execute past the lifetime of the proxy will fail.
voms-proxy-init -voms Gluex:/Gluex -valid 72:0
voms-proxy-info -all
globusrun -a -r gluex.phys.uconn.edu
The voms-proxy-info command should report that a valid proxy is configured in your shell environment, and that its lifetime has begun to tick down from the maximum you requested. The last command should return the message, "GRAM Authentication test successful". If not, stop here to diagnose the problem or seek help. Once this works, you are ready to proceed.
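Since a job that outlives its proxy will fail, it can help to check the remaining lifetime before submitting. The sketch below relies on the -timeleft option of voms-proxy-info, which prints the remaining lifetime in seconds. The helper name proxy_good_for and the 48-hour threshold in the example are illustrative choices.

```shell
# Succeed only if the current proxy has at least the given number of hours left.
# Argument: required remaining lifetime in whole hours.
proxy_good_for() {
    hours=$1
    left=$(voms-proxy-info -timeleft 2>/dev/null) || return 1
    [ "${left:-0}" -ge $((hours * 3600)) ]
}

# Example: renew the proxy if fewer than 48 hours remain.
#   proxy_good_for 48 || voms-proxy-init -voms Gluex:/Gluex -valid 72:0
```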
Preparing a job for submission on the OSG consists of assembling four basic components:
- a condor submit file (download this example)
- a user job script (download this example)
- the user program (download this example)
- auxiliary files needed by the program (download these examples)
They are listed in this order because the submit file needs to know about the other three, the user job script needs to know about the program and auxiliary files, the program only needs to know about the auxiliary files, and the auxiliary files have no dependencies, except possibly one another. Browse the examples to see how they point to one another.
This example is all ready to go. Just create a work directory where the job logs will be saved as the jobs complete, cd into the directory, and download the above example files into that directory. Before you can submit the job, it needs to know where your proxy is stored. Look above at the output from voms-proxy-info for the line that starts "path :". Cut and paste the filepath that follows into the exam1.sub file, at the line that starts "x509userproxy=", overwriting the previous path. Now you are ready to submit the job.
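The cut-and-paste step can also be scripted. The sketch below rewrites the "x509userproxy=" line of a submit file in place; the helper name set_proxy_path is invented for illustration. The usage comment assumes that voms-proxy-info -path prints the proxy file location and that the job is then submitted with condor_submit.

```shell
# Point the "x509userproxy=" line of a submit file at the given proxy file.
# Arguments: submit file, proxy path.
set_proxy_path() {
    sub=$1
    proxy=$2
    # Write to a temporary file first so a failed sed cannot truncate the original.
    sed "s|^x509userproxy *=.*|x509userproxy=$proxy|" "$sub" > "$sub.new" &&
        mv "$sub.new" "$sub"
}

# Example (needs the grid tools and a valid proxy):
#   set_proxy_path exam1.sub "$(voms-proxy-info -path)"
#   condor_submit exam1.sub
```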
The jobs will now get pushed out, one-by-one, to the Gluex front-end to be scheduled for execution on the grid. The command "condor_q -globus" is useful for monitoring their progress.
You can also look in the exam1.log file which records the various events throughout the lifetime of the job, from original submission to scheduling, to execution, to completion. At submission time you will notice new stub files exam1_NNN.stdout and exam1_NNN.stderr with zero length have appeared in the job directory. As the jobs complete, these files will be filled with the standard output and standard error logs from each of the runs. The example submits 50 jobs by default. You can modify this number to suit your needs by changing the "queue 50" line at the end of exam1.sub to whatever you wish. The job status reported by "condor_q -globus" will transition from UNSUBMITTED to PENDING to RUNNING to DONE as the job progresses. You may also find it amusing to watch the overall statistics of Gluex jobs executing on the grid during the lifetime of your job. When your job completes, look for the following new files on the SRM.
where <N>=0..49 for a batch of 50 jobs. If someone else has already beaten you to it and these files were already present on the SRM when you submitted, your jobs will discover this when they start and will return with messages in their logs reporting that the results already exist. In this case, you can either erase the existing files from SRM using "srmrm" if you know they are scratch copies, or you can modify the OFFSET argument in exam1.sub (second argument in the line that starts "arguments =", 0 by default) to start the filenames at a higher index <N> in a zone which is not already occupied on SRM. If you decide you want to kill your jobs before all of them have completed, you can issue the command "condor_rm JJJ" where JJJ is the job number prefix that is reported by "condor_q". There are many other condor commands and options, but this is enough to get started.
Building and Running your Own Jobs
This is only the beginning. The next thing you will want to do is to create your own custom executable and send it out with your job. Here you have two options. The first is recommended for someone who does not want to deal with the complexity of building executables for multiple unknown running environments. In such a case, you should create a new branch in svn for your dana plugin or other executable sources, and send a message to the offline mailing list asking that this branch be incorporated into the gridmake build tree. In response, you will receive an updated xml file similar to exam1.xml, with a new rule defined that will automatically invoke your code to make targets with the special name that identifies the type of objects your code makes, e.g. 2kaonX_rest_<N>.hddm or whatever. When that is ready to go, you can try it by typing "./gridmake -f myexam1.xml 2kaonX_rest_0.hddm" in a work directory on your own machine and verify that it works, then incorporate this command into your version of exam1.bash and submit it to the grid.
Please observe this general rule: test your executables and scripts on your local machine before shipping them out to the grid. Having 5000 jobs hanging and crashing out in Gridland can waste a lot of your time in recovery and cleanup, and bother other users and site admins as well. The key feature of gridmake is that it provides the same execution environment when it runs locally as it does when it runs on a grid worker node. An ounce of prevention...
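A local test can be as simple as running the command once and confirming that it produced a non-empty output file. The sketch below wraps that check in a helper; the name smoke_test is invented for illustration, and the example reuses the gridmake command quoted earlier in this section.

```shell
# Run a command locally and confirm that it produced a non-empty target file.
# Arguments: expected output file, then the command and its arguments.
smoke_test() {
    target=$1
    shift
    "$@" || { echo "command failed: $*" >&2; return 1; }
    [ -s "$target" ] || { echo "missing or empty: $target" >&2; return 1; }
    echo "ok: $target"
}

# Example:
#   smoke_test 2kaonX_rest_0.hddm ./gridmake -f myexam1.xml 2kaonX_rest_0.hddm
```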
The second option is to build your execution binary and shared libraries on your local machine, tar them up, and ship them out with your jobs by listing them in the transfer_input_files list in your submit file. You can still rely on gridmake to set up the basic GlueX offline software environment in which your job will run, such that the geometry and calibration databases, etc. are present and configured in the runtime environment. To do this, edit exam1.xml and add a new make rule of your own that creates a new make target file name template, and invokes your own executable to make it. Provided that you invoke your program within gridmake, all of the standard environment variables (HALLD_HOME, HDDS_HOME, etc.) will automatically be defined and point to a valid local install visible on the worker node.
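The tar step might be sketched as follows. The helper name make_bundle and the file names in the example are hypothetical. The resulting archive is what would be listed in transfer_input_files, and the job script would need to unpack it on the worker node before running the program.

```shell
# Pack a locally built executable and its shared libraries into one archive.
# Arguments: output archive name, then the files or directories to include.
make_bundle() {
    out=$1
    shift
    tar -czf "$out" "$@"
}

# Example (hypothetical file names):
#   make_bundle mybundle.tar.gz bin/myskim lib/libMyPlugin.so
# On the worker node, the job script would unpack it with:
#   tar -xzf mybundle.tar.gz
```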
Of course you are free to avoid gridmake altogether. In this case you will want to tar together a sufficient set of database files, shared libraries, and binaries to run your executables, and hope that the system environment is similar enough to the one used for the build that they will run. Gridmake was invented because this proved difficult to do in practice. There are very different installations that are running at different sites on the grid. The gfortran compiler is present on the worker nodes at very few sites, for example, and many workers do not even have the gcc/g++ compilers installed. On the other hand, most of the workers are running a 64-bit linux kernel of some vintage, so some degree of binary compatibility exists. If you decide to go this way, you are better off shipping binaries and hoping for the best, rather than trying to build from original sources in the local worker node environment.
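If you do go without gridmake, it may be worth having the job script record what each worker node offers, so that failures can be traced to the environment. The sketch below prints the machine architecture and whether the compilers mentioned above are present. The helper name probe_node is invented for illustration.

```shell
# Report the architecture and compiler availability of the current machine.
probe_node() {
    echo "arch=$(uname -m)"
    for tool in gcc g++ gfortran; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool=yes"
        else
            echo "$tool=no"
        fi
    done
}

# Example: call probe_node at the top of the job script so that the report
# lands in the job's standard output log.
```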
Whichever way you decide to go, please let me know so I can provide advice and assistance.
This document and the provision of a grid execution framework for GlueX software are the result of support received from the National Science Foundation under Physics at the Information Frontier grant 0653536.