EPP Grid - GQSched (Grid Quick Scheduler)



GQSched (Grid Quick Scheduler)

Overview

The GQSched utility was designed as a simple yet powerful prototype Data Grid job scheduler and dispatcher, but has developed into a usable tool. It provides parameter-sweeping functionality similar to that of the Nimrod/G tool, with the additional ability to sweep over Grid files. Grid files can be either physical or logical files stored on remote resources. Physical files are accessible via Grid protocols (e.g. GSIFTP, GASS). Logical files are registered within replica catalogues such as Globus RC (Replica Catalog) or SRB (Storage Resource Broker) and are more complex than simple parameter variables.

Since its inception in mid-2002, it has progressed to a stable and usable tool, enabling quick command-line access to Grid processing and data resources. The primary advantage of this tool is its use of very low level, common Grid services. While many tools are moving toward 3rd generation Grid services, 2nd generation services are still the most common and the most likely to be available. By using only a minimal subset of low level tools we can ensure the greatest compatibility with existing Grid resources.

The most prominent feature of GQSched is that you can take your existing batch processing scripts and quickly turn them into a script that can be submitted across a Grid. For computer users familiar with scripting and command line tools this is a rapid application deployment solution.

Features

  • Submit a simple script for your job.
  • Directives are embedded into your script.
  • Process over multiple resources, continuously monitored for availability and system load (GRAM authentication and MDS/GRIS/Glue information).
  • Supports Globus 2.x and LCG2 Compute Elements resources.
  • Access to remote resource queues independently.
  • Automatic checking and creating of Grid proxy.
  • Access to SRB for processing of logical files and collections.
  • SRB file selection using user defined meta-data conditions.
  • Transparent SRB file creation with user defined meta-data attributes.
  • Your SRB environment is automatically transferred to remote hosts.
  • Access to Globus 2 Replica Catalog for processing of logical files.
  • Access to GSIFTP for processing of remote file sets.
  • Staging of local and remote files to remote host (using simple directives).
  • Staging back of output files to local host or remote data storage.
  • Automatic job creation when new input files are made available (by specifying local or SRB input file patterns or directories).
  • Automatic scheduling when new job scripts are made available (by specifying a new job script spool directory).
  • Job parameters are always accessible via environment variables for ease of access within scripts and applications.
  • Data file attributes, such as size and user defined meta-data, are also accessible via environment variables in all your scripts.
  • Automatic retry of all Globus operations.
  • Automatic retry and reschedule of failed jobs.
  • Ability to rerun and reschedule specific jobs.
  • Return of job standard output and standard error to local host.
  • Remote processing occurs in separate and unique directories to prevent filename clashes for input and output.
  • Automatic cleanup of remote jobs for friendly usage of remote systems.
  • Continuous reporting on jobs awaiting resources or jobs awaiting completion.
  • Data Grid aware scheduling of jobs. Jobs are scheduled on the most free resources nearest the most appropriate Grid accessible input files (or replicas).
  • Optional pre-processing task before each job.
  • Optional post-processing task after each job.
  • Optional post-failure processing task after jobs have failed all retries.
  • Optional wrapper script to help control the job on remote resources.
  • Extensive usage documentation.

Requirements

GQSched must be run on a host with command line access to Globus 2 client tools. Depending on usage, SRB and GSIncFTP command line tools may also be required. This utility was created using GNU C++ 3.x and the pthread library.

Download

Right-click and save the GQSched binary (Linux i86), or download the tarball and "make" it yourself:

Running

Type "gqsched" for basic help, or "gqsched -help" for more extensive information. You will need two files to get started.
  1. Resource file: File containing a list of all Globus 2 and LCG2 CE resources you have access to.
  2. Job Script file: Shell script defining the task to be executed.
The task will be split into multiple jobs to be distributed over the Grid resources.

An example resource file might be:

brecca-2.vpac.org/jobmanager-pbs queue=dque ; GQS_maxjobcount=10 ;\
       maxTime<=43200 ; GQS_gatekeeperstage=jobmanager-fork

lem.ph.unimelb.edu.au/jobmanager-pbs queue=defaultq

alfred.hpc.unimelb.edu.au/jobmanager-pbs queue=serial ; maxTime<=2880
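
To make the layout of these entries easier to see, here is a minimal shell sketch that splits one entry into its parts. It is not GQSched's own parser. It assumes, from the example alone, that the first space-delimited token is the host/jobmanager contact and that the rest is a list of attributes separated by ";". The helper names resource_contact and resource_attrs are hypothetical, and an entry continued with a trailing "\" is assumed to have been joined into one line first.

```shell
#!/bin/sh
# Illustrative sketch only, NOT GQSched's own parser.
# Assumed layout (inferred from the example entries): the first
# space-delimited token is the contact string, and the remainder is a
# ';'-separated list of attributes. A continued entry (trailing "\") is
# assumed to have been joined into one line already.

# Print the contact string of one resource entry.
resource_contact() {
    printf '%s\n' "${1%% *}"
}

# Print each attribute of one resource entry on its own line.
resource_attrs() {
    rest=${1#* }
    if [ "$rest" = "$1" ]; then
        return 0    # no space found, so the entry has no attributes
    fi
    printf '%s\n' "$rest" | tr ';' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

entry='alfred.hpc.unimelb.edu.au/jobmanager-pbs queue=serial ; maxTime<=2880'
resource_contact "$entry"
resource_attrs "$entry"
```

Run on the third example entry, this prints the contact followed by the two attributes queue=serial and maxTime<=2880, one per line.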

An example job script file might be:

#!/bin/csh -f
#:PARAM MYFILE GRIDFILE srb:/anusf/home/ljw563/inputdir/*.mdst
echo "Resource: "$REMOTE_RESOURCE
echo "Host: "`hostname`
echo "Date: "`date`
#:STAGEIN $MYFILE
#:STAGEIN myconfig.dat
echo Processing Job $JOBID on File $MYFILE
myexec -input $MYFILE_localfile -output temp.out
#:STAGEOUT temp.out srb:/anusf/home/ljw563/outputdir/temp.out.$JOBID
echo "Date finished: "`date`

The above script starts a job for each file matching *.mdst in the SRB collection /anusf/home/ljw563/inputdir (see the #:PARAM directive). The file is staged in to the remote processing host (see the #:STAGEIN directive) along with the locally residing file myconfig.dat. Once stage in finishes, the executable myexec is started with $MYFILE_localfile (the local location of the file after stage in) as input and temp.out, stored on the remote file system, as output. The output file temp.out is then staged out to the SRB collection /anusf/home/ljw563/outputdir under the name temp.out. with the job number ($JOBID) appended.
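
The "one job per matching file" expansion can be pictured with a small local simulation. This is only a sketch of the idea and does not show how GQSched queries SRB or numbers its jobs. The helper name expand_sweep, the sample file names, and the choice to start job numbers at 1 are all assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative sketch only: mimics the "one job per matching file" sweep
# locally. expand_sweep is a hypothetical helper, and numbering jobs from 1
# is an assumption, not documented GQSched behaviour.

# Read candidate file names on stdin; for each name matching the shell
# pattern in $1, print the job number, the input file and the stage-out name.
expand_sweep() {
    pattern=$1
    jobid=0
    while IFS= read -r name; do
        case $name in
            $pattern)
                jobid=$((jobid + 1))
                printf 'job %s: input=%s output=temp.out.%s\n' \
                    "$jobid" "$name" "$jobid"
                ;;
        esac
    done
}

# Hypothetical collection listing: two files match, one does not.
printf '%s\n' run01.mdst notes.txt run02.mdst | expand_sweep '*.mdst'
```

With the sample listing, two jobs are produced, one for each .mdst file, and notes.txt is skipped.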

Type 'gqsched -help' for further information.
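
Because the directives in the example job script are lines beginning with "#:", the shell itself treats them as comments, and they can be listed with standard tools when reviewing a script. The sketch below uses only grep and sed. The helper name list_directives is hypothetical and is not part of GQSched.

```shell
#!/bin/sh
# Illustrative sketch only: lists the "#:" directive lines embedded in a
# job script. list_directives is a hypothetical helper, not part of GQSched.
list_directives() {
    grep '^#:' "$1" | sed 's/^#://'
}

# Demonstrate on a shortened copy of the example job script.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/csh -f
#:PARAM MYFILE GRIDFILE srb:/anusf/home/ljw563/inputdir/*.mdst
#:STAGEIN $MYFILE
echo Processing Job $JOBID on File $MYFILE
EOF
list_directives "$script"
rm -f "$script"
```

For the shortened script, this prints the PARAM and STAGEIN directives and leaves out the ordinary shell lines.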

Support

Email: Lyle Winton at Lyle AT Winton.id.au

Revision:  r13 - 24 Feb 2008 - LyleWinton
Authorised by:  Geoff Taylor (G.Taylor @ physics.unimelb.edu.au)
Copyright © 2000-2009 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.