Dynamic OS Cluster Environment with Xen

Overview

A Dynamic OS Cluster Environment gives users greater flexibility in the types of jobs they can execute on a cluster. Using Xen, virtual machines can be created with an operating system different from that of the host machine. This allows virtual machines running different operating systems to be booted on computational nodes to meet the OS requirements of jobs submitted to the node. For example, the software package Athena, which will only run on Scientific Linux 3, can be run on any cluster regardless of the cluster's operating system. When a job requiring Athena is detected by a computational node (host machine), the node can start a virtual machine running Scientific Linux 3 for the job to be executed on. After the job is finished, the output is sent back to the host machine and then back to the user via the job manager, and the virtual machine is shut down, restoring the node to its original state.

Prototype

How It Works

A rough guide to how the Dynamic OS Cluster Environment prototype works.

Domain U File System

Domain U's file system is built from two block devices: a read-only static file system image and a writable ram disk image. The ram disk is mounted at / and contains /linuxrc, the binaries (/bin) and libraries (/FS/lib) for the programs mount and ln (?), a symlink from /lib to /FS/lib, and the writable sections of the file system such as /var and /tmp. The static file system contains everything except /boot, /dev, /etc, /proc, /root, /var and /tmp. On startup of Domain U, the ram disk is loaded into memory and the script /linuxrc is executed. /linuxrc mounts the static file system at /FS, removes /bin using /FS/bin/rm, and symlinks /bin to /FS/bin using /FS/bin/ln.
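The /linuxrc steps described above could look roughly like the following. This is only a minimal sketch, not the prototype's actual script: the device name in STATIC_DEV is hypothetical, and the ROOT prefix is an assumption added purely so the steps can be tried in a scratch directory (on a real Domain U it would be empty).

```shell
#!/bin/sh
# Hypothetical sketch of /linuxrc; NOT the prototype's real script.

# ROOT is an illustrative assumption: empty on a real Domain U, or a
# scratch directory when experimenting.
ROOT="${ROOT:-}"

# STATIC_DEV is a hypothetical name for the block device that holds the
# read-only static file system image.
STATIC_DEV="${STATIC_DEV:-/dev/sdb1}"

mount_static_fs() {
    # Mount the shared static image read-only at /FS.
    # mount is one of the few binaries shipped on the ram disk itself.
    "$ROOT/bin/mount" -o ro "$STATIC_DEV" "$ROOT/FS"
}

relink_bin() {
    # Replace the minimal ram-disk /bin with a symlink into the static image.
    # rm and ln are called from /FS/bin because /bin is about to be removed.
    "$ROOT/FS/bin/rm" -rf "$ROOT/bin"
    "$ROOT/FS/bin/ln" -s "$ROOT/FS/bin" "$ROOT/bin"
}
```

On a real Domain U the script would call mount_static_fs followed by relink_bin. Calling rm and ln by their /FS/bin paths matters because /bin does not exist between the two commands.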
The result is a full working file system in which the read-only static file system image can be shared between many virtual machines, and all writable changes go to the ram disk in memory.

Execution of Commands on Domain U

To submit a job to the virtual machine, the user passes the name of the script they want to execute as the first parameter to domurun.sh. The script domurun.sh balloons down the memory of Domain 0 (host machine) and starts up Domain U (virtual machine). domurun.sh then waits until Domain U has halted before continuing execution.

domurun.sh passes four values to Domain U as kernel parameters: the user to execute the command as ($CUSTOMUSER), the command to execute ($CUSTOMCMD), the job id ($CUSTOMJOBID), and the directory to execute the command in ($CUSTOMWORKDIR). $CUSTOMWORKDIR is equal to the current working directory ($PWD). This directory must be accessible by BOTH Domain 0 and Domain U at the same path. An NFS mount, mounted at the same path on both Domain 0 and Domain U, works well for this.

The variables passed in on the kernel command line are read by the init script cmdexec, a script placed in init.d and symlinked so that it is executed last in the default run level on Domain U. If no command is given, cmdexec has no effect and Domain U boots normally to a login prompt. If cmdexec finds a command, it executes the command in the directory $CUSTOMWORKDIR, with stdout and stderr written to $CUSTOMWORKDIR/stdout.$CUSTOMJOBID and $CUSTOMWORKDIR/stderr.$CUSTOMJOBID respectively. Once the command has finished executing, cmdexec halts the virtual machine.

When domurun.sh detects that Domain U has halted, it prints $CUSTOMWORKDIR/stdout.$CUSTOMJOBID to stdout and $CUSTOMWORKDIR/stderr.$CUSTOMJOBID to stderr on Domain 0, and cleans up all temporary files. domurun.sh is also responsible for updating a heartbeat file, which is created at $CUSTOMWORKDIR/vm.hb.$CUSTOMJOBID.
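The kernel-parameter handoff to cmdexec described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the prototype's cmdexec: it assumes the values appear on the kernel command line as space-separated NAME=value words (so a value cannot contain spaces), the helper names get_kparam and run_job are hypothetical, and the CMDLINE_FILE override exists only so the logic can be tried outside a virtual machine. Switching to $CUSTOMUSER, starting hbmon.sh, and halting the machine are left out.

```shell
#!/bin/sh
# Hypothetical sketch of the cmdexec logic; NOT the prototype's real script.

# Where to read the kernel command line from (overridable for experiments).
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

# Print the value of the first NAME=value word on the kernel command line.
# Returns non-zero if NAME is not present.
get_kparam() {
    name="$1"
    cmdline=""
    read -r cmdline < "$CMDLINE_FILE" || [ -n "$cmdline" ] || return 1
    set -f                          # disable globbing while word-splitting
    for word in $cmdline; do
        case "$word" in
            "$name="*)
                set +f
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    set +f
    return 1
}

# Run the requested command in the work directory, capturing its output
# in stdout.<jobid> and stderr.<jobid> as the text describes.
run_job() {
    # No command given: do nothing, so the machine boots to a login prompt.
    cmd=$(get_kparam CUSTOMCMD) || return 0
    workdir=$(get_kparam CUSTOMWORKDIR) || return 1
    jobid=$(get_kparam CUSTOMJOBID) || return 1
    ( cd "$workdir" && sh -c "$cmd" ) \
        > "$workdir/stdout.$jobid" 2> "$workdir/stderr.$jobid"
}
```

An init script built on this sketch would call run_job and then halt the machine if a command was found. Reading the parameters from a file path rather than hard-coding /proc/cmdline keeps the parsing easy to check on an ordinary host.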
Before cmdexec begins executing the given command, it starts hbmon.sh as a background process. hbmon.sh periodically checks whether the byte size of the heartbeat file has changed. If no change is detected, hbmon.sh halts the virtual machine. The heartbeat ensures that if domurun.sh is terminated by the job manager or the user before it has finished executing, the virtual machine is halted as well.

Installing and Using the Prototype

Quick Install
Example Install
Running scripts/commands on Domain U

To run a script/command on Domain U, simply run domurun.sh with the script/command as the first parameter. For example, domurun.sh testscript.sh will start up Domain U and execute testscript.sh. testscript.sh has to be in a directory accessible on both Domain 0 and Domain U at the same path, e.g. an NFS mount mounted at the same position. The default directory for domurun.sh to use is the current working directory ($PWD), so domurun.sh should only be called from a directory accessible by Domain 0 and Domain U at the same path. If domurun.sh is called with no parameters, it will start the virtual machine and cmdexec will be skipped, so the virtual machine will boot normally to a login prompt.

Troubleshooting