Interactive GRID Job Monitoring
  Shih-Chieh Hsu
  Elliot Lipeles
  Conrad Steenberg
  Frank Wuerthwein
Based on: Clarens
Hosted by: SourceForge


When a user submits a GRID job, a large number of things can go wrong even after the job starts. For example, resources such as data handling systems and databases can be broken or inaccessible, or the user can make simple errors in the assembly of the job. To detect and diagnose problems quickly and efficiently, users need to be able to access the running job before it completes. Simple tasks, such as reading log files in real time or seeing when the job is consuming CPU, can be of great value. The JobMon system implements a secure and authenticated method for users to access running GRID jobs. It is a generalization of the tools originally constructed for the CDF experiment. The general design of the JobMon system is driven by two major factors: locating and establishing communication with the job, and authenticating this connection. Authentication and security are particularly important because the user will be able to execute commands and access data on the job's computing resources.

System Design

Code runs in three separate places during an individual transaction. First, there is the persistent Clarens web service, generally located at the execution site. Second, the job runs a jobmond process, which handles the communication and executes the commands on the worker node. The jobmond persists for as long as the job is executing. Finally, there is the client code, which a user executes when they want to interact with the remote job. The client persists only until the transaction has completed.
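The division of roles can be illustrated with a minimal in-memory sketch. It is not the JobMon or Clarens API: the class and function names (`SiteService`, `JobMond`, `client_tail`), the single `tail` command, and the owner-name check standing in for real authentication are all assumptions made for illustration. The sketch also assumes that the site service relays a client request to the daemon of the named job, which the description above does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class JobMond:
    """Hypothetical stand-in for the per-job daemon on the worker node."""
    job_id: str
    owner: str
    log_lines: list = field(default_factory=list)

    def handle(self, command, arg=10):
        # Only one illustrative command: return the last `arg` log lines.
        if command == "tail":
            return self.log_lines[-arg:] if arg > 0 else []
        raise ValueError(f"unsupported command: {command}")


class SiteService:
    """Hypothetical stand-in for the persistent service at the execution site."""

    def __init__(self):
        self._jobs = {}  # job_id -> JobMond, for jobs that are still running

    def register(self, daemon):
        # Assumed step: the daemon announces itself when the job starts.
        self._jobs[daemon.job_id] = daemon

    def unregister(self, job_id):
        # Assumed step: the daemon disappears when the job ends.
        self._jobs.pop(job_id, None)

    def relay(self, user, job_id, command, arg=10):
        daemon = self._jobs.get(job_id)
        if daemon is None:
            raise LookupError(f"no running job {job_id}")
        # Placeholder for real authentication: compare plain user names.
        if daemon.owner != user:
            raise PermissionError(f"{user} does not own job {job_id}")
        return daemon.handle(command, arg)


def client_tail(service, user, job_id, n=10):
    """Short-lived client: performs one transaction and returns."""
    return service.relay(user, job_id, "tail", n)
```

In this sketch the service and the daemon are long-lived objects, while the client is a single function call, mirroring the three lifetimes described above. A real deployment would replace the name comparison with an authenticated connection and the in-memory lookup with network communication.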



  • Installation [html]
  • Usage [html]
  • Download
    • Get the code from SourceForge CVS or through the VDT-1.3.7 installation
    • Source tarball for the JobMon server code [tar.gz]
    • Source tarball for the JobMon client code [tar.gz]


Nov 2, 2005

OSG Readiness Plan: JobMon ReadinessPlan

Oct, 2005

VDT 1.3.7 Integration

July 5th, 2005

Now hosted on SourceForge

Modified on Tue Jul 5 10:56:07 CDT 2005 by E. Lipeles