Difference between revisions of "Farm Job Tracking Database"
From GlueXWiki
Revision as of 10:26, 28 March 2014
Main Table
Listing of all runs/files that are in the plan and their status.
Description
mysql> describe dc_02;
+------------------+--------------+------+-----+-------------------+-----------------------------+
| Field            | Type         | Null | Key | Default           | Extra                       |
+------------------+--------------+------+-----+-------------------+-----------------------------+
| run              | int(11)      | NO   | PRI | 0                 |                             |
| file             | mediumint(9) | NO   | PRI | 0                 |                             |
| submitted        | tinyint(4)   | NO   |     | 0                 |                             |
| output           | tinyint(4)   | NO   |     | 0                 |                             |
| jput_submitted   | tinyint(4)   | NO   |     | 0                 |                             |
| silo             | tinyint(4)   | NO   |     | 0                 |                             |
| jcache_submitted | tinyint(4)   | NO   |     | 0                 |                             |
| cache            | tinyint(4)   | NO   |     | 0                 |                             |
| mod_time         | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+------------------+--------------+------+-----+-------------------+-----------------------------+
9 rows in set (0.00 sec)
Example
mysql> select run, file, submitted, output, jput_submitted, silo, mod_time from dc_02 limit 10;
+------+---------+-----------+--------+----------------+------+---------------------+
| run  | file    | submitted | output | jput_submitted | silo | mod_time            |
+------+---------+-----------+--------+----------------+------+---------------------+
| 9001 | 2000019 |         1 |      0 |              0 |    0 | 2014-03-22 02:23:32 |
| 9001 | 2000065 |         1 |      0 |              0 |    0 | 2014-03-22 02:25:02 |
| 9001 | 2000062 |         1 |      0 |              0 |    0 | 2014-03-22 02:24:56 |
| 9001 | 2000010 |         1 |      0 |              0 |    0 | 2014-03-22 02:23:14 |
| 9001 | 2000017 |         1 |      0 |              0 |    0 | 2014-03-22 02:23:28 |
| 9001 | 2000022 |         1 |      0 |              0 |    0 | 2014-03-22 02:23:38 |
| 9001 | 2000059 |         1 |      0 |              0 |    0 | 2014-03-22 02:24:50 |
| 9001 | 2000088 |         1 |      0 |              0 |    0 | 2014-03-22 02:25:47 |
| 9001 | 2000025 |         1 |      0 |              0 |    0 | 2014-03-22 02:23:43 |
| 9001 | 2000057 |         1 |      0 |              0 |    0 | 2014-03-22 02:24:46 |
+------+---------+-----------+--------+----------------+------+---------------------+
10 rows in set (0.00 sec)
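A per-run progress summary can be derived from the status columns of the main table. The following is a minimal sketch, not the project's actual tooling: it assumes the status columns are 0/1 flags, and it uses Python's built-in sqlite3 module with an in-memory copy of a few columns so that it runs without access to the MySQL server. The first two sample rows come from the listing above; the run 9002 rows are hypothetical and exist only to make the summary non-trivial.

```python
import sqlite3

# In-memory stand-in for the MySQL table; only a subset of columns is modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dc_02 (
        run       INTEGER NOT NULL DEFAULT 0,
        file      INTEGER NOT NULL DEFAULT 0,
        submitted INTEGER NOT NULL DEFAULT 0,
        output    INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (run, file)
    )
    """
)

sample_rows = [
    (9001, 2000019, 1, 0),  # from the example listing
    (9001, 2000065, 1, 0),  # from the example listing
    (9002, 2000001, 1, 1),  # hypothetical row
    (9002, 2000002, 0, 0),  # hypothetical row
]
conn.executemany("INSERT INTO dc_02 VALUES (?, ?, ?, ?)", sample_rows)

# Assumption: each flag is 0 or 1, so SUM() counts the files that reached that step.
PROGRESS_SQL = """
    SELECT run,
           COUNT(*)       AS planned,
           SUM(submitted) AS submitted,
           SUM(output)    AS with_output
    FROM dc_02
    GROUP BY run
    ORDER BY run
"""
progress = conn.execute(PROGRESS_SQL).fetchall()

for run, planned, submitted, with_output in progress:
    print(f"run {run}: {planned} planned, {submitted} submitted, {with_output} with output")
```

The query text itself uses only standard SQL, so it should also be usable from the mysql prompt against the real table.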
Job Table
Listing of all jobs submitted.
- information captured from the JLab batch farm system (Auger) database via a JSON web service
  - previously:
    - resource usage came from the standard output file
    - times came from the "jobstat" command, whose information disappeared after the job finished
  - the Job ID itself is captured at submit time
- a particular run/file may have more than one job if it had to be resubmitted
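Because a run/file can have several jobs, a query that wants one row per run/file has to pick one of them. The sketch below selects the most recent job and counts the attempts. It rests on an assumption that is not stated in the source: that a larger auto-increment id means a later submission. The job IDs in the sample data are hypothetical, and sqlite3 is used only so that the example runs without access to the MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dc_02Job (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        run    INTEGER,
        file   INTEGER,
        jobId  INTEGER,
        status TEXT
    )
    """
)

# Hypothetical sample: file 2000019 was submitted twice, file 2000065 once.
sample_jobs = [
    (9001, 2000019, 6300001, "DONE"),
    (9001, 2000065, 6300002, "DONE"),
    (9001, 2000019, 6300050, "ACTIVE"),
]
conn.executemany(
    "INSERT INTO dc_02Job (run, file, jobId, status) VALUES (?, ?, ?, ?)",
    sample_jobs,
)

# Assumption: the row with the largest id for a run/file is the latest submission.
LATEST_SQL = """
    SELECT j.run, j.file, j.jobId, j.status, latest.attempts
    FROM dc_02Job AS j
    JOIN (
        SELECT run, file, MAX(id) AS id, COUNT(*) AS attempts
        FROM dc_02Job
        GROUP BY run, file
    ) AS latest ON j.id = latest.id
    ORDER BY j.run, j.file
"""
latest_jobs = conn.execute(LATEST_SQL).fetchall()

for run, file_number, job_id, status, attempts in latest_jobs:
    print(f"run {run} file {file_number}: job {job_id} is {status} after {attempts} attempt(s)")
```

If submission order matters more than insertion order, ordering by timeSubmitted instead of id would be the alternative to consider.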
Description
mysql> describe dc_02Job;
+-----------------+---------------+------+-----+-------------------+-----------------------------+
| Field           | Type          | Null | Key | Default           | Extra                       |
+-----------------+---------------+------+-----+-------------------+-----------------------------+
| id              | int(11)       | NO   | PRI | NULL              | auto_increment              |
| run             | int(11)       | YES  |     | NULL              |                             |
| file            | int(11)       | YES  |     | NULL              |                             |
| jobId           | int(11)       | YES  |     | NULL              |                             |
| timeChange      | timestamp     | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| username        | varchar(64)   | YES  |     | NULL              |                             |
| project         | varchar(64)   | YES  |     | NULL              |                             |
| name            | varchar(64)   | YES  |     | NULL              |                             |
| queue           | varchar(64)   | YES  |     | NULL              |                             |
| hostname        | varchar(64)   | YES  |     | NULL              |                             |
| nodeTags        | varchar(64)   | YES  |     | NULL              |                             |
| coresRequested  | int(11)       | YES  |     | NULL              |                             |
| memoryRequested | int(11)       | YES  |     | NULL              |                             |
| status          | varchar(64)   | YES  |     | NULL              |                             |
| exitCode        | int(11)       | YES  |     | NULL              |                             |
| result          | varchar(64)   | YES  |     | NULL              |                             |
| timeSubmitted   | datetime      | YES  |     | NULL              |                             |
| timeDependency  | datetime      | YES  |     | NULL              |                             |
| timePending     | datetime      | YES  |     | NULL              |                             |
| timeStagingIn   | datetime      | YES  |     | NULL              |                             |
| timeActive      | datetime      | YES  |     | NULL              |                             |
| timeStagingOut  | datetime      | YES  |     | NULL              |                             |
| timeComplete    | datetime      | YES  |     | NULL              |                             |
| walltime        | varchar(8)    | YES  |     | NULL              |                             |
| cput            | varchar(8)    | YES  |     | NULL              |                             |
| mem             | varchar(64)   | YES  |     | NULL              |                             |
| vmem            | varchar(64)   | YES  |     | NULL              |                             |
| script          | varchar(1024) | YES  |     | NULL              |                             |
| files           | varchar(1024) | YES  |     | NULL              |                             |
| error           | varchar(1024) | YES  |     | NULL              |                             |
+-----------------+---------------+------+-----+-------------------+-----------------------------+
30 rows in set (0.00 sec)
Examples
Completed Jobs
mysql> select run, file, jobId, hostname, status, result, timeSubmitted, timeActive, timeComplete, cput, mem, vmem from dc_02Job limit 10;
+------+---------+---------+------------+--------+---------+---------------------+---------------------+---------------------+----------+----------+-----------+
| run  | file    | jobId   | hostname   | status | result  | timeSubmitted       | timeActive          | timeComplete        | cput     | mem      | vmem      |
+------+---------+---------+------------+--------+---------+---------------------+---------------------+---------------------+----------+----------+-----------+
| 9002 | 2001429 | 6355302 | qcd12s0220 | DONE   | SUCCESS | 2014-03-24 11:55:20 | 2014-03-24 12:34:39 | 2014-03-25 13:44:18 | 25:10:17 | 698604kb | 1016384kb |
| 9001 | 2004958 | 6372009 | qcd12s0423 | DONE   | SUCCESS | 2014-03-24 18:37:13 | 2014-03-26 17:17:36 | 2014-03-27 14:14:14 | 20:57:50 | 721724kb | 1081992kb |
| 9003 | 2000891 | 6332786 | farm10016  | DONE   | SUCCESS | 2014-03-23 13:54:58 | 2014-03-24 10:23:56 | 2014-03-25 08:53:01 | 22:35:26 | 810652kb | 1147524kb |
| 9002 | 2001722 | 6357568 | farm09021  | DONE   | SUCCESS | 2014-03-24 13:17:32 | 2014-03-25 12:31:28 | 2014-03-26 10:50:00 | 22:16:05 | 700876kb | 1016460kb |
| 9001 | 2004651 | 6371701 | farm10017  | DONE   | SUCCESS | 2014-03-24 18:26:46 | 2014-03-26 17:08:22 | 2014-03-27 11:16:58 | 18:08:36 | 729572kb | 1083076kb |
| 9002 | 2000415 | 6306483 | farm09022  | DONE   | SUCCESS | 2014-03-22 15:14:09 | 2014-03-22 15:45:15 | 2014-03-23 15:12:51 | 23:23:01 | 773828kb | 1081984kb |
| 9002 | 2001974 | 6357824 | qcd12s0727 | DONE   | SUCCESS | 2014-03-24 13:26:17 | 2014-03-25 12:38:18 | 2014-03-26 13:58:53 | 25:20:49 | 751380kb | 1081984kb |
| 9001 | 2000692 | 6330148 | farm13014  | DONE   | SUCCESS | 2014-03-23 12:22:03 | 2014-03-23 12:43:30 | 2014-03-24 02:14:33 | 13:31:49 | 858756kb | 1213052kb |
| 9001 | 2000423 | 6305951 | farm11015  | DONE   | SUCCESS | 2014-03-22 14:47:11 | 2014-03-22 15:00:19 | 2014-03-23 08:17:58 | 17:17:39 | 795508kb | 1151836kb |
| 9003 | 2000956 | 6332891 | farm13019  | DONE   | SUCCESS | 2014-03-23 13:57:14 | 2014-03-24 10:28:13 | 2014-03-25 03:20:37 | 16:58:54 | 806416kb | 1147520kb |
+------+---------+---------+------------+--------+---------+---------------------+---------------------+---------------------+----------+----------+-----------+
10 rows in set (0.00 sec)
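The cput, mem and vmem columns are stored as strings, so any arithmetic on them needs a conversion step. The sketch below converts the values of the first completed job listed above and derives its queue wait and run duration from the timestamps. The helper names are hypothetical, and the formats (HH:MM:SS for cput, an integer followed by "kb" for mem and vmem) are inferred from the sample rows rather than from any Auger documentation.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def hms_to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string such as '25:10:17' to seconds (hours may exceed 24)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def kb_to_int(text: str) -> int:
    """Convert a string such as '698604kb' to the integer number of kilobytes."""
    if not text.endswith("kb"):
        raise ValueError(f"unexpected memory format: {text!r}")
    return int(text[:-2])


def seconds_between(start: str, end: str) -> int:
    """Whole seconds elapsed between two 'YYYY-MM-DD HH:MM:SS' timestamps."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds())


# Values of job 6355302 (run 9002, file 2001429) from the listing above.
queue_wait = seconds_between("2014-03-24 11:55:20", "2014-03-24 12:34:39")
run_time = seconds_between("2014-03-24 12:34:39", "2014-03-25 13:44:18")
cpu_seconds = hms_to_seconds("25:10:17")
mem_kb = kb_to_int("698604kb")

print(f"queue wait {queue_wait} s, run time {run_time} s, cpu {cpu_seconds} s, mem {mem_kb} kB")
```

Doing the conversion on the client side keeps the query simple; the same arithmetic could presumably be done in SQL, but that is not shown here.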
Running Jobs
mysql> select run, file, jobId, hostname, status, result, timeSubmitted, timeActive, timeComplete, cput, mem, vmem from dc_02Job where status = 'active'limit 10;
+------+---------+---------+------------+--------+--------+---------------------+---------------------+--------------+------+------+------+
| run  | file    | jobId   | hostname   | status | result | timeSubmitted       | timeActive          | timeComplete | cput | mem  | vmem |
+------+---------+---------+------------+--------+--------+---------------------+---------------------+--------------+------+------+------+
| 9001 | 2007001 | 6418792 | farm12009  | ACTIVE | NULL   | 2014-03-26 23:17:11 | 2014-03-27 18:50:16 | NULL         | NULL | NULL | NULL |
| 9001 | 2007002 | 6418793 | farm10015  | ACTIVE | NULL   | 2014-03-26 23:18:22 | 2014-03-27 18:50:17 | NULL         | NULL | NULL | NULL |
| 9001 | 2007003 | 6418794 | farm09019  | ACTIVE | NULL   | 2014-03-26 23:18:24 | 2014-03-27 18:50:17 | NULL         | NULL | NULL | NULL |
| 9001 | 2007004 | 6418795 | qcd12s0707 | ACTIVE | NULL   | 2014-03-26 23:18:26 | 2014-03-27 18:50:25 | NULL         | NULL | NULL | NULL |
| 9001 | 2007005 | 6418796 | farm13023  | ACTIVE | NULL   | 2014-03-26 23:18:28 | 2014-03-27 18:50:27 | NULL         | NULL | NULL | NULL |
| 9001 | 2007006 | 6418797 | farm13002  | ACTIVE | NULL   | 2014-03-26 23:18:30 | 2014-03-27 18:50:26 | NULL         | NULL | NULL | NULL |
| 9001 | 2007007 | 6418798 | farm12002  | ACTIVE | NULL   | 2014-03-26 23:18:32 | 2014-03-27 18:50:27 | NULL         | NULL | NULL | NULL |
| 9001 | 2007008 | 6418799 | farm10024  | ACTIVE | NULL   | 2014-03-26 23:18:34 | 2014-03-27 18:50:26 | NULL         | NULL | NULL | NULL |
| 9001 | 2007009 | 6418800 | farm10023  | ACTIVE | NULL   | 2014-03-26 23:18:37 | 2014-03-27 18:50:28 | NULL         | NULL | NULL | NULL |
| 9001 | 2007010 | 6418801 | farm10016  | ACTIVE | NULL   | 2014-03-26 23:18:39 | 2014-03-27 18:50:26 | NULL         | NULL | NULL | NULL |
+------+---------+---------+------------+--------+--------+---------------------+---------------------+--------------+------+------+------+
10 rows in set (0.02 sec)
Jobs in the Queue
mysql> select run, file, jobId, hostname, status, result, timeSubmitted, timeActive, timeComplete, cput, mem, vmem from dc_02Job where status = 'pending' limit 10;
+------+---------+---------+----------+---------+--------+---------------------+------------+--------------+------+------+------+
| run  | file    | jobId   | hostname | status  | result | timeSubmitted       | timeActive | timeComplete | cput | mem  | vmem |
+------+---------+---------+----------+---------+--------+---------------------+------------+--------------+------+------+------+
| 9001 | 2007769 | 6419560 | NULL     | PENDING | NULL   | 2014-03-26 23:44:54 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007770 | 6419561 | NULL     | PENDING | NULL   | 2014-03-26 23:44:56 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007771 | 6419562 | NULL     | PENDING | NULL   | 2014-03-26 23:44:59 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007772 | 6419563 | NULL     | PENDING | NULL   | 2014-03-26 23:45:01 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007773 | 6419564 | NULL     | PENDING | NULL   | 2014-03-26 23:45:03 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007774 | 6419565 | NULL     | PENDING | NULL   | 2014-03-26 23:45:05 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007775 | 6419566 | NULL     | PENDING | NULL   | 2014-03-26 23:45:07 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007776 | 6419567 | NULL     | PENDING | NULL   | 2014-03-26 23:45:09 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007777 | 6419568 | NULL     | PENDING | NULL   | 2014-03-26 23:45:11 | NULL       | NULL         | NULL | NULL | NULL |
| 9001 | 2007778 | 6419569 | NULL     | PENDING | NULL   | 2014-03-26 23:45:13 | NULL       | NULL         | NULL | NULL | NULL |
+------+---------+---------+----------+---------+--------+---------------------+------------+--------------+------+------+------+
10 rows in set (0.02 sec)
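The three example queries differ only in their status filter, so a single grouped query gives an overview of all states at once. The sketch below is an illustration under stated assumptions: it uses an in-memory sqlite3 table with a handful of rows copied from the listings above, not the real database. The queries above filter on lower-case 'active' and 'pending' while the stored values are upper-case, which is consistent with MySQL's case-insensitive default collations; SQLite compares strings case-sensitively, so the filter here normalises case with UPPER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dc_02Job (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        run    INTEGER,
        file   INTEGER,
        jobId  INTEGER,
        status TEXT
    )
    """
)

# A few rows copied from the completed, running and queued listings.
sample_jobs = [
    (9002, 2001429, 6355302, "DONE"),
    (9001, 2007001, 6418792, "ACTIVE"),
    (9001, 2007002, 6418793, "ACTIVE"),
    (9001, 2007769, 6419560, "PENDING"),
    (9001, 2007770, 6419561, "PENDING"),
    (9001, 2007771, 6419562, "PENDING"),
]
conn.executemany(
    "INSERT INTO dc_02Job (run, file, jobId, status) VALUES (?, ?, ?, ?)",
    sample_jobs,
)

# One row per status value with the number of jobs in that state.
status_counts = conn.execute(
    "SELECT status, COUNT(*) FROM dc_02Job GROUP BY status ORDER BY status"
).fetchall()

# Case-normalised filter, so 'pending' matches the stored 'PENDING'.
pending_count = conn.execute(
    "SELECT COUNT(*) FROM dc_02Job WHERE UPPER(status) = UPPER(?)",
    ("pending",),
).fetchone()[0]

print(status_counts)
print(f"{pending_count} job(s) waiting in the queue")
```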