Online task list for 2011

From GlueXWiki
Revision as of 16:07, 21 March 2011 by Hovanes (Talk | contribs) (Plan Experiment Controls Display management)


FY2011 Activity Schedule for Online Computing

The table below contains activities from the 12GeV project schedule in the Online Computing section which have work scheduled for FY2011. Detailed descriptions for the activities are kept at the bottom of the page and can be jumped to by clicking the short description in the table.

A breakdown of each activity into smaller tasks is maintained in an Excel file on the group disk here:


/group/halld/Individual-Schedules/Online_Computing





Activity Line | Activity Name                      | Man-weeks | Names of people                   | Comments
1532025       | Plan Front-End Software            | 16.5      | D. Lawrence, D. Abbott, B. Moffit |
1532030       | Plan DAQ Software Event Unblocking | 9         | D. Lawrence, D. Abbott, B. Moffit |
1532030a      | Plan DAQ Software Scripts          | 6         |                                   |
1532030b      | Plan DAQ Run Control               | 5         | ? + V. Gyurjyan                   | Mostly Vardan
1532030c      | Plan DAQ Code Management           | 4         |                                   |
1532035       | Plan Monitoring Framework          | 6         | ? + V. Gyurjyan                   |
1532035a      | Plan Monitoring Scalers            | 3         |                                   |
1532035b      | Plan Monitoring Histograms         | 4         | D. Lawrence                       |
1532035c      | Plan Remote Monitoring             | 3         | ? + V. Gyurjyan                   |
1532035d      | Plan Monitoring Hardware           | 4         | ? + V. Gyurjyan                   |
1532035f      | Plan Monitoring Processes          | 3         | ? + V. Gyurjyan                   |
1532035g      | Plan Monitoring Trigger            | 2         | S. Somov + V. Gyurjyan            |
1532040       | Plan Alarm Systems                 | 6         |                                   | Universities are expected to contribute more
1532045       | Plan Archiving DAQ Configuration   | 3         |                                   |
1532045a      | Plan Archiving Run Info            | 5         |                                   |
1532045b      | Plan Archiving Controls            | 5         |                                   |
1532050       | Plan Event Display                 | 2         |                                   | Universities are expected to contribute more
1532055       | Plan Storage Management            | 11.3      |                                   | Computer Center can help
1532060       | Plan Controls Framework            | 4         | ? + V. Gyurjyan                   |
1532060a      | Plan Display Management            | 3         |                                   |
1532060b      | Plan Controls Backup/Restore       | 3         |                                   |
1532060c      | Plan Controls Magnet PS            | 4         |                                   |
1532060d      | Plan Controls HV                   | 3         |                                   |
1532060f      | Plan Controls LV                   | 4         |                                   |
1532060g      | Plan Controls Motors               | 4         |                                   |
1532060h      | Plan Controls Gas Systems          | 4         |                                   |
1532060j      | Plan Controls Temperature          | 4         |                                   |
1532060k      | Plan Controls Target               | 5         |                                   |
1532060n      | Plan Controls/DAQ interface        | 3         | ? + V. Gyurjyan                   | Mostly Vardan
1532065       | Trigger Board Initialization       | 27        | S. Somov + Electronics Group      | Two-year duration
1532035       | Level 1 Verification               | 24        | S. Somov + Electronics Group      | Two-year+ duration


Descriptions of Scheduled Activities

Plan Front-End Software

This will plan the Hall-D specific details of configuring and maintaining the software used in the front-end electronics in the hall. This includes where the CODA 3 configurations will be kept (disk resident XML files, database, ...?), and how we will revert to previous configurations or implement new ones.

This will also include plans for how the translation table needed for the offline will be interfaced with the online. Specifically, if the DAQ system detects module types automatically, how/where it will record these for use in parsing by both the online monitoring system and the offline systems.

Because the online systems can be very sensitive to configuration details, the ability to make changes should probably be limited to certain individuals. This plan should address how access to deployed system configurations will be limited to ensure integrity of the DAQ system.

Plan DAQ Software Event Unblocking

In production running, events will arrive entangled: the fragments of a single event will not appear together in a single, contiguous memory section. Rather, they will be mixed with fragments from other events and must be disentangled (or unblocked) to obtain a single event that can be analyzed. This will have to be done for monitoring as well as for L3 event filtering, where the ability to save or discard a single event will be required.

This activity will provide a plan for how and where the events will be disentangled (EB, L3/monitoring farm, offline code base, ...?). This will include how the single events will be passed on to the CODA 3 Event Recorder for writing to disk/tape.

Estimates of CPU/memory/bandwidth resources required will be included so they may be added into the overall requirements for the Hall-D online computing resources.
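
As a rough illustration of what disentangling involves, the sketch below regroups blocks of fragments into single events. The block layout (`crate`, `first_event`, `fragments`) and the function name `unblock` are assumptions made up for this example; they do not represent the CODA 3 data format or any decision about where unblocking will be done.

```python
from collections import defaultdict


def unblock(blocks):
    """Regroup per-crate blocks into single events.

    Each block is a dict in a hypothetical layout (not the CODA 3 format):
        {"crate": int, "first_event": int, "fragments": [bytes, ...]}
    where fragments[i] belongs to event number first_event + i.
    """
    events = defaultdict(dict)
    for block in blocks:
        for offset, fragment in enumerate(block["fragments"]):
            event_number = block["first_event"] + offset
            events[event_number][block["crate"]] = fragment
    # One (event number, {crate: fragment}) pair per event, in event order
    return [(number, events[number]) for number in sorted(events)]


blocks = [
    {"crate": 1, "first_event": 100, "fragments": [b"a0", b"a1"]},
    {"crate": 2, "first_event": 100, "fragments": [b"b0", b"b1"]},
]
single_events = unblock(blocks)
```

Each entry of `single_events` then holds all fragments of one event, which is the unit needed to save or discard an event in an L3 filter.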

Plan DAQ Software Scripts

Plan for the general organization of scripts used as part of the Hall-D online systems. This will include the languages (python, perl, bash, ...) used for the command-line, batch-mode, cron-job, and GUI scripts. How the scripts will be maintained and how editing access will be restricted will also be included.

Plan DAQ Software Run Control

Plan for implementing the CODA 3 Run Control in the Hall-D online systems. This will include how the configuration will be maintained and how access to editing the configuration will be restricted. Ability to access Run Control from the counting house, the experimental hall, and via a remote, secure connection (for on-call maintenance) will be required. How that will be done while minimizing risk of disrupting operations will be addressed.

Plan DAQ Software Code Management

A plan for maintaining the online code base. This will include compiled programs, scripts, and configuration files that comprise the online software systems. This will include a choice of code management system and where it will be hosted. How this integrates with the offline software code base, which will very likely be used as a basis for the L3 event filter, will be addressed.

Plan Monitoring Framework

The substantial number of independent monitoring subsystems developed for Hall D need to be coordinated and results presented to operators in a coherent way. Further, the monitoring system must interact with other independent systems such as the alarm system, archiving system, control system, etc. An overall strategy and architecture must be developed to ensure transparent interoperation among all these systems.

Plan Monitoring Scalers

Scaler information generated by the trigger, DAQ and other systems must be extracted from hardware, then monitored, analyzed and presented to operators and other automated monitoring systems as appropriate. Analyzed and raw scaler information must further be archived, and for critical scalers, archived in multiple places for redundancy. Scaler information in the data stream may need to be diverted into separate data streams for ease of access by the Offline group, and some scaler data may need to be entered into databases.

Finally, alarms need to be generated when automated analysis programs find problems in the scaler data.
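
A minimal sketch of such an automated check is given below, assuming each scaler has a fixed allowed range. The function name `check_scalers`, the scaler names, and the limits are hypothetical; a real system would take its limits from configuration and hand the messages to the alarm system.

```python
def check_scalers(rates, limits):
    """Return alarm messages for scalers that are missing or out of range.

    rates:  {scaler name: measured rate}
    limits: {scaler name: (low, high)} allowed range, inclusive
    """
    alarms = []
    for name, (low, high) in limits.items():
        if name not in rates:
            alarms.append(f"{name}: no reading")
            continue
        rate = rates[name]
        if not (low <= rate <= high):
            alarms.append(f"{name}: rate {rate} outside [{low}, {high}]")
    return alarms


# Hypothetical scaler names and limits, for illustration only
limits = {"trigger_rate": (1, 10), "daq_livetime": (0, 1)}
alarms = check_scalers({"trigger_rate": 50}, limits)
```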

Plan Monitoring Histograms

Events taken by the DAQ system must be continuously monitored for quality. The histogram monitoring system must extract a sample of events from the DAQ in real-time, analyze them, generate histograms, then present the information to operators and to other automated monitoring systems. The histograms must be archived periodically, and a reset mechanism must exist to clear histograms e.g. at the beginning of a new run. The system must also be able to read and analyze events from a file and operate independently of the system monitoring events in real-time.

Currently the RootSpy framework, developed within the Offline group but with the Online in mind, appears to be the best foundation for event histogram monitoring.

Finally, alarms need to be generated when automated analysis programs find problems in the histogram data.

Plan Monitoring Remote

A large fraction of detector hardware and some online software is being developed by collaborators from other institutions, and they need to be able to monitor performance of their systems from off site. A system needs to be developed to allow them access to almost all information available to shift personnel, but in a way that satisfies JLab cyber security requirements. In some cases remote collaborators may need to take control of DAQ and other systems to diagnose and repair problems in their systems.

Plan Monitoring Hardware Status

A large amount of detector hardware must be monitored for health during hall operations, beyond what is done by the EPICS-based control system. Hardware may inject status information periodically into the data stream, and processes must extract this information, archive it, and present it to operators. Other information will need to be proactively extracted from the hardware at appropriate times and in such a way as to not interfere with the high-speed DAQ system. Some information will only be extracted during special runs or calibration procedures. In all cases, action must be taken or alarms generated when problems are detected.

This system must be designed to handle a large variety of disparate hardware while minimizing the amount of special programming required and must avoid compromising fast DAQ and other common operations.

Plan Monitoring Process Status

A large number of processes running on a large number of computers in the counting house need to be started, stopped and monitored during operations. These processes run under widely varying conditions. E.g. some need to be started at boot time and run continuously, others just during data taking, others just under special conditions.

The existence and health of all these processes need to be continuously monitored in real time. Alarms need to be generated in case of failed processes, and if operator action is not required they can be restarted automatically. The monitoring system must be highly and easily configurable as the critical process list will change fairly often.
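
The policy described above can be sketched as a small function driven by a configurable watch list. Everything named here is an assumption for illustration: the entry keys (`name`, `when`, `auto_restart`), the process names, and the function `plan_actions`. How the list of running processes is obtained is left out.

```python
def plan_actions(watch_list, running, taking_data):
    """Decide what to do about watched processes that are not running.

    watch_list:  list of dicts with hypothetical keys
                 "name", "when" ("always" or "run"), "auto_restart" (bool)
    running:     set of names of processes currently alive
    taking_data: True while a run is in progress
    Returns (name, action) pairs; every failure raises an alarm, and
    processes marked auto_restart are also restarted.
    """
    actions = []
    for entry in watch_list:
        required = entry["when"] == "always" or (
            entry["when"] == "run" and taking_data
        )
        if not required or entry["name"] in running:
            continue
        actions.append((entry["name"], "alarm"))
        if entry["auto_restart"]:
            actions.append((entry["name"], "restart"))
    return actions


# Hypothetical watch list; in practice this would come from a config file
watch_list = [
    {"name": "event_monitor", "when": "run", "auto_restart": True},
    {"name": "hv_server", "when": "always", "auto_restart": False},
]
actions = plan_actions(watch_list, running={"hv_server"}, taking_data=True)
```

Keeping the watch list as plain data is one way to meet the requirement that the critical process list be easy to change.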

Plan Monitoring Trigger

The state-of-the-art high-speed Hall D trigger system must be monitored at all times for proper operation. This includes extraction and monitoring of scalers and other data generated by the trigger hardware. This data must be analyzed and compared to expectations based on understanding of the physics involved and the trigger programming. Alarms must be generated and operators notified if problems are detected.

Plan Alarm Systems

Plan Archiving DAQ Configuration

Plan Archiving Run Info

Plan Archiving Controls

Plan Event Display

Form a plan for the online event display. This display is expected to be running continuously in the counting house to provide a quick visual of individual events being read in from the DAQ. It will also be used to replay events to monitor data integrity and to help debug the DAQ system. The graphics package used and what features the event display must have will be included in the plan. How the event display will interface with the DAQ system to get events will also be addressed.

Plan Storage Management

Plan for online data storage from the DAQ and online systems. This will include hardware systems (RAID disks?) to hold the data and how it will be transferred to the Computer Center for permanent storage. Bandwidth requirements for the disk will be included as it may need to support quick replay analysis while still acquiring data. Slow controls values critical for data replay will also need to be copied into long-term storage, possibly alongside the event-level data, so if and how that is done should be addressed.

Plan Experiment Controls Framework

Plan Experiment Controls Display Management

The slow controls system in Hall D will require a single Display Management framework to monitor and control different components in the Hall. A careful study needs to be done to identify the requirements for the different components of the controls and monitoring system. We will also need to study and test existing display management systems that are easy to interface with EPICS in order to select the one that best matches Hall D needs.

  • Identify applications which need control and monitoring, and for each such application determine what screens it will require. Some of the systems may require a large number of screens, which will call for automated screen generation.
  • Study a few of the most eligible frameworks and evaluate their applicability to Hall D systems. It is highly desirable that the framework allows for automated generation of screens.
  • Make at least one prototype application utilizing the most favorable display management framework to identify the possible difficulties which we may encounter using it.
  • Create a work plan for the next two years for developing the control screens for Hall D controls.
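
To illustrate what automated screen generation could look like, the sketch below lays out one widget per channel on a grid. The output is a plain Python dictionary, not the file format of any particular display manager, and the function name `generate_screen` and the PV names are hypothetical.

```python
def generate_screen(title, channels, columns=4):
    """Place one widget per channel on a grid, filling row by row."""
    widgets = []
    for index, pv in enumerate(channels):
        row, col = divmod(index, columns)
        widgets.append({"pv": pv, "row": row, "col": col})
    return {"title": title, "widgets": widgets}


# Hypothetical PV names for a five-channel high voltage screen
channels = [f"HV:ch{i}" for i in range(5)]
screen = generate_screen("HV overview", channels, columns=4)
```

A generator along these lines would be followed by a step that writes the layout in whatever format the chosen framework reads.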

Plan Experiment Controls Backup/Restore

Plan Experiment Controls Magnet PS

Plan Experiment Controls HV

Plan Experiment Controls LV

Plan Experiment Controls Motors

Plan Experiment Controls Gas Systems

Plan Experiment Controls Temperature

Plan Experiment Controls Target

Plan Experiment Controls Interface with DAQ

Plan for the Hall-D-specific configuration for interfacing the experiment controls with the DAQ. CODA 3 will include support for full experiment controls, which will be leveraged by the Hall-D online system. This will include checks on various non-DAQ online systems by the DAQ system to help ensure data integrity. How the configurations will be maintained and how access to their modification will be limited will be addressed in the plan.