GlueX Offline Meeting, March 19, 2014
GlueX Offline Software Meeting
Wednesday, March 19, 2014
1:30 pm EDT
JLab: CEBAF Center F326/327
- 1 Agenda
- 2 Communication Information
- 3 Minutes
- 3.1 Review of Minutes from the Last Meeting
- 3.2 Porting EventStore to GlueX
- 3.3 Data Challenge Meeting Report, March 14
- 3.4 Fix to Compression of REST-Formatted Files
- 3.5 New Random Number Seed Scheme
- 3.6 Non-Reproducible Reconstruction
- 3.7 Data Challenge Status: are we ready to freeze?
- 3.8 Other Data Challenge Issues
- 3.9 Next Data Challenge Meeting
- Review of minutes from the last meeting: all
- EventStore (Sean)
- Data Challenge Meeting Report, March 14 (Mark)
- Fix to Compression of REST-Formatted Files (Richard)
- New Random Number Seed Scheme (Richard/David)
- Non-Reproducible Reconstruction (All)
- Data Challenge Status: are we ready to freeze? (All)
- Mantis Bug Tracker Review
- Review of recent repository activity
- Videoconferencing platform for next meeting? (All)
- ESNet: 8542554
You can view the computer desktop in the meeting room at JLab via the web.
- Go to http://esnet.readytalk.com
- In the "join a meeting" box enter the Hall D code: 1833622
- Fill in the participant registration form.
To connect by telephone:
- US and Canada: (866)740-1260 (toll free)
- International: (303)248-0285 (toll call) or look up toll-free number at http://www.readytalk.com/intl
- enter access code followed by the # sign: 1833622#
Talks can be deposited in the directory
/group/halld/www/halldweb/html/talks/2014-1Q on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2014-1Q/ .
- CMU: Paul Mattione
- FSU: Aristeidis Tsaris
- IU: Kei Moriya
- JLab: Mark Ito (chair), David Lawrence, Dmitry Romanov, Simon Taylor, Beni Zihlmann
- MIT: Justin Stevens
- NU: Sean Dobbs
- UConn: Alex Barnes
Review of Minutes from the Last Meeting
We looked at the minutes of the February 5th meeting. In particular, we reviewed the features of Gagik's Tagged File System in preparation for comparing it with EventStore.
Porting EventStore to GlueX
Sean led us through what he has learned/reminded-himself-of about EventStore. See his slides for details. They covered:
- [Introduction to] EventStore
- Example Invocation
- Architecture for CLEO and GlueX
- CLEO Data Model and EventStore
- Data Life Cycle [stages of event sample refinement]
- Metadata [run selection criteria]
- Roadmap [for implementation]
- CLEO Data Lifecycle [as example]
- Other Things [version tags, metadata criteria for GlueX]
- Can It Work with the Grid?
There are still some questions about the mechanism used to index events within a data file and how much of the code can be ported to GlueX. Use with the grid would likely take some development. We encouraged Sean to proceed with his plan to implement a simple example to understand possible issues in detail.
Data Challenge Meeting Report, March 14
We took only a cursory look at the minutes from the last data challenge meeting as the rest of the agenda items for this meeting were to deal with the open issues from that meeting.
Fix to Compression of REST-Formatted Files
Richard Jones checked in a change to fix the "short file" problem where writing to the output REST file would stop in the middle of reconstruction even though event processing continued on. There was indeed a bug in the xstream library when compression was enabled, as suspected originally by David and Simon.
Mark ran 5,000 jobs of 10,000 events each and saw no short files. (Note that this also means that none of the jobs crashed for any other reason.) We declared this problem fixed.
New Random Number Seed Scheme
Richard also checked in a change to the random number seed procedure. Now in hdgeant, each incoming event is checked to see if it contains seed information. If so, hdgeant's random number generator is reseeded with data derived from the incoming information. This is essentially the plan Curtis proposed at the last data challenge meeting. It gives reproducible results even in a multi-threaded program where individual events may go to different threads in repeated runs on the same input data.
Richard also made some modifications to a similar scheme that David had implemented in mcsmear, this time with mcsmear keying off information output from hdgeant.
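The per-event reseeding idea can be illustrated with a minimal Python sketch. It is not the hdgeant or mcsmear code: the field name `seed_info`, the hash-based `derive_seed` function, and the three random draws standing in for the real per-event work are all assumptions made for illustration.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def derive_seed(seed_info, stage):
    """Map the seed information stored in an event to a 64-bit integer.

    Mixing in a stage label gives each program (for example simulation
    and smearing) its own stream from the same stored information.
    """
    key = f"{stage}:{seed_info}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def process_event(event, rng, stage="simulate"):
    """Reseed `rng` if the event carries seed information, then draw numbers."""
    seed_info = event.get("seed_info")  # hypothetical field name
    if seed_info is not None:
        rng.seed(derive_seed(seed_info, stage))
    # Stand-in for the real per-event work: three random draws.
    return event["number"], tuple(rng.random() for _ in range(3))


def run(events, n_threads):
    """Process events on a thread pool; each task owns a private generator."""
    def task(event):
        return process_event(event, random.Random())

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return dict(pool.map(task, events))


# Toy input: every event carries (run, event) as its seed information.
events = [{"number": i, "seed_info": (9001, i)} for i in range(1, 201)]

# The results do not depend on how events are spread over threads.
assert run(events, 1) == run(events, 8)
```

Because the generator state at the start of each event depends only on what the event itself carries, the thread that happens to pick up the event has no influence on the numbers drawn.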
We noted that we did not see a change come in to modify bggen to write seed information in its output events. That is necessary to fully implement Curtis's proposal. We will ask Richard about this.
We decided that we can run this data challenge without this change to bggen. David pointed out that since hdgeant is single-threaded at present, an event-by-event seed is not necessary to ensure reproducibility.
Non-Reproducible Reconstruction
There has been a lot of work on this problem since the last data challenge meeting, but no fixes have been found. The incidence appears to be less than one event per thousand, and the differences seen between repeated analyses are slight. These differences are also limited to individual events; for the most part subsequent events in the file give identical results. We decided that this problem should not impact analysis of the resulting data and that we can live with it for this data challenge.
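One way to quantify this kind of incidence is to compare two reconstruction passes over the same input event by event. The sketch below is only an illustration under stated assumptions: it takes each pass to be a dictionary mapping event number to a tuple of reconstructed values, which is not the REST file format, and the data are made-up toy values rather than results from the data challenge.

```python
def compare_passes(first, second):
    """Return the event numbers that differ between two passes and their fraction.

    `first` and `second` map event number -> tuple of reconstructed values.
    Only events present in both passes are compared.
    """
    common = first.keys() & second.keys()
    differing = sorted(n for n in common if first[n] != second[n])
    fraction = len(differing) / len(common) if common else 0.0
    return differing, fraction


# Toy data: 10,000 events, with one event perturbed slightly in the second pass.
pass_a = {n: (n * 0.5, n % 7) for n in range(1, 10001)}
pass_b = dict(pass_a)
pass_b[4242] = (pass_a[4242][0] + 1e-6, pass_a[4242][1])

differing, fraction = compare_passes(pass_a, pass_b)
print(differing, fraction)
```

Keying the comparison on event number rather than on position in the file keeps a single differing event from being confused with a shift in all subsequent events.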
Data Challenge Status: are we ready to freeze?
We confirmed that the deadline for changes/improvements is noon Thursday, March 20, but at present there is nothing stopping us from freezing on the current version of the trunk.
Paul reminded us to make sure that the configuration files we are using at the various sites are those checked into trunk/data_challenge/02/conditions, and described at
He recently fixed the beam photon energy range to be consistent with what was agreed on and updated the particle.dat file used by bggen to its most recent version.
Other Data Challenge Issues
Sean asked about two aspects that we have not quite nailed down.
- Monitoring: what are we planning to do to monitor data integrity?
- Data Distribution: how do we plan to distribute the resulting data to data analyzers?
Although ideally these issues would have been dealt with by now, we noted that we could start with processing as long as we address them early in the course of production. We agreed to discuss them further at the...
Next Data Challenge Meeting
We will meet on Friday at 11:00 am, as the current schedule calls for. Note that by that time the code and configuration files will have been frozen, so in principle production may have started at the sites. This meeting will be a chance to evaluate how things are going.