GlueX Software Meeting, October 30, 2018
GlueX Offline Software Meeting
Tuesday, October 30, 2018
3:00 pm EDT
JLab: CEBAF Center A110
BlueJeans: 968 592 007
- 1 Agenda
- 2 Communication Information
- 3 Minutes
- 3.1 Announcements
- 3.2 Review of minutes from the October 16 meeting
- 3.3 Monitoring launch ver 18 @ NERSC
- 3.4 What I learned at BRNMW 2018
- 3.5 Events lost due to vertex fit for Lambda
- 3.6 CCDB versions and SQLite file management
- 3.7 Review of Offline Work Packages
- 3.8 Review of recent issues and pull requests
- 3.9 Review of recent discussion on the GlueX Software Help List
- Review of minutes from the October 16 meeting (all)
- Monitoring launch ver 18 @ NERSC (David)
- What I learned at BRNMW 2018 (David)
- Events lost due to vertex fit for Lambda (Hao)
- CCDB versions and SQLite file management (all)
- Communication: write to software help when things are broken
- Finding CCDB SQLite backups: make links like "current", "yesterday", "last-week"
- Move backups to a more obvious place: Same place as the latest = $DIST = /group/halld/halldweb/www/html/dist ?
- Reorganize $DIST?
- Review of Offline Work Packages
- Review of recent issues and pull requests:
- Review of recent discussion on the GlueX Software Help List (all)
- Action Item Review (all)
- The BlueJeans meeting number is 968 592 007.
- Join the Meeting via BlueJeans
Talks can be deposited in the directory
/group/halld/www/halldweb/html/talks/2018 on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2018/.
- CMU: Hao Li, Reinhard Schumacher
- FSU: Sean Dobbs
- JLab: Ashley Ernst, Mark Ito (chair), David Lawrence, Simon Taylor, Beni Zihlmann
- W&M: Justin Stevens
There is a recording of this meeting on the BlueJeans site. Use your JLab credentials to access it.
- New release of halld_sim: version 3.6.0. Released October 22. Has changes from Colin Gleason for amplitude-based generators.
- New bug-fix release: halld_sim 3.1.1. A special release to get genEtaRegge going.
Review of minutes from the October 16 meeting
We reviewed the minutes.
- Computing Review. Curtis Meyer has circulated a list of slide titles he proposes for the Hall D presentation. David gave feedback. Now only adding content remains.
Monitoring launch ver 18 @ NERSC
David reviewed his recent email.
- The monitoring launch at NERSC is mostly finished. There are a few jobs where 271 threads finish successfully but the 272nd never does. This seems to happen with particular input files. The same jobs were tried on the gluons and ran without a problem. Recall, however, that at NERSC they ran on the KNL architecture. David is requesting an interactive node at NERSC to do a test.
- Bryan Hess has found a few more places in the network where the MTU had room for increase. He is curious about how the new configuration will perform.
- The problem with missing cache files causing jobs to hang remains, likely due to deletion from the auto-cache cleaner. David will pursue solutions with Chris Larrieu.
- David is thinking about what to do next. Sean is in favor of a monitoring launch using 10 files per run dispersed throughout the Spring 2018 run. David told Bryan that he is thinking of firing up another launch at the end of next week.
- Beni asked about tape robot problems. David has not seen any lately, but heavy tape use during data taking may not have occurred recently.
- David and Mark will discuss the version XML file that David is using on NERSC.
What I learned at BRNMW 2018
David attended the Basic Research Needs for Microelectronics Workshop last week. Turns out he learned quite a lot, from topics like DNA memory technology to others like hotel meeting room door sound standards. Please see his slides for the details.
Events lost due to vertex fit for Lambda
Hao gave us a detailed look at the problem and proposed solutions to the issue where kinematic fits with vertex constraints do not converge. See his slides for the details.
- He described the channel where the problem seems especially severe, γp→ΛΛ̅p.
- He sees a deficit in the π− lab polar angle at around 40 degrees, where the yield drops to nearly zero.
- The fits fail when the ROOT linear algebra package encounters a singular matrix.
- Two solutions were tried:
- Multiply all matrix elements by 10,000 before testing the determinant, then rescale the results back down as appropriate.
- Use tolerance tuning: change the tolerance ROOT uses to call a determinant "zero". This tolerance is user settable.
- Both methods recover much of the deficit, with tolerance tuning showing better results.
- The fix has been put in a branch (kinFitter_debug) of halld_recon so others can try it.
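The two approaches can be illustrated with a toy example. This is only a sketch, not the code in the kinFitter_debug branch and not ROOT's implementation: the function names, the 2x2 matrix, and the tolerance value (taken here as roughly double-precision epsilon) are all assumptions made for illustration.

```python
# Toy illustration only: a 2x2 inverse with a determinant test, standing in
# for the singularity check made inside a linear algebra package.

def invert_2x2(m, tol):
    """Invert a 2x2 matrix; raise ValueError if |det| is below tol."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < tol:
        raise ValueError("matrix treated as singular: |det| = %g" % abs(det))
    return [[d / det, -b / det], [-c / det, a / det]]


def invert_scaled(m, tol, scale=1.0e4):
    """Approach 1: scale the elements up before the test, then rescale."""
    scaled = [[scale * x for x in row] for row in m]
    inv_scaled = invert_2x2(scaled, tol)
    # inverse(scale * M) = inverse(M) / scale, so multiply by scale to undo it
    return [[scale * x for x in row] for row in inv_scaled]


# Hypothetical matrix: perfectly invertible, but with very small elements.
small = [[2.0e-9, 0.0], [0.0, 5.0e-9]]
default_tol = 2.2e-16  # assumed default tolerance for this sketch

# Plain inversion fails: det = 1e-17 is below the assumed tolerance.
try:
    invert_2x2(small, default_tol)
except ValueError as err:
    print("plain inversion failed:", err)

# Approach 1: scaling by 1e4 raises the 2x2 determinant by a factor of 1e8.
inv_by_scaling = invert_scaled(small, default_tol)

# Approach 2: tolerance tuning, i.e. lower the threshold for "zero".
inv_by_tolerance = invert_2x2(small, tol=1.0e-30)
```

For an n x n matrix, scaling every element by s multiplies the determinant by s^n, so the effect of a fixed scale factor depends on the matrix dimension. Tolerance tuning changes the threshold directly.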
From the discussion:
- Mark worried that the input covariance matrix for the measured kinematic quantities may have errors that lead to this problem. Fixes that give fit convergence may mask a deeper problem.
- Reinhard has done some research on the subject. Other experiments have encountered similar problems. Even if there are problems with the covariance matrix formation, the proposed fixes should remain in place. They give us a more robust approach to kinematic fitting in general.
- Reinhard also pointed out that having others try out the fix would answer concerns about the effect on execution time and precision of results.
- Justin suggested trying the branch in the context of an analysis launch as a test.
CCDB versions and SQLite file management
Mark led a discussion prompted by an incident where the CCDB had errors in one of its calibration sets (a translation table). From the agenda:
- Communication: folks should write to the Software Help List when things are broken so others can avoid problems.
- Finding CCDB SQLite backups: to make it easier to find the backup versions we could create links like "ccdb_current", "ccdb_yesterday", and/or "ccdb_last-week".
- Move backups to a more obvious place: We could move the backup versions into the same directory as the latest version, i.e., /group/halld/halldweb/www/html/dist, which is web-accessible.
- We could reorganize that directory. There is a lot of heterogeneous stuff there. Any re-org would break old webpage links and scripts, however.
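The link idea could be sketched as follows. Nothing here was decided at the meeting: the dated file naming scheme (ccdb_YYYY-MM-DD.sqlite), the ".sqlite" suffix on the links, and the function name are assumptions made for illustration only.

```python
import datetime
import os


def refresh_links(backup_dir, today):
    """Point ccdb_current, ccdb_yesterday and ccdb_last-week at dated backups.

    Assumes backups are named ccdb_YYYY-MM-DD.sqlite (hypothetical scheme).
    Returns a dict mapping each link name to the file it now points at.
    """
    offsets = {"ccdb_current": 0, "ccdb_yesterday": 1, "ccdb_last-week": 7}
    made = {}
    for link_name, days_back in offsets.items():
        date = today - datetime.timedelta(days=days_back)
        target = "ccdb_%s.sqlite" % date.isoformat()
        if not os.path.exists(os.path.join(backup_dir, target)):
            continue  # no backup for that day; leave any existing link alone
        link_path = os.path.join(backup_dir, link_name + ".sqlite")
        tmp_path = link_path + ".tmp"
        if os.path.lexists(tmp_path):
            os.remove(tmp_path)
        os.symlink(target, tmp_path)     # relative link within the directory
        os.replace(tmp_path, link_path)  # swap the new link into place
        made[link_name] = target
    return made
```

A nightly job could call refresh_links(directory, datetime.date.today()) after the backup is written. Building the link under a temporary name and renaming it means readers never see a missing link while it is being updated.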
There was actually not a lot of discussion to lead. People with strong feelings should contact Mark.
Review of Offline Work Packages
Mark updated the list of packages based on discussion from the last meeting. We took a look at the revised list. Items now appear in two lists: "Analysis Software" and "Software Infrastructure". Only the assignment of volunteers remains to be done.
Review of recent issues and pull requests
- Issue #40: Sean broke the tracking again... Problem seen with ρ yield. Sean is looking at it.
- Issue #30: Trigger monitoring. Sean has ideas to help identify LED triggers that get flagged as physics.
- Pull Request #43: fix NAN in return value of walk correction code. Make sure that the square root takes a positive value larger than zero. Beni eliminated the problem that occurs when the ADC amplitude is non-positive.
- Issue #16: mcsmear crash. There is a problem when RCDB cannot find a file for two runs from Fall 2016. Sean filed this as an RCDB issue.
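The kind of guard described for Pull Request #43 might look like the sketch below. The functional form c0 / sqrt(A), the parameter c0, and the choice to return zero correction are hypothetical. The actual walk correction and its fallback are in the pull request, not reproduced here.

```python
import math


def walk_correction(adc_amplitude, c0=1.0):
    """Hypothetical walk correction c0 / sqrt(A), guarded against A <= 0.

    Returns 0.0 (no correction) when the amplitude is zero, negative, or NaN,
    so the square root only ever sees a strictly positive value.
    """
    if not adc_amplitude > 0.0:  # the negated comparison also catches NaN
        return 0.0
    return c0 / math.sqrt(adc_amplitude)
```

In C++ an unguarded sqrt of a negative number returns NaN silently, which then propagates into the corrected time. In Python math.sqrt raises ValueError instead, but the guard is the same in either language.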
Review of recent discussion on the GlueX Software Help List
We went over the list. No discussion of note.