HDGeant4 Meeting, October 23, 2018
Tuesday, October 23, 2018
JLab: CEBAF Center, A110
- JLab: Thomas Britton, Mark Ito (chair), Simon Taylor, Beni Zihlmann
- W&M: Justin Stevens
- UConn: Richard Jones
There is a recording of this meeting on the BlueJeans site. Use your JLab credentials.
Review of minutes from last time
We went over the minutes from October 9. Discussion is folded into the comments on the appropriate issues below.
Issues on GitHub
We went through the open issues for HDGeant4 on the GitHub site.
- geant3 crash. Beni reported this one. Thomas has not been able to reproduce the problem, though he (Thomas) may be using a different version of HDDS. We agreed to close this one and allow Beni and Thomas to investigate off-issue (so to speak).
- Photon energy scale. Sean reported a shift in two-photon invariant mass from the nominal. Richard is aware of the issue.
- More FDC issues?
  - Beni reminded us that he sees a loss in hits-along-the-track at the wire-based stage and he suspects that the loss is in both HDG3 and HDG4.
  - Thomas will produce distributions of hit truth times, compare them between HDG3 and HDG4, and post them.
- DIRC photon propagation time across boundaries. This is still an issue. Richard is aware of it.
- many overlaps in the GlueX geometry. Still an issue. Richard warned that there are a lot of false positives in the overlap report from Geant4. He likened finding real problems to the proverbial search for a needle in a haystack. It is on his list.
Container-intended software builds and Oasis
Richard pointed out that the GlueX software distribution in Oasis is insufficient to support builds. Mark acknowledged that only the files needed to run are provided. Mark will look at augmenting the distributed files, at least for the most current release.
Submit node MySQL connection problem
Thomas reported a problem on the OSG submit node, scosg16.jlab.org, that blocks connections from the submit node to our database server, hallddb.jlab.org. When the system is in the error state, all connections to all databases are refused, but only those from scosg16. Other nodes around the lab can connect without a problem. So far the error state has only been induced by running one of MCwrapper's Python scripts that monitors OSG job progress by updating a database. A CCPR has been submitted and several of the IT staff have had a look at the problem. They continue to work on it.
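One way to tell whether a node is in the error state is a plain TCP probe of the database port, run from both scosg16 and another node for comparison. The following is a minimal sketch, not the check the IT staff are using. The function name `can_connect` is hypothetical, and port 3306 is only the MySQL default; whether hallddb.jlab.org listens on that port is an assumption.

```python
import socket


def can_connect(host, port=3306, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Port 3306 is the MySQL default and is an assumption here; this only
    tests reachability of the port, not database authentication.
    """
    try:
        # create_connection resolves the host and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

Calling `can_connect("hallddb.jlab.org")` on the submit node and on a second node would show whether the refusal is specific to one host, independent of any MySQL client library.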
Small Files and the OSG
Richard gave us a heads-up. At some point we will run into a "small file problem": repatriating multiple small files from disparate OSG nodes. We may need to design and implement a system-like solution.
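No design was discussed at the meeting. As an illustration only, one common mitigation is to bundle many small output files into a single archive on the worker node before transfer, so that one file is repatriated instead of many. The sketch below assumes this approach; the function name `bundle_small_files` and the 1 MB size threshold are hypothetical choices, not part of any existing GlueX tool.

```python
import tarfile
from pathlib import Path


def bundle_small_files(src_dir, archive_path, max_bytes=1_000_000):
    """Pack every regular file in src_dir that is at most max_bytes into
    one uncompressed tar archive and return the list of packed names.

    The threshold max_bytes is an illustrative assumption.
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    packed = []
    with tarfile.open(archive, "w") as tar:
        for path in sorted(src.iterdir()):
            # Skip the archive itself in case it lives inside src_dir.
            if path.resolve() == archive:
                continue
            if path.is_file() and path.stat().st_size <= max_bytes:
                # Store files under their base names only.
                tar.add(path, arcname=path.name)
                packed.append(path.name)
    return packed
```

Files above the threshold are left in place and would be transferred individually.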