GlueX Software Meeting, January 21, 2020

GlueX Software Meeting
Tuesday, January 21, 2020
3:30 pm EST
JLab: CEBAF Center A110
BlueJeans: 968 592 007

Agenda

  1. Announcements
    1. version_4.13.0.xml
    2. new version sets for recon launch consistency using halld_sim-4.12.0 et al.
  2. Review of Minutes from the Last Software Meeting (all)
  3. Review of Minutes from the Last HDGeant4 Meeting (all)
  4. Default CCDB server on farm: switch back to sqlite
  5. Review of recent issues and pull requests:
    1. halld_recon
    2. halld_sim
    3. CCDB
    4. RCDB
  6. Review of recent discussion on the GlueX Software Help List (all)
  7. Action Item Review (all)

Minutes

Present:

  • CMU : Naomi Jarvis
  • FSU : Sean Dobbs
  • JLab : Alex Austregesilo, Mark Ito (chair), Igal Jaegle, David Lawrence, Keigo Mizutani, Justin Stevens, Simon Taylor
  • ODU : Nilanga Wickramaarachchi

There is a recording of this meeting on the BlueJeans site. Use your JLab credentials to access it.

Announcements

Mark reminded us about the recent upgrade release for halld_sim and hdgeant4 (version_4.13.0.xml), and the corresponding recon launch releases.

Review of Minutes from the Last Software Meeting

We went over the minutes from the meeting on January 7. As part of the deployment of the new Lustre-based disk space, we will be moving our volatile partition to a home on a new system and doubling its size, from 60 TB to 120 TB. Mark will send out an announcement on how this will work. The change is scheduled for February 4.

Review of Minutes from the Last HDGeant4 Meeting

We went over the minutes from January 14. Peter Pauli posted a study under Issue #93: Calorimeter timing mismatch between g3 and g4. He sees a significant difference in timing between G3 and G4 for the slow kaon in his reaction, not unlike that seen in other channels.

Default CCDB server on farm: switch back to sqlite

Mark described the problem that led to the switch back to using SQLite files for farm jobs. The root problem arises when a couple thousand jobs start on the farm at the same time: the MySQL servers become CPU-bound and jobs have to wait for their constants (delays of an hour or so when things are bad). Under lesser loads, there is no problem. Three paths are being pursued to fix the problem:

  1. Marty Wise of the Computer Center has called a meeting for tomorrow to discuss adding more servers to hallddb-farm (a DNS alias for a combination of hallddb-a and hallddb-b.jlab.org).
  2. CCDB 2.0 has a new feature: creation of an intermediate table that does the join needed for calibration constant retrieval in advance. Dmitry Romanov and Mark are working on getting this version deployed.
  3. Mark has started looking at the idea David proposed some time ago of making a smaller database that services only a subset of run numbers. Mark succeeded in making a version of CCDB, valid for a single run, with an "assignments" table a factor of 20 smaller than nominal and about a quarter of the disk footprint. Performance improvements have not been measured. This effort is in its early stages.

We will likely settle on some combination of these approaches.
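
As a rough illustration of approaches 2 and 3, the sketch below uses Python's built-in sqlite3 module on a toy schema. The table and column names (run_ranges, constant_sets, assignments, prejoined) and the sample constants are hypothetical and much simpler than the real CCDB schema; variations and timestamps are ignored. It only shows the general idea: do the join once, store the result in an intermediate table, and serve lookups (or a single-run extract) from that table.

```python
import sqlite3

def build_toy_db():
    """Create an in-memory toy database (hypothetical schema, not the CCDB one)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE run_ranges (id INTEGER PRIMARY KEY, run_min INTEGER, run_max INTEGER);
        CREATE TABLE constant_sets (id INTEGER PRIMARY KEY, name TEXT, payload TEXT);
        CREATE TABLE assignments (id INTEGER PRIMARY KEY,
                                  run_range_id INTEGER, constant_set_id INTEGER);
    """)
    con.executemany("INSERT INTO run_ranges VALUES (?, ?, ?)",
                    [(1, 1, 100), (2, 101, 200)])
    con.executemany("INSERT INTO constant_sets VALUES (?, ?, ?)",
                    [(1, "bcal/gains", "1.0 1.1"), (2, "bcal/gains", "0.9 1.2")])
    con.executemany("INSERT INTO assignments VALUES (?, ?, ?)",
                    [(1, 1, 1), (2, 2, 2)])
    return con

# The join that every constants request would otherwise repeat.
JOIN = """
    SELECT c.name AS name, r.run_min AS run_min, r.run_max AS run_max,
           c.payload AS payload
    FROM assignments a
    JOIN run_ranges r ON r.id = a.run_range_id
    JOIN constant_sets c ON c.id = a.constant_set_id
"""

def lookup_with_join(con, name, run):
    """Look up constants by joining the three tables at request time."""
    row = con.execute(
        "SELECT payload FROM (" + JOIN + ") WHERE name = ? AND ? BETWEEN run_min AND run_max",
        (name, run)).fetchone()
    return row[0] if row else None

def build_prejoined(con):
    """Do the join once and store the result in an indexed intermediate table."""
    con.execute("CREATE TABLE prejoined AS " + JOIN)
    con.execute("CREATE INDEX prejoined_idx ON prejoined (name, run_min, run_max)")

def lookup_prejoined(con, name, run):
    """Look up constants from the intermediate table, with no join at request time."""
    row = con.execute(
        "SELECT payload FROM prejoined WHERE name = ? AND ? BETWEEN run_min AND run_max",
        (name, run)).fetchone()
    return row[0] if row else None

def single_run_subset(con, run):
    """Extract only the rows valid for one run, as a stand-in for a per-run database."""
    return con.execute(
        "SELECT name, payload FROM prejoined WHERE ? BETWEEN run_min AND run_max",
        (run,)).fetchall()
```

Both lookup functions return the same payload for a given name and run; the difference is only where the join cost is paid. Whether this helps under farm-scale load is exactly what has not yet been measured.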

  • Alex pointed out that his launches use an SQLite file that is used only for launches and thus present no load to either the database servers or the SQLite mirrors on the work disk.
  • Justin noted that we could separate the load on the servers on a project-by-project basis: certain launches could use hallddb-farm while the individual user gets directed to the SQLite mirrors.
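
Justin's suggestion amounts to choosing the CCDB connection per project. A minimal sketch of that selection follows. The function name, the project names, and both connection strings are placeholders for illustration: the fully qualified host name, user name, database name and SQLite path are assumptions, not the actual farm configuration.

```python
# Placeholder connection strings (assumed values, not the real configuration).
MYSQL_FARM = "mysql://ccdb_user@hallddb-farm.jlab.org/ccdb"
SQLITE_MIRROR = "sqlite:////path/to/mirror/ccdb.sqlite"

def ccdb_connection_for(project, server_projects):
    """Return the server connection for designated projects, the SQLite mirror otherwise."""
    return MYSQL_FARM if project in server_projects else SQLITE_MIRROR
```

With a hypothetical set such as {"recon_launch"}, jobs from that project would be pointed at hallddb-farm and everything else at the SQLite mirrors.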

Review of recent issues and pull requests

  • halld_sim Pull Request #111 Ijaegle primex. Sean expressed concern about a generator, built by default, that requires execution of shell scripts with non-generic paths and a gcc version greater than or equal to 5.0. Igal explained that the non-default-ness would not hamper builds of the master branch, but may not give a usable product for amateurs.
  • RCDB is_dirc_production queries. These do not work at present on the website. Dmitry Romanov has been contacted via email.

Review of recent discussion on the GlueX Software Help List

We reviewed the list without comment.

Action Item Review

  1. Fix is_dirc_production on the RCDB webpage (Dmitry)
  2. Get Igal's PrimEx generator working on the farm (Igal, Mark)
  3. Fix the database server so that CCDB access is not an issue (Mark et al.)
  4. Announce the move to a new volatile Lustre system (Mark)