GlueX Offline Meeting, August 3, 2016

From GlueXWiki
Revision as of 12:30, 17 August 2016 by Marki (Talk | contribs) (sim1.1)

GlueX Offline Software Meeting
Wednesday, August 3, 2016
1:30 pm EDT
JLab: CEBAF Center F326/327

Agenda

  1. Announcements
    1. Database Server upgrade: 5.5.50-MariaDB
    2. sim-recon-2.2.1 (Mark)
    3. Analysis TTree: Unused Tracks/Showers No Longer Saved (Paul)
  2. Review of minutes from the last meeting (all)
  3. Report from SciComp Meeting on July 21 (Mark)
  4. Report from Computing Round Table on August 2 (Mark)
  5. Analysis Launch (Alex)
  6. Next Monitoring Launch (Paul/Alex A./Sean)
  7. sim1.1 (Sean, Mark)
  8. ROOT 6 upgrade? (Mark)
  9. Review of recent pull requests (all)
  10. Review of recent discussion on the GlueX Software Help List
  11. Action Item Review

Communication Information

Remote Connection

Slides

Talks can be deposited in the directory /group/halld/www/halldweb/html/talks/2016 on the JLab CUE. This directory is accessible from the web at https://halldweb.jlab.org/talks/2016/ .

Minutes

Present:

  • CMU: Naomi Jarvis, Curtis Meyer
  • IU: Matt Shepherd
  • JLab: Alexander Austregesilo, Amber Boehnlein, Mark Ito (chair), Paul Mattione, Nathan Sparks, Justin Stevens, Simon Taylor, Beni Zihlmann
  • MIT: Maria Patsyuk, Cristiano Fanelli
  • NU: Sean Dobbs
  • Regina: Tegan Beattie

You can view a recording of this meeting on the BlueJeans site.

Announcements

  1. Database Server upgrade: 5.5.50-MariaDB. The upgrade was completed successfully two weeks ago by Marty Wise of CNI.
  2. sim-recon-2.2.1. This sub-minor release fixes the double-free bug for the 2.2 branch. This is now being used by sim1.1.
  3. Analysis TTree: Unused Tracks/Showers No Longer Saved. By default, unused hypotheses are no longer saved in particle combos. The old behavior can be recovered by setting an argument to the tree-output-enabling function. This results in a 40% reduction in disk space for the recent Analysis Launch.

Review of minutes from the last meeting

We went over the minutes from the meeting on July 20.

  • Managing Plugins. We agreed that, at least in the short term, we should pursue options 2 and 3 from the list:
    • 2. Reduce the number of plugins that are built automatically. (Justin)
    • 3. Have finer-grained build targets for SCons so that parts of the build can be skipped depending on the needs of the user. (Sean)
  • Bad files in Lustre. Mark and Alex have reported a few more bad files resulting in lost data. Alex has seen files go bad since the Lustre upgrade.

Report from SciComp Meeting on July 21

Mark reviewed the agenda from the meeting.

  • We are getting 48 new farm nodes, with 36 cores each, and 190 TB of disk.
  • A software review is coming in November.
  • There may be an overlap between a planned down of the tape library and Hall D running.

At this SciComp meeting Mark brought up the idea of charging for memory usage in the fairshare algorithm. Memory is a scarce resource; cores often go idle because a farm node cannot accommodate more jobs due to memory. There was some discussion of how this could be done.
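
To make the idle-core effect concrete, here is a minimal sketch of one way a memory-aware charge could be computed. It is not the scheme discussed at the meeting, whose details were not settled. The function names, the 72 GB node memory, and the job sizes are all hypothetical; only the 36-core node size is taken from the new farm nodes mentioned above.

```python
# Hypothetical sketch: none of these names or numbers come from the farm's
# actual scheduler. Only the 36-core node size is taken from the minutes.

def jobs_per_node(node_cores, node_mem_gb, job_cores, job_mem_gb):
    """Number of identical jobs that fit on one node (core- or memory-limited)."""
    by_cores = node_cores // job_cores
    by_memory = int(node_mem_gb // job_mem_gb)
    return min(by_cores, by_memory)


def idle_cores(node_cores, node_mem_gb, job_cores, job_mem_gb):
    """Cores left unused once the node is full of identical jobs."""
    n_jobs = jobs_per_node(node_cores, node_mem_gb, job_cores, job_mem_gb)
    return node_cores - n_jobs * job_cores


def charged_cores(node_cores, node_mem_gb, job_cores, job_mem_gb):
    """Charge a job for the larger of its core request and its memory
    request, with memory converted to core-equivalents (assumed rule)."""
    mem_per_core = node_mem_gb / node_cores
    return max(job_cores, job_mem_gb / mem_per_core)


# Assumed node: 36 cores, 72 GB (2 GB per core).
# Single-threaded jobs needing 4 GB each leave half the cores idle.
print(idle_cores(36, 72, 1, 4), charged_cores(36, 72, 1, 4))
# A 6-thread job sharing 6 GB fills the node and is charged only for its cores.
print(idle_cores(36, 72, 6, 6), charged_cores(36, 72, 6, 6))
```

Under these assumed numbers the single-threaded, memory-heavy job is charged for two cores while using one, and the multi-threaded job is charged only for the cores it uses.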

With multi-threaded JANA jobs, memory usage relative to CPU usage is efficient. Curtis remarked that we have put a lot of effort into optimizing resource usage. It was a huge commitment on our part, and all groups across the Lab should take that seriously.

Report from Computing Round Table on August 2

Mark reported on the meeting held yesterday. Wes Bethel from LBNL described an ASCR workshop he organized that addressed data challenges faced by Office of Science experimental and observational science programs.

In the morning before the meeting, Wes met separately with each of the Halls. We presented brief overviews of our computing model and mentioned "pain points" that we are feeling. Mark, Justin, Alex, and Sean participated in the Hall D session.

Amber made some remarks:

  • She solicited input on speakers and topics that might be presented at these meetings. Four have been held so far. Amber can often come up with a name from outside the Lab to address a specific topic, and speakers from inside the Lab are sought as well.
  • Wes's slides are available on the Lab's Indico site.
  • Wes has noticed similarities in the problems faced across Office of Science programs, yet the discussion at any one lab tends to emphasize differences.
  • She thinks that computing is a frontier area for the 12 GeV program. In this context, specifically in the areas of data management, computational science, and visualization, we should look to learn from others whenever possible.
  • Mark will announce future meetings to the offline group. Folks should feel free to participate when topics stimulate personal interest.

Analysis Launch

Alex described the Analysis Launch. It started last week and included contributions from 12 people. In all, 46 channels were analyzed. The jobs are finishing now. There was some competition from sim1.1; otherwise things would have gone faster. In all, there are 4 TB of ROOT trees and another 1 TB of ROOT histograms.

Alex led us through the Analysis Launch section (near bottom) of the launch analysis webpage. The link there shows various statistics and graphs of the jobs.

Next Monitoring Launch

Sean proposed that if the new tracking code is ready we should try another monitoring launch to test those and other improvements to the code. Simon thinks that the code is ready. We decided to try to do a launch this Friday.

On a related note, folks thought that the time is right for Mark to do another sim-recon release.

sim1.1

Mark reported on the status of sim1.1. 75% of the jobs and most of the resulting REST files have been written to tape.

Justin suggested running an analysis launch on the sim1.1 REST files. We endorsed the idea, but did not find a volunteer.

ROOT 6 Upgrade?

We discussed several points, but in the end agreed on the following course:

  • For now we keep ROOT 5 as the default version.
  • We maintain a parallel build of all packages against ROOT 6. Early adopters can try this out.
  • We run the next monitoring launch using the ROOT 6 version as a large scale test.
  • The assumption is that all of our public compiled code is compatible with both versions of ROOT. If that turns out not to be true, then the scheme breaks down.
  • Not all macros (they do not participate in the build procedure) will be compatible with both versions and so people running those will have to know what they are doing. Since the default remains ROOT 5, old macros will continue to work for the naive user.
  • Once we have established that using ROOT 6 in the monitoring launch is working, we will revisit the issue.
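
Since macros sit outside the build, one way a macro author could guard against running under the wrong ROOT version is an explicit version check. The sketch below is an illustration only, not something agreed at the meeting: the helper names are hypothetical, and it works on a plain version string of the form `5.34/36` or `6.06/08` so that it runs without ROOT installed.

```python
import re


def root_major_version(version_string):
    """Extract the major version from a ROOT-style string such as '6.06/08'."""
    match = re.match(r"\s*(\d+)\.", version_string)
    if match is None:
        raise ValueError("unrecognized ROOT version string: %r" % version_string)
    return int(match.group(1))


def macro_is_supported(version_string, supported_majors):
    """True if the running ROOT major version is one the macro was tested with.

    supported_majors is a hypothetical, author-maintained collection,
    e.g. {5} for an old macro or {5, 6} for one checked against both.
    """
    return root_major_version(version_string) in supported_majors


# An old macro tested only with ROOT 5 would refuse to run under ROOT 6.
print(macro_is_supported("5.34/36", {5}))
print(macro_is_supported("6.06/08", {5}))
```

In a PyROOT session the version string could be taken from `ROOT.gROOT.GetVersion()`.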

Review of recent pull requests

We reviewed the list of recent pull requests.

We mainly expressed shock and awe at the large number of commits in Simon's recent upgrade of the tracking code to include timing information from the FDC anode wires. Corresponding changes to the CCDB have been entered.

Review of recent discussion on the GlueX Software Help List

We looked over recent conversations.

Maria has the ROOT visualization of the detector working, though some features are still missing. Simon mentioned that the hdgeant++ (CERNLIB) program has tools that are useful for finding overlaps and the like.