Latest revision as of 10:12, 12 December 2024

Transition to JANA2

This page will document the transition of the GlueX software stack from JANA1 to JANA2.

Useful Links

Discussion Points

  • How do we treat halld_sim and hdgeant4 developments for previous run periods? Can we patch old halld_recon versions?
    A: Need fine-grained version set selection in MCWrapper

halld_recon

Prerequisite Tests

  • Objects: hd_dump
    • DTrackTimeBased
    • DTrackWireBased
    • DBCALShower
    • DVertex
    • DChargedTrack
  • Plugins: hd_root
    • monitoring_hists
hd_root -PPLUGINS=monitoring_hists /cache/halld/RunPeriod-2017-01/rawdata/Run030300/hd_rawdata_030300_000.evio
    • occupancy_online
hd_root -PPLUGINS=occupancy_online /cache/halld/RunPeriod-2017-01/rawdata/Run030300/hd_rawdata_030300_000.evio
    • danarest
hd_root -PPLUGINS=danarest /cache/halld/RunPeriod-2017-01/rawdata/Run030300/hd_rawdata_030300_000.evio
    • p2pi_hists, p3pi_hists: Output histograms can be checked with macros HistMacro_p2pi.C, HistMacro_p3pi.C
hd_root -PPLUGINS=p2pi_hists,p3pi_hists /cache/halld/RunPeriod-2017-01/rawdata/Run030300/hd_rawdata_030300_000.evio
hd_root --config=/group/halld/www/halldweb/html/talks/2024/jana2/jana_test.config /cache/halld/RunPeriod-2017-01/rawdata/Run030300/hd_rawdata_030300_000.evio
    • mcthrown_tree (for MC)
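
The plugin checks above can be generated in one pass. The following is a minimal sketch and not part of the GlueX tooling: the function name print_plugin_tests is hypothetical, and it only prints the hd_root command lines (a dry run) so that they can be inspected first.

```shell
#!/bin/bash
# Hypothetical helper: print one hd_root command per plugin set listed above.
# It only prints the commands (dry run); pipe the output to "bash" to execute.
INPUT=/cache/halld/RunPeriod-2017-01/rawdata/Run030300/hd_rawdata_030300_000.evio

# $1: input file, remaining arguments: comma-separated plugin sets
print_plugin_tests() {
    local input="$1"; shift
    local plugins
    for plugins in "$@"; do
        echo "hd_root -PPLUGINS=${plugins} ${input}"
    done
}

print_plugin_tests "$INPUT" monitoring_hists occupancy_online danarest p2pi_hists,p3pi_hists
```

Printing instead of executing keeps the sketch safe to run on a machine without the GlueX environment.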

Additional Tests

Benchmarking Results

JANA2 benchmark.png


hdgeant4

hdgeant4 simulates the detector response (detector hits) to the events produced by a generator. It is steered by a control.in file in the local directory and processes the generated hddm file. When control.in and the input file are present in the local directory, simply execute

hdgeant4

halld_sim

hdgeant

The same input and control.in that were used for hdgeant4 can also be used for hdgeant(3):

hdgeant

mcsmear

mcsmear models the detector resolution so that the MC simulation results match actual measurements. It takes the simulated hddm file as input and can be executed by

mcsmear gen_amp_030730_000_geant4.hddm

b1pi test

The b1pi test runs the full simulation and reconstruction chain for

gamma p -> p X
             X -> b1 pi-
                  b1 -> omega pi+
                        omega -> pi+ pi- pi0

Usage:

 b1pi_test.sh [-n <number of events>] [-t <number of threads>] [-r <run number>]\
   [-v <vertex string>] [-d <b1pi_test script directory>]

The script performs these steps:

  1. genr8: Event generator, part of the halld_sim repository, output: b1_pi.ascii
  2. genr8_2_hddm: Convert output of genr8 to hddm file, in halld_sim repository, output: b1_pi.hddm
  3. hdgeant4: Running simulation with geant4, output: hdgeant4.hddm
  4. mcsmear: Adding detector effects to simulation, part of halld_sim repo, output: hdgeant_smeared.hddm
  5. hd_root with danarest plugin: Running reconstruction, part of halld_recon, output: dana_rest.hddm
  6. hd_root with b1pi_hists, monitoring_hists plugins: Analysis, part of halld_recon, output: hd_root.root
  7. root mk_pics.C: create plots and save them as pdf and gif
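
After a run, it can be useful to check which of the steps above produced its output. The following is a minimal sketch under the assumption that all files listed above end up in one working directory; the helper name missing_b1pi_outputs is hypothetical and not part of b1pi_test.sh.

```shell
#!/bin/bash
# Output file names are taken from the step list above.
B1PI_OUTPUTS="b1_pi.ascii b1_pi.hddm hdgeant4.hddm hdgeant_smeared.hddm dana_rest.hddm hd_root.root"

# Hypothetical helper: print the expected output files that are missing
# or empty in the directory given as $1 (default: current directory).
missing_b1pi_outputs() {
    local dir="${1:-.}" f
    for f in $B1PI_OUTPUTS; do
        [ -s "${dir}/${f}" ] || echo "$f"
    done
}

missing_b1pi_outputs .
```

The first file name printed points to the step where the chain stopped, since each step consumes the output of the previous one.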

Example:

 source /group/halld/Software/build_scripts/gluex_env_boot_jlab.sh
 gxenv /group/halld/www/halldweb/html/halld_versions/version_5.21.1_jana2.xml
 export B1PI_TEST_DIR=/group/halld/Software/hd_utilities/b1pi_test/
 export SEED=123
 export JANA_CALIB_CONTEXT="variation=mc"
 $B1PI_TEST_DIR/b1pi_test.sh -n 10000 -r 30480 -4

In the following plots, we compare a few results from JANA1 (left) and JANA2 (right):

Jana1 jana2 reconstructed photons.png
Jana1 jana2 reconstructed protons.png
Jana1 jana2 reconstructed X2000.png

The remaining differences are caused by not using exactly the same halld_sim version and by the changed behavior of the Get() function in JANA2.

Monitoring launch

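As a next step, we would like to run a full monitoring launch with JANA2. We prepared a configuration file (https://halldweb.jlab.org/talks/2024/jana2/jana_offmon.config) with the necessary changes to the parameters (https://jeffersonlab.github.io/JANA2/#/jana1to2/jana1-to-jana2?id=parameter-changes). It can be processed with data from the GlueX-II 2023-01 run period, e.g.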

hd_root --loadconfigs jana_offmon.config /cache/halld/RunPeriod-2023-01/rawdata/Run121102/hd_rawdata_121102_018.evio

We observe several different failure modes:

  • Plugin not compatible with JANA2, fails already during loading: BCAL_TDC_Timing, pi0fcaltofskim
  • Crash while running: TOF_online, TOF_TDC_shift, BCAL_inv_mass, HLDetectorTiming, FCAL_invmass, trackeff_missing, CDC_dedx
  • Segfault at the end: BCAL_online, PS_flux, BCAL_Eff, BCAL_attenlength_gainratio
  • Infinite loop at event 0 in combination with monitoring_hists: fa125_itrig
  • Infinite loop or crash: TAGM_TW
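
When plugins are tested one at a time, the shell exit status already gives a coarse hint at the failure mode. The following is a minimal sketch, not the procedure used for the list above: the helper name classify_exit_status is hypothetical, and it assumes that the job is started under the coreutils timeout command, which exits with status 124 when the time limit is reached.

```shell
#!/bin/bash
# Hypothetical helper: map a shell exit status to a coarse failure category.
# A status above 128 means the process was killed by signal (status - 128):
# 134 = 128 + 6 (SIGABRT), 139 = 128 + 11 (SIGSEGV).
classify_exit_status() {
    local status="$1"
    case "$status" in
        0)   echo "ok" ;;
        124) echo "timeout (possible infinite loop)" ;;
        134) echo "abort (SIGABRT)" ;;
        139) echo "segfault (SIGSEGV)" ;;
        *)   echo "failed (exit status ${status})" ;;
    esac
}

# Intended use (the time limit of 600 s is an arbitrary placeholder):
#   timeout 600 hd_root -PPLUGINS=TAGM_TW <input file>; classify_exit_status $?
# Demonstration with a stand-in command instead of hd_root:
true
classify_exit_status $?
```

The exit status alone cannot tell a crash while running from a segfault at the end of the job; the log output is still needed for that distinction.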

All other plugins appear to run, tested with 4 threads and 10k events:

PLUGINS occupancy_online,highlevel_online,danarest,monitoring_hists,TAGH_online,TAGM_online,TAGM_clusters,BEAM_online,CDC_online,CDC_Efficiency,FCAL_online,FDC_online,FDC_Efficiency,ST_online_lowlevel,lowlevel_online,PS_online,PSC_online,PSPair_online,TPOL_online,BCAL_Hadronic_Eff,FCAL_Hadronic_Eff,p2pi_hists,p3pi_hists,ppi0gamma_hists,TRIG_online,CDC_drift,RF_online,CDC_expert_2,L1_online,FCAL_TimingOffsets_Primex,p4pi_hists,p2k_hists,CDC_TimeToDistance,TOF_calib,CDC_amp,TPOL_tree,evio_writer,randomtrigger_skim,syncskim,imaging,TAGH_timewalk,TrackingPulls,lumi_mon,PS_timing,ST_Tresolution,dirc_hists,dirc_reactions,DIRC_online

For these jobs, we compare the resident memory footprint between version_5.21.0.xml (jana1) and version_5.21.1_jana2.xml (jana2):
Jana2 memory benchmark.PNG

A similar picture appears with only monitoring_hists:

hd_root -PPLUGINS=monitoring_hists -PMONITOR:MEMORY_EVENTS=10000 -Pjana:nevents=10000 -PNTHREADS=4 /cache/halld/RunPeriod-2023-01/rawdata/Run121102/hd_rawdata_121102_018.evio

Jana2 memory benchmark mh.PNG
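
The resident memory of a running job can also be sampled from outside the process. The following is a minimal sketch and not the method used for the plots above: the helper names rss_kb and sample_rss are hypothetical, and it relies on the rss output column of ps, which is available in procps (Linux) and BSD ps and is reported in kilobytes.

```shell
#!/bin/bash
# Hypothetical helper: print the resident set size (in kB) of process $1.
rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Hypothetical helper: print "<unix time> <rss in kB>" once per interval
# until the process exits.
# $1: PID, $2: sampling interval in seconds (default 1)
sample_rss() {
    local pid="$1" interval="${2:-1}"
    while kill -0 "$pid" 2>/dev/null; do
        echo "$(date +%s) $(rss_kb "$pid")"
        sleep "$interval"
    done
}

# Intended use: start hd_root in the background, then run: sample_rss $! 10
# Demonstration on the current shell instead of hd_root:
rss_kb $$
```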

HOW-TO use gdb

gdb --args hd_root -PPLUGINS=HLDetectorTiming /cache/halld/RunPeriod-2023-01/rawdata/Run121102/hd_rawdata_121102_018.evio
(gdb) catch throw (optional)
(gdb) run
(gdb) continue (many times)
(gdb) bt
#0  0x00007ffff48ad7c2 in __cxa_throw () from /lib64/libstdc++.so.6
#1  0x000000000087a7e7 in JFactoryT<DEventRFBunch>* JEvent::GetSingle<DEventRFBunch>(DEventRFBunch const*&, char const*, bool) const ()
#2  0x00007fffe3e6d776 in JEventProcessor_HLDetectorTiming::Process(std::shared_ptr<JEvent const> const&) ()
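
For unattended tests, the session above can be approximated non-interactively. The following is a minimal sketch: the wrapper name gdb_backtrace_cmd is hypothetical, and it uses only the standard gdb options -batch, -ex and --args. It omits catch throw, so it does not stop at every exception thrown along the way and prints the backtrace only where the program finally stops (crash or abort). The function prints the command line instead of executing it.

```shell
#!/bin/bash
# Hypothetical wrapper: print a gdb command line that runs the given program
# in batch mode and prints a backtrace at the point where it stops.
gdb_backtrace_cmd() {
    echo gdb -batch -ex run -ex bt --args "$@"
}

gdb_backtrace_cmd hd_root -PPLUGINS=HLDetectorTiming \
    /cache/halld/RunPeriod-2023-01/rawdata/Run121102/hd_rawdata_121102_018.evio
```

If the program exits normally, gdb has no stack to print, so an empty backtrace indicates that no crash occurred.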