HOWTO convert MC data to EVIO and replay it

From GlueXWiki

Introduction

Simulated or "MC" (Monte Carlo) data produced by Hall-D's sim-recon software is in HDDM format. The Data Acquisition system (DAQ), however, will produce data in EVIO format, sometimes called "CODA" format. Development of certain software, as well as some system checks, requires data in this "raw" form. So that this development can proceed before a fully functional DAQ system is implemented and beam is in the hall, a software converter was written to generate EVIO format from HDDM. This conversion is not a simple change in file format. The nature of the data itself is different:

  1. The raw data will be digitized values in units such as ADC counts or TDC counts
  2. The raw data will be indexed by DAQ system coordinates (crate, slot, channel)

The second item requires a Translation Table to convert from the natural indexing of a detector system into the crate, slot, channel indexing of the DAQ system. For example, the BCAL uses four indices for a readout channel: module, layer, sector, and end. The Start Counter, however, requires only one: counter ID. The translation table is also needed for analyzing the raw data, since the reverse conversion from crate, slot, channel indexing into the detector-specific indexing must be done.
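To make the idea of the translation concrete, here is a minimal sketch of a lookup from DAQ coordinates to detector-specific indices. The rows, the index spellings and the function name tt_lookup are all hypothetical and chosen only for illustration; the real table lives in the CCDB (or a tt.xml file) and is not in this format.

```shell
# Hypothetical translation table rows (NOT the real table):
#   crate slot channel detector detector-specific-indices
tt_rows='1 3 0 BCAL module=1,layer=1,sector=1,end=upstream
1 3 1 BCAL module=1,layer=1,sector=1,end=downstream
2 5 7 SC counter=8'

# Look up the detector-specific indices for a DAQ address.
# $1: crate, $2: slot, $3: channel. Returns non-zero if there is no entry.
tt_lookup() {
    printf '%s\n' "$tt_rows" | awk -v c="$1" -v s="$2" -v ch="$3" \
        '$1 == c && $2 == s && $3 == ch { print $4, $5; found = 1 }
         END { exit !found }'
}

tt_lookup 2 5 7    # prints: SC counter=8
```

Note how the BCAL rows carry four indices while the Start Counter row carries one, which is why a per-detector table is needed rather than a single fixed mapping.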


Converting to EVIO

The conversion to EVIO makes use of the mc2coda library originally developed by Dave Abbott of the DAQ group. Elliott Wolin first implemented this for use in Hall-D. Those pieces were eventually wrapped into a single JANA plugin called rawevent by David Lawrence. The rawevent plugin is all that is needed to convert HDDM into EVIO. However, this is implemented as an optional build in the sim-recon scons build system. This means it is not built by default and you must build it explicitly by doing the following:

> cd $HALLD_HOME/src/programs/Utilities/plugins/rawevent
> scons -u install

Note that you will need to have EVIO 4.3 or greater installed and your EVIOROOT environment variable pointing to it. The plugin will be installed into $HALLD_HOME/$BMS_OSNAME/plugins along with all of the other sim-recon plugins.
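Since the build depends on both HALLD_HOME and EVIOROOT, a small guard around the two commands can catch a missing variable early. The following is only a sketch in Bourne-style shell syntax; the helper name require_dir is made up here, and the check does not verify that the EVIO installation is actually version 4.3 or greater.

```shell
# Hypothetical helper: succeed only if the named variable holds an existing directory.
# $1: variable name (used in the message), $2: its current value
require_dir() {
    if [ -z "$2" ]; then
        echo "error: $1 is not set" >&2
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "error: $1=$2 is not a directory" >&2
        return 1
    fi
}

# Build the plugin only when both variables look sane.
if require_dir HALLD_HOME "${HALLD_HOME:-}" && require_dir EVIOROOT "${EVIOROOT:-}"; then
    # Subshell so the caller's working directory is left unchanged.
    ( cd "$HALLD_HOME/src/programs/Utilities/plugins/rawevent" && scons -u install )
fi
```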

Use the plugin with an HDDM file as you would any other. The translation table will be read from the CCDB so make sure your JANA_CALIB_URL environment variable is set (e.g. mysql://ccdb_user@hallddb.jlab.org/ccdb). The simplest thing is to use hd_ana like this:

> hd_ana -PPLUGINS=rawevent hdgeant_smeared.hddm

This will produce a file named something like "rawevent_000002.evio".
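The example name suggests the pattern "basename, underscore, run number zero-padded to six digits, .evio". That pattern is an inference from this single example (run number 2) and from the description of RAWEVENT:FILEBASE, not a documented guarantee, so treat the helper below as a sketch. The function name expected_evio_name is hypothetical.

```shell
# Assumed naming pattern: <basename>_<6-digit run number>.evio
# $1: output basename, $2: run number as a plain decimal integer
expected_evio_name() {
    printf '%s_%06d.evio\n' "$1" "$2"
}

expected_evio_name rawevent 2    # prints: rawevent_000002.evio
```

This can be handy in scripts that need to locate the output file after the conversion finishes.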

Options

Several configuration parameters exist in the rawevent plugin that can be used to modify its behavior. These are:

Parameter Name      Description
TT:NO_CCDB          Don't try getting the translation table from CCDB; just look for the file. Only useful if you want to force reading tt.xml. This is set automatically if you specify a different filename via the TT:XML_FILENAME parameter.
TT:XML_FILENAME     Fallback filename of the translation table XML file. If set to a non-default value, CCDB will not be checked.
RAWEVENT:FILEBASE   Basename of the output EVIO file (the run number and ".evio" suffix will be appended)
RAWEVENT:TRIGTIME   Trigger time of the event in picoseconds (default is 3.2E7, i.e. 32 microseconds)
RAWEVENT:TMIN       Minimum hit time in picoseconds (default is -1E5, i.e. -100 ns)
RAWEVENT:NOMC2CODA  Set to non-zero to skip all calls to mc2coda library routines. This is only used for development of the rawevent plugin itself.
RAWEVENT:NOROOT     Set to non-zero to skip generating and filling of ROOT histograms
RAWEVENT:DUMPHITS   Set to non-zero to dump info on every conversion to the screen. This is only used for development of the rawevent plugin itself.
RAWEVENT:DUMPMAP    Dump the translation table map to a file (for debugging)
RAWEVENT:RUNNUMBER  Override the run number from the input file with this one, which will be written to every event in the output file
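These parameters are passed on the command line with the same -P syntax used for PLUGINS above. The sketch below combines a few of them; the basename "mcraw" and run number 9001 are arbitrary values picked for illustration, and the command is only executed if hd_ana and the input file are actually present.

```shell
# Hypothetical values chosen for illustration only.
input=hdgeant_smeared.hddm
rawevent_args="-PPLUGINS=rawevent -PRAWEVENT:FILEBASE=mcraw -PRAWEVENT:RUNNUMBER=9001 -PRAWEVENT:NOROOT=1"

# Show the command that would be run.
echo "hd_ana $rawevent_args $input"

# Run the conversion only when the program and the input file exist.
if command -v hd_ana >/dev/null 2>&1 && [ -f "$input" ]; then
    # $rawevent_args is intentionally unquoted so it splits into separate arguments.
    hd_ana $rawevent_args "$input"
fi
```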

Reading EVIO data

HDDM files already contain "calibrated" data. What is meant by this is that the values have detector resolutions applied, and first-level calibrations such as gain matching and timing offsets are assumed to have already been applied. Units are basically physical units (GeV and ns) instead of ADC counts or TDC counts. The raw data, on the other hand, will need to have these first-level calibrations applied. This is handled in JANA by creating two layers of lower-level objects from which the objects corresponding to the HDDM data are made. This allows the reconstruction to work with both HDDM files and raw data files, since it starts with the same set of calibrated objects. The two layers of lower-level objects correspond to the two main tasks done while reading in EVIO data:

  1. Parse the EVIO data and create JANA objects
  2. Apply the translation table to create detector specific (uncalibrated) hit objects.

Reading of the EVIO formatted data is done using the DAQ plugin while application of the Translation Table in order to produce the DigiHit level objects is done using the TTab plugin. The DAQ plugin requires EVIO 4.3 or greater to be built.

The parsing task creates objects such as Df250PulseIntegral and DF1TDCHit. The translation table task will take these as inputs and produce objects such as DBCALDigiHit and DBCALTDCDigiHit. Next, the BCAL library in sim-recon will take these objects as inputs and create the calibrated hit objects such as DBCALHit and DBCALTDCHit. These are the objects the formal reconstruction starts with.

Note that the DBCALHit_factory in sim-recon is used to make DBCALHit objects only for EVIO data. Data in HDDM format has the DBCALHit and similar objects made in the event source itself so the factory algorithm is never called.

One may choose at this point to generate only the lowest-level digitization objects (e.g. Df250PulseIntegral) or both the lowest-level and the next-to-lowest-level detector-specific objects (e.g. DBCALDigiHit). For example, to dump a list of the objects in each event in the file "rawevent_000002.evio", do the following:

> janadump -PPLUGINS=DAQ rawevent_000002.evio

The output will look something like this:

JANA >>Initializing plugin "/w/halld-scifs1a/home/davidl/builds/sim-recon/sim-recon/Linux_CentOS6-x86_64-clang3.2/plugins/DAQ.so" ...
JANA >>Opening source "rawevent_000002.evio" of type: EVIO  - Reads EVIO formatted data from file or ET system
JANA >>Launching threads .


================================================================
Event: 1
JANA >>
JANA >>Registered factories: (12 total)
JANA >>
JANA >>Name:             nrows:  tag:
JANA >>---------------- ------- --------------
JANA >>Df250PulseIntegral 183                                                         
JANA >>Df250TriggerTime   328                                                         
JANA >>Df250PulseTime     183                                                         
JANA >>Df125PulseIntegral 263                                                         
JANA >>Df125TriggerTime   194                                                         
JANA >>Df125PulseTime     263                                                         
JANA >>DF1TDCHit           60                                                         
JANA >>DF1TDCTriggerTime   98                                                         
JANA >>

< Hit return for the next event (P=prev. Q=quit) >

You'll notice that the program janadump was used here instead of hd_dump. Either will work in this case, but hd_dump takes significantly longer to start up due to lengthy initialization tasks, including reading in the magnetic field map. Using janadump is possible here only because the lowest-level objects are defined in the DAQ plugin itself. If you want to do anything with the next-to-lowest-level objects (e.g. DBCALDigiHit), then hd_dump must be used, since those objects are defined as part of DANA (i.e. they are Hall-D specific).
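The rule of thumb above can be written down as a tiny helper for scripts: if the plugin list includes TTab (and therefore produces DANA-level objects), use hd_dump; otherwise the faster-starting janadump is sufficient. The function name pick_dump_tool is made up for this sketch, and it only encodes the rule as stated here, nothing more.

```shell
# Hypothetical helper: choose the dump program from a comma-separated plugin list.
# $1: plugin list as it would be given to -PPLUGINS=, e.g. "DAQ" or "DAQ,TTab"
pick_dump_tool() {
    case ",$1," in
        *,TTab,*) echo hd_dump ;;   # DigiHit-level objects are defined in DANA
        *)        echo janadump ;;  # lowest-level objects come from the DAQ plugin alone
    esac
}

pick_dump_tool DAQ         # prints: janadump
pick_dump_tool DAQ,TTab    # prints: hd_dump
```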

To get both the lowest-level and next-to-lowest-level objects, use both the DAQ and TTab plugins. Again, make sure your JANA_CALIB_URL environment variable is set.

When running over these MC-derived EVIO files, the environment variable JANA_CALIB_CONTEXT should also be set so that a set of constants corresponding to the analysis of MC data is loaded. The default choice for this is JANA_CALIB_CONTEXT="variation=mc".
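As a sketch, the two variables might be set like this in a Bourne-style shell. The URL is the example value given earlier on this page; tcsh users would use "setenv NAME value" instead of export.

```shell
# Calibration database location (example URL from earlier on this page).
export JANA_CALIB_URL="mysql://ccdb_user@hallddb.jlab.org/ccdb"

# Select the constants intended for MC data.
export JANA_CALIB_CONTEXT="variation=mc"

# Show what the JANA programs will see.
echo "JANA_CALIB_URL=$JANA_CALIB_URL"
echo "JANA_CALIB_CONTEXT=$JANA_CALIB_CONTEXT"
```

With both variables exported, run: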

> hd_dump -PPLUGINS=DAQ,TTab rawevent_000002.evio

This will provide an output similar to this:

 <----- snip off first part ----->
JANA >>
JANA >>Registered factories: (136 total)
JANA >>
JANA >>Name:             nrows:  tag:
JANA >>---------------- ------- --------------
JANA >>Df250PulseIntegral 183                                                         
JANA >>Df250TriggerTime   328                                                         
JANA >>Df250PulseTime     183                                                         
JANA >>Df125PulseIntegral 263                                                         
JANA >>Df125TriggerTime   194                                                         
JANA >>Df125PulseTime     263                                                         
JANA >>DF1TDCHit           60                                                         
JANA >>DF1TDCTriggerTime   98                                                         
JANA >>DTranslationTable    1                                                         
JANA >>DBCALDigiHit        89                                                         
JANA >>DBCALTDCDigiHit     21                                                         
JANA >>DBCALHit            89                                                         
JANA >>DBCALTDCHit         21                                                         
JANA >>DBCALGeometry        1                                                         
JANA >>DBCALCluster         1    "SINGLE"                                             
JANA >>DBCALPoint           5                                                         
JANA >>DBCALUnifiedHit     89                                                         
JANA >>DFDCCathodeDigiHit 117                                                         
JANA >>DFDCHit            117                                                         
JANA >>DFDCCathodeCluster  88                                                         
JANA >>DFCALDigiHit        52                                                         
JANA >>DFCALHit            52                                                         
JANA >>DFCALCluster        15                                                         
JANA >>DFCALShower         15                                                         
JANA >>DFCALGeometry        1                                                         
JANA >>DCCALGeometry        1                                                         
JANA >>DTOFGeometry         1                                                         
JANA >>DTrackFitter         1                                                         
JANA >>DTrackFitter         1    "ALT1"                                               
JANA >>DTrackFitter         1    "Riemann"                                            
JANA >>DTrackHitSelector    1                                                         
JANA >>DTrackHitSelector    1    "ALT1"                                               
JANA >>DTrackHitSelector    1    "ALT2"                                               
JANA >>DTrackHitSelector    1    "THROWN"                                             
JANA >>DTrackFitter         1    "KalmanSIMD"                                         
JANA >>DTrackFitter         1    "KalmanSIMD_ALT1"                                    
JANA >>DEventWriterREST     1                                                         
JANA >>DParticleID          1                                                         
JANA >>DParticleID          1    "PID1"                                               
JANA >>DNeutralParticle    15                                                         
JANA >>DNeutralParticleHypo30esis                                                     
JANA >>DNeutralShower      15                                                         
JANA >>DVertex              1                                                         
JANA >>DEventRFBunch        1                                                         
JANA >>DDetectorMatches     1                                                         
JANA >>DReaction            1    "Thrown"                                             
JANA >>DAnalysisUtilities   1                                                         
JANA >>DEventRFBunch        1    "Combo"                                              
JANA >>DDetectorMatches     1    "Combo"                                              
JANA >>DEventWriterROOT     1                                                         
JANA >>DMCTrigger           1                                                         
JANA >>DL3Trigger           1                                                         
JANA >> 

< Hit return for the next event (P=prev. Q=quit) >

Options

The following configuration parameters control the reading of EVIO data:

Parameter Name                         Description
EVIO:AUTODETECT_MODULE_TYPES           Try to guess the module type for tag,num values for which there is no module map entry.
EVIO:DUMP_MODULE_MAP                   Write the module map used to a file when the source is destroyed. N.B. If more than one input file is used, the map file will be overwritten!
EVIO:MAKE_DOM_TREE                     Set this to 0 to disable generation of the EVIO DOM tree and parsing of the event (for benchmarking/debugging).
EVIO:PARSE_EVIO_EVENTS                 Set this to 0 to disable parsing of the event but still make the DOM tree, so long as MAKE_DOM_TREE isn't set to 0 (for benchmarking/debugging).
EVIO:BUFFER_SIZE                       Size in bytes to allocate for holding a single EVIO event.
EVIO:ET_STATION_NEVENTS                Number of events to use if we have to create the ET station. Ignored if the station already exists.
EVIO:ET_STATION_CREATE_BLOCKING        Set this to 0 to create the station in non-blocking mode (default is to create it in blocking mode). Ignored if the station already exists.
EVIO:VERBOSE                           Set the verbosity level for processing and debugging statements while parsing. 0=no debugging messages. 10=all messages.
EVIO:EMULATE_PULSE_INTEGRAL_MODE       If non-zero, and Df250WindowRawData objects exist in the event AND no Df250PulseIntegral objects exist, then use the waveform data to generate Df250PulseIntegral objects. Default is for this feature to be on. Set this to zero to disable it.
EVIO:EMULATE_SPARSIFICATION_THRESHOLD  If EVIO:EMULATE_PULSE_INTEGRAL_MODE is on, then this is used to apply a cut on the non-pedestal-subtracted integral to determine whether a Df250PulseIntegral is produced.
ET:TIMEOUT                             Set the timeout in seconds for each attempt at reading from the ET system (repeated attempts will still be made indefinitely until the program quits or the quit_on_et_timeout flag is set).
EVIO:MODTYPE_MAP_FILENAME              Optional module type conversion map for use with files generated with non-standard module types.

Replaying MC data from ROCs using CODA

It is possible to load EVIO formatted event fragments onto physical ROCs and then read them out using the CODA DAQ system. The readout is done using a special readout list called mcROL. The EVIO file itself must be split into pieces, each containing event fragments for a specific ROC. The mcROL will copy the event fragment for its rocid onto a local RAM disk and then periodically trigger itself to send events to the Event Builder. One can run a program called softROCcontroller that will periodically synchronize the mcROLs via cMsg messages using the server on gluondb.jlab.org.

The original EVIO file is split using a program called evioSplitROC. The source for the various pieces needed can currently be found here:

Using this system is a highly specialized exercise. If you need or want to do this, please contact David Lawrence at x5567 or davidl@jlab.org.