FA125 firmware check

From GlueXWiki
Revision as of 11:34, 1 February 2016 by Njarvis (Talk | contribs)


Used the raw samples to emulate the fa125's calculated values and compared them with the fa125 output.
Looked at the first evio files from runs 3923 (2 weeks old) and 4062 (this week).


Number of discrepancies between firmware output and emulation output ('complete events' have Pulse and Raw data present)
Run total events complete events time q amplitude pedestal integral overflow count
3923 16582555 1160 1942 3547 15609 1
3923 minus 2 bad fadcs 15914850 378 1877 288 12798 1
4062 file 000 10249118 10247801 0 0 98 (70 early hits) 0 3 0
4062 file 001 10283167 10282756 3 0 68 (46 early hits) 0 2 1
4062 file 002 10262816 10259658 1 0 65 (41 early hits) 0 5 1
4062 file 003 10248768 10245940 2 0 58 (40 early hits) 1 4 3
4062 file 004 10251199 10249150 0 0 73 (53 early hits) 0 8 2


Many of the problems in 3923 were due to hardware faults in roc28 slots 5&6.

11770 of the 12798 differences in integral were due to faulty assignment of the overflow bit. This has been fixed.

There were a few problems in the firmware which Cody described & fixed before run 4062. The remaining issues are not critical.


Differences in data/emulation from run 4062

The first 4 concern firmware logic and are not necessarily firmware mistakes (the mistake could be in the emulator); the remaining items are stranger.

1. Integral differences: these come from very late hits where a small peak at the end of the hit search window only just crosses the threshold and is followed by a larger peak a few samples later. The timing algorithm returns the time of the larger peak, and the emulated integral is 0 because that peak is outside the window. [Cody to fix]

2. Amplitude differences: both firmware and emulation start the peak search from the threshold-crossing sample, but it should really start from the sample containing the leading edge time, since very occasionally the search picks up a different peak (usually a small one before a larger one). [Cody and Naomi to fix]

3. Amplitude differences, early hits: 70 of the 98 differences occur where the samples at the start of the window are over threshold, decrease for one sample and then rise again, i.e. hitsample==20 && (adc[20]>=adc[21]) && (adc[21]<adc[22]), where adc[20] is the first sample in the hit search window. The emulator returns maxamp = adc[20]; the firmware returns the amplitude of the following peak. [Cody to fix]

4. Amplitude differences, later hits: 28 of the 98 differences have no apparent cause and no association with roc/slot/channel; in most cases the max amp reported is larger than all the sample values. [???]

5. Overflow count: the firmware counts overflows from sample (hit sample - PG) onward, but the emulator counts them over the entire data window. Naomi will change the emulator to match the firmware. [Naomi to fix]

6. Missing pulse or WRD: pulse and window raw data (WRD) are separate objects in the evio event loop and are not linked yet. One out-of-sync pair causes the rest of the data for that trigger to be out of step.
[from WRD without straws; all digihits have straws; David to fix with association]
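
The early-hit condition in item 3 and the two overflow-counting conventions in item 5 can be sketched in C++ as follows. This is only an illustration, not the emulator or firmware code: the function names are hypothetical, and the overflow value is passed in as a parameter because the notes do not state it. Only the early-hit condition itself and the (hit sample - PG) starting point are taken from the list above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true for the "early hit" pattern of item 3, where the
// hit is found at the first sample of the hit search window, the next sample
// does not exceed it, and the signal then rises again.
// `first` is the index of the first sample in the hit search window
// (20 in the notes above).
bool isEarlyHitPattern(const std::vector<int>& adc, std::size_t hitsample,
                       std::size_t first) {
    if (first + 2 >= adc.size()) return false;  // need adc[first..first+2]
    return hitsample == first && adc[first] >= adc[first + 1] &&
           adc[first + 1] < adc[first + 2];
}

// Hypothetical helper: count samples at or above `overflowValue`,
// starting at index `start` and running to the end of the window.
int countOverflows(const std::vector<int>& adc, std::size_t start,
                   int overflowValue) {
    int n = 0;
    for (std::size_t i = start; i < adc.size(); ++i) {
        if (adc[i] >= overflowValue) ++n;
    }
    return n;
}

// Firmware-style count (item 5): start at (hit sample - PG), clamped to 0.
int countOverflowsFromHit(const std::vector<int>& adc, std::size_t hitsample,
                          std::size_t pg, int overflowValue) {
    const std::size_t start = hitsample >= pg ? hitsample - pg : 0;
    return countOverflows(adc, start, overflowValue);
}
```

Calling countOverflows(adc, 0, v) gives the whole-window count that the emulator used; the two counts differ whenever an overflow sample lies before (hit sample - PG).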

Decoder crashing

hd_root used to segfault when the window raw data contained fewer samples than expected. This has mostly been fixed. Which of the recent run files still make it crash? Will compile a list; I think they were in Run 4127. Naomi found that hd_rawdata_004062_001.evio caused a segfault once, but it did not segfault on many other attempts.

njarvis  11014 27.0  0.4 664324 105956 pts/8   Sl+  14:30  24:36 hd_root hd_rawdata_004062_001.evio -PPLUGINS=CDC_em -o d4062_001.root

[njarvis@maria: /raid12/gluex/rawdata2/Run003923 ]> screen -r

===========================================================
#14 0x0000003e4700953f in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#15 0x0000003e47009f0a in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#16 0x0000003e4700e0c0 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#17 0x0000003e470148f5 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#18 0x0000000000cd9d65 in jana::JApplication::Run (this=0x7ffc9e597ec0, proc=<value optimized out>, Nthreads=<value optimized out>) at src/JANA/JApplication.cc:1688
#19 0x000000000057dda7 in main (narg=3, argv=0x7ffc9e5984f8) at programs/Analysis/hd_root/hd_root.cc:45


fa125 object mismatches

i.e. when the nth digihit contains a CDCPulseData object for a hit channel that differs from the channel of the nth windowrawdata object. Usually they match. Every now and then a CDCPulseData is missing from the start of an event, and the subsequent objects are then out of step for a while (the mismatches continue into following events and eventually stop). Sometimes there is a lone WRD object first and a lone CDCPulseData at the end, but sometimes the lone objects are not found and only the mismatched pairs are found.
Run 4062: Naomi sees mismatches using an hd_root plugin, but Beni does not, using his independent analyzer.
Run 4127: all the CDC pulse hits are paired with FDC window data.
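
A minimal C++ sketch of why one missing pulse object puts every later index-based pair out of step, and how pairing by channel address avoids this. The ChannelId struct and both function names are hypothetical stand-ins, not the hd_root or JANA classes; the real objects carry more fields and the real association code may work differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the channel address carried by both pulse
// objects and window raw data (WRD) objects.
struct ChannelId {
    int roc, slot, channel;
    bool operator==(const ChannelId& o) const {
        return std::tie(roc, slot, channel) == std::tie(o.roc, o.slot, o.channel);
    }
    bool operator<(const ChannelId& o) const {
        return std::tie(roc, slot, channel) < std::tie(o.roc, o.slot, o.channel);
    }
};

// Pair the nth pulse with the nth WRD and count pairs whose channels differ.
std::size_t countIndexMismatches(const std::vector<ChannelId>& pulses,
                                 const std::vector<ChannelId>& wrds) {
    const std::size_t n = std::min(pulses.size(), wrds.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (!(pulses[i] == wrds[i])) ++mismatches;
    }
    return mismatches;
}

// Pair by channel address instead; returns the number of WRD objects that
// have no pulse object on the same channel.
std::size_t countUnpairedWrd(const std::vector<ChannelId>& pulses,
                             const std::vector<ChannelId>& wrds) {
    std::map<ChannelId, int> available;
    for (const ChannelId& p : pulses) ++available[p];
    std::size_t unpaired = 0;
    for (const ChannelId& w : wrds) {
        auto it = available.find(w);
        if (it != available.end() && it->second > 0) {
            --it->second;  // consume one pulse on this channel
        } else {
            ++unpaired;
        }
    }
    return unpaired;
}
```

With four WRD objects and the first pulse missing, the index-based comparison reports three mismatched pairs, while the address-based pairing reports a single unpaired WRD.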


Not enough samples

e.g. eventnum 6520553 and 6520556 in Run003923. Naomi finds insufficient samples; Beni does not.

missing pulse data

Run 4731: Beni found 2 pulse data words missing from 57 files.


recent cosmics data

(4296 does not have CDC data)

Run 4594 FDC params are:

FADC125_MODE         7
FADC125_W_OFFSET     430
FADC125_W_WIDTH      80
FADC125_IE           16
FADC125_NPEAK        1

FADC125_PG          4
FADC125_P1          4
FADC125_P2          4

FADC125_IBIT        4
FADC125_ABIT        0
FADC125_PBIT        3

004003 #files= 14 modes 6 (7)  Not sure that config file in RCDB is correct - FADC125_W_WIDTH=180 FADC125_IE=80 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=4 from Dec 2015  CDC readout
004039 #files= 14 modes 6 (7)  Not sure that config file in RCDB is correct - FADC125_W_WIDTH=180 FADC125_IE=80 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1                CDC readout 

004044 #files= 2  modes 6 (7)  DAQ params look ok.  FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1 from 8th Dec 2015                      CDC readout 
004062 #files= 21 modes 6  7   DAQ params look ok.  FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1 from 8th Dec 2015
004101 #files= 22 modes 6 (7)  DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1                                        CDC readout

004593 #files= 2  modes 6  7   DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   7 Jan 2016                           CDC+FDC readout
004594 #files= 2  modes 6  7   DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1                                        CDC+FDC readout

004595 #files= 2  modes 3  4  **Short modes**      FA125 params as 4594 except mode #  TS_TRIG_HOLD=30 1 BLOCKLEVEL=20 BUFFERLEVEL=8                                        CDC+FDC readout
004597 #files= 57 modes 3  4  **Short modes**      FA125 params as 4594 except mode #  TS_TRIG_HOLD=30 1 BLOCKLEVEL=20 BUFFERLEVEL=8                                        CDC+FDC readout

004701 #files= 4  modes 6 (8)  DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1                                        CDC readout
004706 #files= 2  modes 6 (8)  DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1                                        CDC readout

004710 #files= 5  modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1  
004711 #files= 8  modes 6 (8)  DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1                                        CDC readout  

004715 #files= 2  modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit changed to 2                CDC+FDC readout
004717 #files= 4  modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit=2                           CDC+FDC readout
004718 #files= 5  modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit=2                           CDC+FDC readout

004731 #files= 59 modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit=2 
004745 #files= 2  modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit=2 
004746 #files= 8  modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit=2 
004747 #files= 45 modes 6 8    DAQ params look ok   FADC125_W_WIDTH=200  FADC125_IE=200 TS_TRIG_HOLD=30 1 BLOCKLEVEL=1 BUFFERLEVEL=1   FDC PBit=2 

4039 hd_root segfault
#4  0x00007fde351937ef in TUnixSystem::DispatchSignals(ESignals) () from /home/gluex/root/v5-34-14_rhel6//lib/libCore.so
#5  <signal handler called>
#6  0x0000003c92b90048 in main_arena () from /lib64/libc.so.6
#7  0x000000000057d14b in MyProcessor::~MyProcessor (this=0x218e6d0, __in_chrg=<value optimized out>) at programs/Analysis/hd_root/MyProcessor.cc:51
#8  0x000000000057d419 in MyProcessor::~MyProcessor (this=0x218e6d0, __in_chrg=<value optimized out>) at programs/Analysis/hd_root/MyProcessor.cc:57
#9  0x0000000000581315 in main (narg=16, argv=0x7ffd8b606c18) at programs/Analysis/hd_root/hd_root.cc:47
===========================================================




Segmentation fault (core dumped)

run 4044 hd_root ok
1472985 events, no missing pulses, only a few differences from emulated 
Total diffs: 91  time: 0  q: 0  amp: 79  ped: 0  integ: 12  oflow: 0

run 4062

CDC mode 6, no FDC, hd_root ok

TS_TRIG_HOLD  30  1
BLOCKLEVEL   1
BUFFERLEVEL  1

https://halldweb.jlab.org/rcdb/files/info/2938

has NO unpaired pulse or WRD 
  24425108 events 
 
VERY FEW DIFFERENCES VS EMULATION from all 24425108 events (approx 10x as many hits?)
root [1] CDC->GetEntries()
(const Long64_t)1654
root [2] CDC->GetEntries("d_time")
(Long64_t)30
root [3] CDC->GetEntries("d_q")
(Long64_t)0
root [4] CDC->GetEntries("d_amp")
(Long64_t)1510
root [5] CDC->GetEntries("d_pedestal")
(Long64_t)11
root [6] CDC->GetEntries("d_integral")
(Long64_t)105
root [7] CDC->GetEntries("d_overflows")
(Long64_t)0



004101
JANA ERROR>>Unknown module type (15) iptr=0x0x7fca6c619f44
JANA ERROR>>...skipping to 0x0x7fca6c61aa20  (discarding 695 words)
JANA ERROR>>Unknown module type (15) iptr=0x0x7fca6c5ed444
JANA ERROR>>...skipping to 0x0x7fca6c5ede04  (discarding 624 words)

004711 hd_root crash
JANA ERROR>>
JANA ERROR>>Stack trace:
JANA ERROR>>
JANA ERROR>>   jana::JException::getStackTrace(bool, unsigned long)
JANA ERROR>>   jana::JException::JException(std::string const&)
JANA ERROR>>   JEventSource_EVIO::ParseF1TDCBank(int, unsigned int const*&, unsigned int const*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>>   JEventSource_EVIO::ParseJLabModuleData(int, unsigned int const*&, unsigned int const*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>>   JEventSource_EVIO::ParseEVIOEvent(evio::evioDOMTree*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>>   JEventSource_EVIO::ParseEvents(JEventSource_EVIO::ObjList*)
JANA ERROR>>   JEventSource_EVIO::GetObjects(jana::JEvent&, jana::JFactory_base*)
JANA ERROR>>   jerror_t jana::JEvent::GetObjects<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, jana::JFactory_base*)
JANA ERROR>>   jana::JFactory<DCDCDigiHit>* jana::JEventLoop::GetFromFactory<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, char const*, jana::JEventLoop::data_source_t&, bool)
JANA ERROR>>   jana::JFactory<DCDCDigiHit>* jana::JEventLoop::Get<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, char const*, bool)
JANA ERROR>>   JEventProcessor_CDC_em::evnt(jana::JEventLoop*, unsigned long)
JANA ERROR>>   jana::JEventLoop::OneEvent()
JANA ERROR>>   jana::JEventLoop::Loop()
JANA ERROR>>   LaunchThread(void*)
JANA ERROR>>   LaunchThread(void*)
JANA ERROR>>   LaunchThread(void*)
JANA ERROR>>
JANA ERROR>>


004718 #files= 5  modes 6 8  hd_root ok  3774539 events 
BLOCKLEVEL   1
BUFFERLEVEL  1

contains FDC data.  lots of unpaired FDC WRD

One unpaired CDC WRD
unpaired WRD eventnum 683052 roc 26 slot 4 chan 17 trig 16804908  ***    
This is from straw 2523 ring 23 straw 177

Lots of small differences in time and pedestal so maybe thresholds were not the usual ones.
Emulator:    
    const Int_t HIT_THRES = 115;   //110 for run 3923, 115 for run 4623
    const Int_t HIGH_THRESHOLD = 100;
    const Int_t LOW_THRESHOLD = 25;
RCDB trigger file:
FADC125_TH          100
FADC125_TL          25
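
One way to check the emulator constants against the trigger file is to parse the "NAME value" lines and compare them. The sketch below is an illustration only: parseConfig and matches are hypothetical names, and the pairing of HIGH_THRESHOLD with FADC125_TH and LOW_THRESHOLD with FADC125_TL is assumed from the matching values in the excerpts above. The excerpt does not show which parameter corresponds to HIT_THRES.

```cpp
#include <map>
#include <sstream>
#include <string>

// Parse whitespace-separated "NAME value" lines, as in the trigger file
// excerpt, into a map. Lines that do not start with a name followed by an
// integer are skipped.
std::map<std::string, int> parseConfig(const std::string& text) {
    std::map<std::string, int> params;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name;
        int value = 0;
        if (fields >> name >> value) params[name] = value;
    }
    return params;
}

// True if the config holds `name` with exactly the value the emulator uses.
bool matches(const std::map<std::string, int>& params, const std::string& name,
             int emulatorValue) {
    auto it = params.find(name);
    return it != params.end() && it->second == emulatorValue;
}
```

For the values quoted above, matches(params, "FADC125_TH", 100) and matches(params, "FADC125_TL", 25) both hold, so a threshold mismatch between these two settings would not explain the differences.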

root [14] CDC->GetEntries()
(const Long64_t)3161822
root [15] CDC->GetEntries("d_time")
(Long64_t)2732838
root [16] CDC->GetEntries("d_q")
(Long64_t)146990
root [17] CDC->GetEntries("d_amp")
(Long64_t)147049
root [18] CDC->GetEntries("d_pedestal")
(Long64_t)2554271
root [19] CDC->GetEntries("d_integral")
(Long64_t)1073256
root [20] CDC->GetEntries("d_overflows")
(Long64_t)0



4747
JANA ERROR>>Unknown module type (15) iptr=0x0x7f89ac529f20
JANA ERROR>>...skipping to 0x0x7f89ac52ae1c  (discarding 959 words)
JANA ERROR>>Unknown module type (15) iptr=0x0x7f89ac584430
JANA ERROR>>...skipping to 0x0x7f89ac584df0  (discarding 624 words)

Discrepancies between CDC firmware output and emulation output, evio ok = y means no crashes or other errors

Run   evio ok  hits       diffs                         time     q       pedestal  amplitude     integral       overflow count  readout
4101  n        237994979  560 (+1077) total 0.0007%     61       0       13        473 (+989)    13 (+88)       1               CDC
4701  y        53322126   3782717 (+34) total 7.1%      3271649  185047  3062873   184311 (+20)  1301084 (+14)  0               CDC
4715  n        13973245   967778 (+10) total 6.9%       835994   44528   781132    44651 (+6)    326682 (+5)    0               CDC&FDC
4745  y        16760886   1079155 (+9) total 6.4%       921562   41083   867186    42364 (+4)    341519 (+5)    0               CDC&FDC

EVIO problems:
4101: 6 instances of Unknown module type (15) and 3 instances of insufficient samples, in evio files 001 (86 samples), 012 (98 samples) and 020 (86 samples).

4715: 1 instance of insufficient samples, in evio file 000 (and lots of unpaired FDC WRD, which might be from disconnected channels).

Insufficient samples error: The number of samples passed into the fa125_algos routine (86) is less than the minimum required by the parameters in use (171). Parameter WE (150) should be decreased to 65 or less.
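
The numbers in this message imply that the minimum number of samples is WE plus a fixed offset of 21 (171 - 150, and likewise 86 - 65). The sketch below encodes that relation in C++. The offset is inferred from the message alone and is not taken from the fa125_algos source, so treat it and the function names as assumptions.

```cpp
// Offset between parameter WE and the minimum number of samples, inferred
// from the error message above (171 - 150). This is an assumption, not a
// value read from the fa125_algos code.
constexpr int kSamplesBeyondWE = 21;

// Minimum number of samples needed for a given WE.
int minSamplesRequired(int we) { return we + kSamplesBeyondWE; }

// Largest WE that still fits in a window of `nsamples` samples.
int maxAllowedWE(int nsamples) { return nsamples - kSamplesBeyondWE; }

// True if the window is long enough for the given WE.
bool hasEnoughSamples(int nsamples, int we) {
    return nsamples >= minSamplesRequired(we);
}
```

Under the same assumption, the 98-sample events in run 4101 file 012 would need WE of 77 or less.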