Difference between revisions of "FA125 firmware check"

From GlueXWiki

Revision as of 12:35, 31 January 2016


Used the raw samples to emulate the fa125's calculated values and compared them with the fa125 output.
Looked at the first evio files from runs 3923 (2 weeks old) and 4062 (this week).


Number of discrepancies between firmware output and emulation output ('complete events' have Pulse and Raw data present)
Run total events complete events time q amplitude pedestal integral overflow count
3923 16582555 1160 1942 3547 15609 1
3923 minus 2 bad fadcs 15914850 378 1877 288 12798 1
4062 file 000 10249118 10247801 0 0 98 (70 early hits) 0 3 0
4062 file 001 10283167 10282756 3 0 68 (46 early hits) 0 2 1
4062 file 002 10262816 10259658 1 0 65 (41 early hits) 0 5 1
4062 file 003 10248768 10245940 2 0 58 (40 early hits) 1 4 3
4062 file 004 10251199 10249150 0 0 73 (53 early hits) 0 8 2


Many of the problems in 3923 were due to hardware faults in roc28 slots 5&6.

11770 of the 12798 integral differences were due to faulty assignment of the overflow bit. This has been fixed.

There were a few problems in the firmware which Cody described & fixed before run 4062. The remaining issues are not critical.


Differences in data/emulation from run 4062

The first 4 are firmware logic and not necessarily mistakes (the error could be in the emulator); the remaining items are stranger.

1. Integral differences - these are from very late hits where there is a small peak at the end of the hit search window that only just clears the threshold crossing, followed by a larger peak a few samples later. The timing algorithm returns the time for the larger peak and the emulated integral is 0 because it is out of the window. [Cody to fix]

2. Amplitude differences - both firmware and emulation are starting the peak search from the threshold crossing sample but it should really start from the sample containing the leading edge time, since very occasionally the search will pick up a different peak (usually a small one before a larger one). [Cody and Naomi to fix]

3. Amplitude differences - early hits - 70 of the 98 differences are where the samples at the start of the window are over threshold, decrease for one sample and then rise again, ie. hitsample==20 && (adc[20]>=adc[21]) && (adc[21]<adc[22]) where adc[20] is the first sample in the hit search window. Emulator returns maxamp = adc[20]; firmware returns amp of the following peak. [Cody to fix]

4. Amplitude differences - later hits - 28 of the 98 differences have no apparent cause and no association with roc/slot/channel. In most cases the max amp reported is larger than all the sample values [???]

5. Overflow count - the firmware counts overflows from sample (hit sample - PG) onward, but the emulator counts them over the entire data window. Naomi will change the emulator to match the firmware. [Naomi to fix]

6. Missing pulse or WRD - pulse & Window raw data are separate objects in the evio eventloop, not linked yet. One out of sync pair causes the rest of the data for that trigger to be out of step.
[from WRD without straws, all digihits have straws, David to fix w association]
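
Items 3 and 5 above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the emulator or firmware code: the function names are hypothetical, the overflow value of 4095 assumes a 12-bit full-scale sample, and counting to the end of the window in the firmware-style call is a guess, since the notes only say where the firmware starts counting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Item 3: true when the hit is at the first sample of the hit search window
// and the samples are over threshold there, dip for one sample, then rise again.
// `first` is the index of the first sample in the window (20 in the notes above).
bool isEarlyDipHit(const std::vector<int>& adc, std::size_t hitsample,
                   std::size_t first = 20) {
    if (hitsample != first || adc.size() < first + 3) return false;
    return adc[first] >= adc[first + 1] && adc[first + 1] < adc[first + 2];
}

// ASSUMPTION: a sample at 12-bit full scale counts as an overflow.
constexpr int ADC_OVERFLOW = 4095;

// Item 5: count overflow samples in the half-open range [begin, end).
int countOverflows(const std::vector<int>& adc, std::size_t begin,
                   std::size_t end) {
    assert(begin <= end && end <= adc.size());
    int n = 0;
    for (std::size_t i = begin; i < end; ++i) {
        if (adc[i] >= ADC_OVERFLOW) ++n;
    }
    return n;
}
```

With these helpers, the emulator-style count would be countOverflows(adc, 0, adc.size()), and a firmware-style count would start at hitsample - PG instead of 0. Any difference between the two comes from overflow samples that lie before hitsample - PG.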

Decoder crashing

hd_root used to segfault when the window raw data contained fewer samples than expected. This has mostly been fixed. Which of the recent run files still make it crash? Will compile a list. I think they were in Run4127. Naomi found that hd_rawdata_004062_001.evio caused a segfault once, but it ran without a segfault many other times.

njarvis  11014 27.0  0.4 664324 105956 pts/8   Sl+  14:30  24:36 hd_root hd_rawdata_004062_001.evio -PPLUGINS=CDC_em -o d4062_001.root

[njarvis@maria: /raid12/gluex/rawdata2/Run003923 ]> screen -r

===========================================================
#14 0x0000003e4700953f in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#15 0x0000003e47009f0a in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#16 0x0000003e4700e0c0 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#17 0x0000003e470148f5 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#18 0x0000000000cd9d65 in jana::JApplication::Run (this=0x7ffc9e597ec0, proc=<value optimized out>, Nthreads=<value optimized out>) at src/JANA/JApplication.cc:1688
#19 0x000000000057dda7 in main (narg=3, argv=0x7ffc9e5984f8) at programs/Analysis/hd_root/hd_root.cc:45


fa125 object mismatches

ie when the nth digihit contains a CDCPulseData object for a hit channel that differs from the channel of the nth windowrawdata object. Usually they match. Every now and then a CDCPulseData is missing from the start of an event, and the subsequent objects are then out of step for a while (the mismatches continue into following events and eventually stop). Sometimes there is a lone WRD object first and a lone CDCPulseData at the end, but sometimes the lone objects are not found and only the mismatched pairs are found.
Run 4062 Naomi sees mismatches using a hd_root plugin but Beni does not, using his independent analyzer.
Run 4127 all the CDC pulse hits are paired with FDC window data.
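
The nth-to-nth comparison described above can be sketched as follows. This is a simplified illustration, not the plugin code: ChannelId, MismatchReport and comparePairs are hypothetical names, and the real decoder objects carry more fields than the roc/slot/channel address used here.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical minimal channel address for a pulse or WRD object.
struct ChannelId {
    int rocid;
    int slot;
    int channel;
    bool operator==(const ChannelId& o) const {
        return std::tie(rocid, slot, channel) ==
               std::tie(o.rocid, o.slot, o.channel);
    }
};

struct MismatchReport {
    std::vector<std::size_t> mismatched;  // indices n where the nth pair differs
    std::size_t unpaired = 0;             // objects left over when list lengths differ
};

// Compare the nth pulse object with the nth window raw data object.
MismatchReport comparePairs(const std::vector<ChannelId>& pulses,
                            const std::vector<ChannelId>& wrds) {
    MismatchReport r;
    const std::size_t n = std::min(pulses.size(), wrds.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (!(pulses[i] == wrds[i])) r.mismatched.push_back(i);
    }
    r.unpaired = std::max(pulses.size(), wrds.size()) - n;
    return r;
}
```

A single missing pulse at the start of the list shifts every later pair by one, so this index-based check reports all of them as mismatched, which is consistent with the out-of-step behaviour noted above.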


Not enough samples

eg eventnum 6520553 and 6520556 in Run003923. Naomi finds insufficient samples, Beni does not.

missing pulse data

4731 Beni found 2 pulse data words missing from 57 files


recent cosmics data


004003 #files= 14 6 7
004039 #files= 14 6 7
004044 #files= 2  6 7
004062 #files= 21 6 7
004101 #files= 22 6 7
004296 #files= 5  8 8
004593 #files= 2  6 7
004594 #files= 2  6 7
004595 #files= 2  3 4
004597 #files= 57 3 4
004701 #files= 4  6 8
004706 #files= 2  6 8
004710 #files= 5  6 8
004711 #files= 8  6 8
004715 #files= 2  6 8
004717 #files= 4  6 8
004718 #files= 5  6 8
004731 #files= 59 6 8
004745 #files= 2  6 8
004746 #files= 8  6 8
004747 #files= 45 6 8


run 4044 hd_root ok
1472985 events, no missing pulses, only a few differences from emulated 
Total diffs: 91  time: 0  q: 0  amp: 79  ped: 0  integ: 12  oflow: 0
run 4062 

CDC mode 6, no FDC, hd_root ok

TS_TRIG_HOLD  30  1
BLOCKLEVEL   1
BUFFERLEVEL  1

https://halldweb.jlab.org/rcdb/files/info/2938

has NO unpaired pulse or WRD 
  24425108 events 
 
VERY FEW DIFFERENCES VS EMULATION from all 24425108 events (approx 10x as many hits?)
root [1] CDC->GetEntries()
(const Long64_t)1654
root [2] CDC->GetEntries("d_time")
(Long64_t)30
root [3] CDC->GetEntries("d_q")
(Long64_t)0
root [4] CDC->GetEntries("d_amp")
(Long64_t)1510
root [5] CDC->GetEntries("d_pedestal")
(Long64_t)11
root [6] CDC->GetEntries("d_integral")
(Long64_t)105
root [7] CDC->GetEntries("d_overflows")
(Long64_t)0
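
In ROOT, TTree::GetEntries with a selection string counts the entries for which the expression is non-zero, so each query above counts the entries with a non-zero difference in that field. A plain C++ sketch of the same tally follows. It assumes that the tree holds one entry per hit with at least one difference and that the difference fields are integers; DiffEntry, DiffTally and tally are hypothetical names.

```cpp
#include <vector>

// Hypothetical record with one field per difference branch queried above.
struct DiffEntry {
    int d_time, d_q, d_amp, d_pedestal, d_integral, d_overflows;
};

struct DiffTally {
    long entries = 0, time = 0, q = 0, amp = 0;
    long pedestal = 0, integral = 0, overflows = 0;
};

// Count, per field, the entries whose difference is non-zero.
DiffTally tally(const std::vector<DiffEntry>& v) {
    DiffTally t;
    for (const DiffEntry& e : v) {
        ++t.entries;
        if (e.d_time != 0) ++t.time;
        if (e.d_q != 0) ++t.q;
        if (e.d_amp != 0) ++t.amp;
        if (e.d_pedestal != 0) ++t.pedestal;
        if (e.d_integral != 0) ++t.integral;
        if (e.d_overflows != 0) ++t.overflows;
    }
    return t;
}
```

One entry can differ in more than one field, so the per-field counts need not add up to the number of entries. In the session above they sum to 1656 against 1654 entries.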



004101
JANA ERROR>>Unknown module type (15) iptr=0x0x7fca6c619f44
JANA ERROR>>...skipping to 0x0x7fca6c61aa20  (discarding 695 words)
JANA ERROR>>Unknown module type (15) iptr=0x0x7fca6c5ed444
JANA ERROR>>...skipping to 0x0x7fca6c5ede04  (discarding 624 words)
004711 hd_root crash
JANA ERROR>>
JANA ERROR>>Stack trace:
JANA ERROR>>
JANA ERROR>>   jana::JException::getStackTrace(bool, unsigned long)
JANA ERROR>>   jana::JException::JException(std::string const&)
JANA ERROR>>   JEventSource_EVIO::ParseF1TDCBank(int, unsigned int const*&, unsigned int const*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>>   JEventSource_EVIO::ParseJLabModuleData(int, unsigned int const*&, unsigned int const*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>>   JEventSource_EVIO::ParseEVIOEvent(evio::evioDOMTree*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>>   JEventSource_EVIO::ParseEvents(JEventSource_EVIO::ObjList*)
JANA ERROR>>   JEventSource_EVIO::GetObjects(jana::JEvent&, jana::JFactory_base*)
JANA ERROR>>   jerror_t jana::JEvent::GetObjects<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, jana::JFactory_base*)
JANA ERROR>>   jana::JFactory<DCDCDigiHit>* jana::JEventLoop::GetFromFactory<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, char const*, jana::JEventLoop::data_source_t&, bool)
JANA ERROR>>   jana::JFactory<DCDCDigiHit>* jana::JEventLoop::Get<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, char const*, bool)
JANA ERROR>>   JEventProcessor_CDC_em::evnt(jana::JEventLoop*, unsigned long)
JANA ERROR>>   jana::JEventLoop::OneEvent()
JANA ERROR>>   jana::JEventLoop::Loop()
JANA ERROR>>   LaunchThread(void*)
JANA ERROR>>   LaunchThread(void*)
JANA ERROR>>   LaunchThread(void*)
JANA ERROR>>
JANA ERROR>>


004718 #files= 5  modes 6 8  hd_root ok  3774539 events 
BLOCKLEVEL   1
BUFFERLEVEL  1

contains FDC data.  lots of unpaired FDC WRD

One unpaired CDC WRD
unpaired WRD eventnum 683052 roc 26 slot 4 chan 17 trig 16804908  ***    
This is from straw 2523 ring 23 straw 177

Lots of small differences in time and pedestal so maybe thresholds were not the usual ones.
Emulator:    
    const Int_t HIT_THRES = 115;   //110 for run 3923, 115 for run 4623
    const Int_t HIGH_THRESHOLD = 100;
    const Int_t LOW_THRESHOLD = 25;
RCDB trigger file:
FADC125_TH          100
FADC125_TL          25

root [14] CDC->GetEntries()
(const Long64_t)3161822
root [15] CDC->GetEntries("d_time")
(Long64_t)2732838
root [16] CDC->GetEntries("d_q")
(Long64_t)146990
root [17] CDC->GetEntries("d_amp")
(Long64_t)147049
root [18] CDC->GetEntries("d_pedestal")
(Long64_t)2554271
root [19] CDC->GetEntries("d_integral")
(Long64_t)1073256
root [20] CDC->GetEntries("d_overflows")
(Long64_t)0