FA125 firmware check
Revision as of 12:35, 31 January 2016
Used the raw samples to emulate the fa125's calculated values and compared them with the fa125 output.
Looked at the first evio files from runs 3923 (2 weeks old) and 4062 (this week).
Run | total events | complete events | time | q | amplitude | pedestal | integral | overflow count |
3923 | 16582555 | 1160 | 1942 | 3547 | 15609 | 1 | ||
3923 minus 2 bad fadcs | 15914850 | 378 | 1877 | 288 | 12798 | 1 | ||
4062 file 000 | 10249118 | 10247801 | 0 | 0 | 98 (70 early hits) | 0 | 3 | 0 |
4062 file 001 | 10283167 | 10282756 | 3 | 0 | 68 (46 early hits) | 0 | 2 | 1 |
4062 file 002 | 10262816 | 10259658 | 1 | 0 | 65 (41 early hits) | 0 | 5 | 1 |
4062 file 003 | 10248768 | 10245940 | 2 | 0 | 58 (40 early hits) | 1 | 4 | 3 |
4062 file 004 | 10251199 | 10249150 | 0 | 0 | 73 (53 early hits) | 0 | 8 | 2 |
Many of the problems in 3923 were due to hardware faults in roc28 slots 5&6.
11770 of the 12798 integral differences were due to faulty assignment of the overflow bit. This has been fixed.
There were a few problems in the firmware which Cody described & fixed before run 4062. The remaining issues are not critical.
Differences in data/emulation from run 4062
The first 4 are firmware logic, and not necessarily a mistake (the mistake could be in the emulator); the remaining ones are stranger.
1. Integral differences - these come from very late hits where a small peak at the end of the hit search window only just crosses the threshold and is followed by a larger peak a few samples later. The timing algorithm returns the time for the larger peak, and the emulated integral is 0 because it is outside the window. [Cody to fix]
2. Amplitude differences - both firmware and emulation start the peak search from the threshold crossing sample, but it should really start from the sample containing the leading edge time, since very occasionally the search picks up a different peak (usually a small one before a larger one). [Cody and Naomi to fix]
3. Amplitude differences - early hits - 70 of the 98 differences occur where the samples at the start of the window are over threshold, decrease for one sample and then rise again, i.e. hitsample==20 && (adc[20]>=adc[21]) && (adc[21]<adc[22]), where adc[20] is the first sample in the hit search window. The emulator returns maxamp = adc[20]; the firmware returns the amplitude of the following peak. [Cody to fix]
4. Amplitude differences - later hits - 28 of the 98 differences have no apparent cause and no association with roc/slot/channel; in most cases the maximum amplitude reported is larger than all the sample values. [???]
5. Overflow count - the firmware counts overflows from (hit sample - PG) onwards, but the emulator counts them over the entire data window. Naomi will change the emulator to match the firmware. [Naomi to fix]
6. Missing pulse or WRD - pulse and window raw data (WRD) are separate objects in the evio event loop and are not linked yet. One out-of-sync pair causes the rest of the data for that trigger to be out of step. [from WRD without straws, all digihits have straws, David to fix w association]
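The early-hit condition in item 3 and the two overflow-counting conventions in item 5 can be sketched in C++ as below. This is only an illustration, not the emulator's or the firmware's code: the function names are hypothetical, and the sketch assumes that an overflow is marked by the 12-bit full-scale value 4095 and that PG is a number of samples before the hit sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Item 3: the hit is found at the first sample of the hit search window,
// the next sample does not rise, and the one after that rises again.
// windowStart is the first sample of the search window (20 in the text).
bool isEarlyHitDip(const std::vector<int>& adc, std::size_t hitSample,
                   std::size_t windowStart = 20) {
    if (hitSample != windowStart) return false;
    if (adc.size() < windowStart + 3) return false;  // need three samples
    return adc[windowStart] >= adc[windowStart + 1] &&
           adc[windowStart + 1] < adc[windowStart + 2];
}

// Count samples at or above the overflow marker in [first, adc.size()).
// ASSUMPTION: overflow is flagged by the 12-bit full-scale value 4095.
int countOverflows(const std::vector<int>& adc, std::size_t first,
                   int overflowValue = 4095) {
    int n = 0;
    for (std::size_t i = first; i < adc.size(); ++i) {
        if (adc[i] >= overflowValue) ++n;
    }
    return n;
}

// Emulator convention from item 5: count over the entire data window.
int overflowsWholeWindow(const std::vector<int>& adc) {
    return countOverflows(adc, 0);
}

// Firmware convention from item 5: count from (hit sample - PG) onwards.
// ASSUMPTION: PG is a number of samples; the start is clamped at sample 0.
int overflowsFromHit(const std::vector<int>& adc, std::size_t hitSample,
                     std::size_t pg) {
    const std::size_t first = hitSample > pg ? hitSample - pg : 0;
    return countOverflows(adc, first);
}
```

With these definitions, an overflow that occurs well before the hit is counted by overflowsWholeWindow but not by overflowsFromHit, which is the kind of discrepancy item 5 describes.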
Decoder crashing
hd_root used to segfault when the window raw data contained fewer samples than expected. This has mostly been fixed. Which of the recent run files still make it crash? Will compile a list. I think they were in Run4127. Naomi saw hd_rawdata_004062_001.evio cause a segfault once, but on many other runs over the same file it did not segfault.
njarvis 11014 27.0 0.4 664324 105956 pts/8 Sl+ 14:30 24:36 hd_root hd_rawdata_004062_001.evio -PPLUGINS=CDC_em -o d4062_001.root
[njarvis@maria: /raid12/gluex/rawdata2/Run003923 ]> screen -r
===========================================================
#14 0x0000003e4700953f in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#15 0x0000003e47009f0a in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#16 0x0000003e4700e0c0 in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#17 0x0000003e470148f5 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#18 0x0000000000cd9d65 in jana::JApplication::Run (this=0x7ffc9e597ec0, proc=<value optimized out>, Nthreads=<value optimized out>) at src/JANA/JApplication.cc:1688
#19 0x000000000057dda7 in main (narg=3, argv=0x7ffc9e5984f8) at programs/Analysis/hd_root/hd_root.cc:45
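The failure mode described above, reading past the end of a window that holds fewer samples than expected, can be guarded against with a length check before any indexing. The sketch below is a generic illustration, not the actual decoder fix: WindowRawData here is a hypothetical stand-in struct, and returning -1 for a short window is an arbitrary choice made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the window raw data of one channel.
struct WindowRawData {
    std::vector<int> samples;
};

// True only if the window holds at least expectedSamples samples, so that
// samples[0 .. expectedSamples-1] can be indexed safely afterwards.
bool hasEnoughSamples(const WindowRawData& wrd, std::size_t expectedSamples) {
    return wrd.samples.size() >= expectedSamples;
}

// Sum the first nSamples samples, or return -1 if the window is too short.
// The -1 sentinel is illustrative only.
long safeWindowSum(const WindowRawData& wrd, std::size_t nSamples) {
    if (!hasEnoughSamples(wrd, nSamples)) return -1;
    long sum = 0;
    for (std::size_t i = 0; i < nSamples; ++i) sum += wrd.samples[i];
    return sum;
}
```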
fa125 object mismatches
i.e. when the nth digihit contains a CDCPulseData object for a channel different from that of the nth windowrawdata object. Usually they match. Every now and then a CDCPulseData is missing from the start of an event, and the subsequent objects are then out of step for a while (the mismatches continue into following events and eventually stop). Sometimes there is a lone WRD object first and a lone CDCPulseData at the end, but sometimes no lone objects are found, only the mismatched pairs.
Run 4062: Naomi sees mismatches using an hd_root plugin, but Beni does not, using his independent analyzer.
Run 4127: all the CDC pulse hits are paired with FDC window data.
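The nth-against-nth comparison described in this section can be sketched as follows. This is a minimal illustration under assumptions: ChannelId is a hypothetical struct holding only roc, slot and channel, and the two input lists stand in for the pulse and window raw data objects of one event in the order they are read out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal channel address shared by both object types.
struct ChannelId {
    int roc;
    int slot;
    int channel;
};

bool sameChannel(const ChannelId& a, const ChannelId& b) {
    return a.roc == b.roc && a.slot == b.slot && a.channel == b.channel;
}

// Return every index n at which pulse[n] and wrd[n] refer to different
// channels. Only the overlapping part of the two lists is compared.
std::vector<std::size_t> findMismatches(const std::vector<ChannelId>& pulse,
                                        const std::vector<ChannelId>& wrd) {
    std::vector<std::size_t> bad;
    const std::size_t n = std::min(pulse.size(), wrd.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (!sameChannel(pulse[i], wrd[i])) bad.push_back(i);
    }
    return bad;
}
```

If the first pulse object is missing, every later pair in the overlap is shifted by one and is reported as a mismatch, which reproduces the "out of step for a while" pattern.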
Not enough samples
e.g. eventnum 6520553 and 6520556 in Run003923. Naomi finds insufficient samples; Beni does not.
missing pulse data
Run 4731: Beni found 2 pulse data words missing from 57 files.
recent cosmics data
004003 #files= 14 6 7
004039 #files= 14 6 7
004044 #files= 2 6 7
004062 #files= 21 6 7
004101 #files= 22 6 7
004296 #files= 5 8 8
004593 #files= 2 6 7
004594 #files= 2 6 7
004595 #files= 2 3 4
004597 #files= 57 3 4
004701 #files= 4 6 8
004706 #files= 2 6 8
004710 #files= 5 6 8
004711 #files= 8 6 8
004715 #files= 2 6 8
004717 #files= 4 6 8
004718 #files= 5 6 8
004731 #files= 59 6 8
004745 #files= 2 6 8
004746 #files= 8 6 8
004747 #files= 45 6 8
run 4044
hd_root ok
1472985 events, no missing pulses, only a few differences from emulated
Total diffs: 91 time: 0 q: 0 amp: 79 ped: 0 integ: 12 oflow: 0
run 4062
CDC mode 6, no FDC, hd_root ok
TS_TRIG_HOLD 30 1
BLOCKLEVEL 1
BUFFERLEVEL 1
https://halldweb.jlab.org/rcdb/files/info/2938
has NO unpaired pulse or WRD
24425108 events
VERY FEW DIFFERENCES VS EMULATION from all 24425108 events (approx 10x as many hits?)
root [1] CDC->GetEntries()
(const Long64_t)1654
root [2] CDC->GetEntries("d_time")
(Long64_t)30
root [3] CDC->GetEntries("d_q")
(Long64_t)0
root [4] CDC->GetEntries("d_amp")
(Long64_t)1510
root [5] CDC->GetEntries("d_pedestal")
(Long64_t)11
root [6] CDC->GetEntries("d_integral")
(Long64_t)105
root [7] CDC->GetEntries("d_overflows")
(Long64_t)0
004101
JANA ERROR>>Unknown module type (15) iptr=0x0x7fca6c619f44
JANA ERROR>>...skipping to 0x0x7fca6c61aa20 (discarding 695 words)
JANA ERROR>>Unknown module type (15) iptr=0x0x7fca6c5ed444
JANA ERROR>>...skipping to 0x0x7fca6c5ede04 (discarding 624 words)
004711 hd_root crash
JANA ERROR>>
JANA ERROR>>Stack trace:
JANA ERROR>>
JANA ERROR>> jana::JException::getStackTrace(bool, unsigned long)
JANA ERROR>> jana::JException::JException(std::string const&)
JANA ERROR>> JEventSource_EVIO::ParseF1TDCBank(int, unsigned int const*&, unsigned int const*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>> JEventSource_EVIO::ParseJLabModuleData(int, unsigned int const*&, unsigned int const*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>> JEventSource_EVIO::ParseEVIOEvent(evio::evioDOMTree*, std::list<JEventSource_EVIO::ObjList*, std::allocator<JEventSource_EVIO::ObjList*> >&)
JANA ERROR>> JEventSource_EVIO::ParseEvents(JEventSource_EVIO::ObjList*)
JANA ERROR>> JEventSource_EVIO::GetObjects(jana::JEvent&, jana::JFactory_base*)
JANA ERROR>> jerror_t jana::JEvent::GetObjects<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, jana::JFactory_base*)
JANA ERROR>> jana::JFactory<DCDCDigiHit>* jana::JEventLoop::GetFromFactory<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, char const*, jana::JEventLoop::data_source_t&, bool)
JANA ERROR>> jana::JFactory<DCDCDigiHit>* jana::JEventLoop::Get<DCDCDigiHit>(std::vector<DCDCDigiHit const*, std::allocator<DCDCDigiHit const*> >&, char const*, bool)
JANA ERROR>> JEventProcessor_CDC_em::evnt(jana::JEventLoop*, unsigned long)
JANA ERROR>> jana::JEventLoop::OneEvent()
JANA ERROR>> jana::JEventLoop::Loop()
JANA ERROR>> LaunchThread(void*)
JANA ERROR>> LaunchThread(void*)
JANA ERROR>> LaunchThread(void*)
JANA ERROR>>
JANA ERROR>>
004718 #files= 5 modes 6 8
hd_root ok
3774539 events
BLOCKLEVEL 1
BUFFERLEVEL 1
contains FDC data.
lots of unpaired FDC WRD
One unpaired CDC WRD
unpaired WRD eventnum 683052 roc 26 slot 4 chan 17 trig 16804908 *** This is from straw 2523 ring 23 straw 177
Lots of small differences in time and pedestal so maybe thresholds were not the usual ones.
Emulator:
const Int_t HIT_THRES = 115; //110 for run 3923, 115 for run 4623
const Int_t HIGH_THRESHOLD = 100;
const Int_t LOW_THRESHOLD = 25;
RCDB trigger file:
FADC125_TH 100
FADC125_TL 25
root [14] CDC->GetEntries()
(const Long64_t)3161822
root [15] CDC->GetEntries("d_time")
(Long64_t)2732838
root [16] CDC->GetEntries("d_q")
(Long64_t)146990
root [17] CDC->GetEntries("d_amp")
(Long64_t)147049
root [18] CDC->GetEntries("d_pedestal")
(Long64_t)2554271
root [19] CDC->GetEntries("d_integral")
(Long64_t)1073256
root [20] CDC->GetEntries("d_overflows")
(Long64_t)0
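The threshold comparison noted for run 4718 (emulator constants against the RCDB trigger file) could be automated along the lines below. This is a sketch under assumptions: it treats the trigger file as whitespace-separated "KEY value" lines, and it assumes that FADC125_TH corresponds to HIGH_THRESHOLD and FADC125_TL to LOW_THRESHOLD, which the values 100 and 25 suggest but the notes do not state. The function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse whitespace-separated "KEY value" lines into a map. Lines whose
// second token is not an integer are skipped; extra tokens are ignored.
std::map<std::string, int> parseKeyValues(const std::string& text) {
    std::map<std::string, int> out;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string key;
        int value = 0;
        if (fields >> key >> value) out[key] = value;
    }
    return out;
}

// True if both threshold keys are present and equal the emulator values.
// ASSUMPTION: FADC125_TH <-> HIGH_THRESHOLD, FADC125_TL <-> LOW_THRESHOLD.
bool thresholdsMatch(const std::map<std::string, int>& cfg,
                     int highThreshold, int lowThreshold) {
    const auto th = cfg.find("FADC125_TH");
    const auto tl = cfg.find("FADC125_TL");
    return th != cfg.end() && tl != cfg.end() &&
           th->second == highThreshold && tl->second == lowThreshold;
}
```

For the values quoted above, thresholdsMatch(cfg, 100, 25) would be true, so under these assumptions the high and low thresholds agree; HIT_THRES has no counterpart among the quoted RCDB lines and is not checked here.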