Topics for the 2015 Software Review

Overall Theme of the Presentation(s)

GlueX ran very successfully from early November until late December of 2014. All aspects of the DAQ and offline software systems were stressed, and no show-stoppers were identified. Many issues that could only be found and fixed in a real-data environment were identified and repaired, and based on that experience, a well-defined plan for moving forward to the April 2015 run has been developed and is being implemented.

  • Procedures for tuning the beam into the experiment were developed.
  • The level-one (hardware) trigger was implemented and expanded through the run.
  • Beam was successfully delivered to the GlueX detector, and with the very first events, tracks were being reconstructed and clusters identified in the calorimeters.
  • Data were taken at up to 600 MB/s over one weekend, twice the expected maximum rate, and the system ran well.
  • Online vertex reconstruction from tracks was consistent with that expected from the beam tune. [consistent with the beam-line configuration? MMI]
  • dE/dx information was useful almost immediately. [?]
  • All detectors reported data in the same built events.
  • pi0s were identified in the forward calorimeter in the first of the larger data runs.
  • Online monitoring of all detectors ran very well. [online monitoring had issues throughout, no? MMI]
  • Offline running of monitoring processes ran well and agreed with online. [confirm with the online guys? MMI]
  • Run cataloging via a database was implemented (we did not expect enough data to merit this at first).
  • Photons were successfully tagged and correlated with the detector events.
  • Calibration updates were loaded into the database.
  • Active calibration efforts are now ongoing for all detectors.
  • pi0s are now easily seen in both calorimeters (see the invariant-mass sketch after this list).
  • rho -> pi+ pi- and omega -> pi+ pi- pi0 decays have been seen.
  • Event-skimming software is running.
  • Large-scale event reconstruction has been executed and repeated; data production is working.
  • All events are being regularly processed with updated calibrations.
  • Data are regularly pushed offsite using globus-ftp.
  • The collaboration feels that we reached a number of milestones that we did not expect to see until we were well into the April 2015 run.
  • The system ran in FADC (raw pulse) modes for much larger data samples than we expected would be possible.
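
The pi0 observations above amount to pairing photon clusters in the calorimeters and histogramming the two-photon invariant mass. The sketch below shows that calculation under the simplifying assumption of a vertex at the origin; the cluster energies, positions, and the 20 MeV window are illustrative numbers, not values from the GlueX reconstruction.

```python
# Two-photon invariant mass: the calculation behind the pi0 peaks seen in the
# calorimeters. A minimal sketch only; the cluster energies and positions are
# made-up illustrative numbers, not output of the real GlueX reconstruction.
import math

PI0_MASS = 0.1349768  # GeV (PDG value)

def photon_four_momentum(energy, x, y, z):
    """Treat a calorimeter cluster as a massless photon pointing from an
    assumed vertex at the origin to the cluster position (cm)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (energy, energy * x / r, energy * y / r, energy * z / r)

def invariant_mass(p1, p2):
    """M = sqrt((E1 + E2)^2 - |p1 + p2|^2) for the photon pair, in GeV."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# Two illustrative forward-calorimeter clusters (energies in GeV, positions in cm):
g1 = photon_four_momentum(1.2, 40.0, 0.0, 560.0)
g2 = photon_four_momentum(0.9, -33.0, 0.0, 560.0)

m = invariant_mass(g1, g2)
print("m(gg) = %.4f GeV, pi0 candidate: %s" % (m, abs(m - PI0_MASS) < 0.020))
```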

Preferred Format

Given that this is the 3rd software review, we expect that the entire committee would like to see where things stand. As such, we would prefer to have only plenary presentations. Disadvantages of break-out sessions include:

  1. The committee may be smaller than usual, so splitting may result in too few reviewers per session.
  2. This being the third review, the need for introductory material should be significantly reduced.
  3. Last time, there was significant repetition between plenary and break-out presentations, particularly for the computer center. In addition, the computer center break-out ran in parallel with those of the two halls (B and D) that most heavily use the central resources.
  4. No one has expressed reservations about a plenary-only format.

Topics to be Covered

  1. Report on Successful Data Challenges
    • DC1 - December 2012 / January 2013
      • 5 billion events - OSG, JLab, CMU
      • 1200 concurrent jobs at JLab
    • DC2 - March/April 2014
      • 10 billion events with EM backgrounds included - OSG, JLab, MIT, CMU, FSU
      • 4500 concurrent jobs at JLab
      • Well under 0.1% failure rate
    • DC3 - January/February 2015
      • Read data in raw-event format from tape and produce DST (REST) files.
      • Load up as many JLab cores as possible.
      • Run multi-threaded jobs
      • Already doing full reprocessing of the Fall 2014 data from tape every two weeks.
  2. Data Acquisition Successes - Running Fall 2014 (stealth data challenge).
    • Exceeded the experiment's 300 MB/s transfer-to-tape bandwidth.
      • ~500 million events.
      • 7000 files, 120 TB of data.
    • Most data were taken in full pulse mode of the Flash ADCs
      • Need to get final processing algorithms on the FPGAs in the FADCs
      • Need to clean raw data of massive unused headers.
    • Event rates of 2 kHz for the full experiment, much higher for individual components.
      • Need to move to block mode.
      • Need to move to FPGA processing to compress data.
    • Full DAQ chain demonstrated: local RAID disk, transfer to tape, and automatic processing from tape.
    • Robustness issues with the system
      • Handle corrupted evio data
      • Problems with some FADCs getting out of sync.
    • Stealth Online Data Challenge
  3. Revisit data and computing spreadsheets
    • Update based on current software performance.
    • Update with best estimates of raw data footprint.
  4. Offline monitoring
    • browser
    • analyze data as it appears on the tape silo
    • reconstruction results
  5. Calibration committee
    • bi-weekly meetings
    • preliminary list of constants compiled
    • calibration procedures still need to be regularized
    • calibration database training
  6. CCDB successes
    • command line interface
    • SQLite form of the database (a usage sketch follows this list)
  7. Analysis results
    • electron identification in the FCAL
    • pi0 peak
    • proton ID with TOF
    • proton ID with dE/dx
    • rho meson in pi+ pi-
    • omega meson in pi+ pi- pi0
  8. Data transfer to CMU via Globus Online
  9. Data management: event store, etc.
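
As a concrete illustration for topic 6, the sketch below reads a set of constants from an SQLite copy of CCDB via the CCDB Python API (AlchemyProvider), following the usage shown in the CCDB documentation. The file path, the table path /FCAL/gains, and the run number are hypothetical placeholders, and method names may differ between CCDB versions.

```python
# Minimal sketch of reading calibration constants from the SQLite form of
# CCDB. Assumptions: the ccdb Python package provides AlchemyProvider with
# connect()/get_assignment() as in the CCDB documentation; the file path,
# table path, and run number below are hypothetical placeholders.
import ccdb

# The same code works against the MySQL server by swapping the connection
# string, e.g. "mysql://ccdb_user@hallddb.jlab.org/ccdb".
provider = ccdb.AlchemyProvider()
provider.connect("sqlite:///path/to/ccdb.sqlite")

# Constants valid for run 2439 in the default variation (illustrative values).
assignment = provider.get_assignment("/FCAL/gains", 2439, "default")
for row in assignment.constant_set.data_table:
    print(row)

provider.disconnect()
```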