Difference between revisions of "GlueX Data Challenge Meeting, December 17, 2012"
Revision as of 20:14, 17 December 2012
GlueX Data Challenge Meeting
Monday, December 17, 2012
1:30 pm, EST
JLab: CEBAF Center, F326/327
Agenda
- Announcements
- Minutes from last time
- Data Challenge 1 status
- JLab
- Grid status
- CMU status
- Shutdown plan (or continuation plan?)
- Work list for post DC-1 period
- file archiving
- file distribution
- ???
- Thoughts on DC-2
- What?
- How much?
- When?
Meeting Connections
To connect from the outside:
Videoconferencing
- ESNET:
- Call ESNET Number 8542553 (this is the preferred connection method).
- EVO:
- A conference has been booked under "GlueX" from 1:00pm until 3:30pm (EST).
- Direct meeting link
- To phone into an EVO meeting, from the U.S. call (626) 395-2112 and then enter the EVO meeting code, 13 9993
- Instructions for the Phone Bridge to EVO.
- Skype Bridge to EVO
Telephone
- Phone: (should not be needed)
- +1-866-740-1260 : US and Canada
- +1-303-248-0285 : International
- then use participant code: 3421244# (the # is needed when using the phone)
- or www.readytalk.com
- then type access code 3421244 into "join a meeting" (you need java plugin)
Minutes
Present:
- CMU: Paul Mattione
- JLab: Mark Ito (chair), David Lawrence, Yi Qiang, Dmitry Romanov, Elton Smith, Simon Taylor, Beni Zihlmann
- UConn: Richard Jones
Data Challenge 1 status
Production started at the three sites Wednesday, December 5, as planned.
We updated progress at the various sites:
- JLab: 678 million events
- Grid: 3.4 billion events
- CMU: 270 million events
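For reference, the three counts above can be tallied as follows. This is only arithmetic on the numbers reported in these minutes; the variable names are illustrative.

```python
# Event counts as reported in the minutes above.
events = {
    "JLab": 678_000_000,
    "Grid": 3_400_000_000,
    "CMU": 270_000_000,
}

total = sum(events.values())
# Fraction of the total produced at each site.
shares = {site: n / total for site, n in events.items()}

for site, n in events.items():
    print(f"{site}: {n / 1e6:.0f} million ({shares[site]:.1%})")
print(f"Total: {total / 1e9:.3f} billion")
```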
3.4 billion events on the grid. Some time was spent correcting problems; spared hazards with crashes.
Hangs occur in mcsmear. To reproduce a hang: take the seeds and re-run; on the second try the files look identical. Cause of the hangs: a deadlock due to exceeding the 30-second time-out while holding a mutex lock.
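The note above mentions re-running with the saved seeds and comparing the output files. A minimal sketch of that kind of seed-based comparison is given here; it is not mcsmear, and `smear` and `digest` are hypothetical names used only for illustration.

```python
import hashlib
import random


def smear(values, seed, sigma=0.1):
    """Hypothetical stand-in for a smearing step: add Gaussian noise."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    return [v + rng.gauss(0.0, sigma) for v in values]


def digest(values):
    """Hash the output so that two runs can be compared cheaply."""
    text = ",".join(repr(v) for v in values)
    return hashlib.sha256(text.encode()).hexdigest()


hits = [1.0, 2.5, 4.0]
first = digest(smear(hits, seed=12345))
second = digest(smear(hits, seed=12345))  # same seed: same output expected
other = digest(smear(hits, seed=54321))   # different seed: different output

print(first == second)
print(first == other)
```

Comparing digests rather than full files keeps the check cheap when the outputs are large.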
24-hour jobs: partial file, no files.
Jobs finished quickly; 2-3% crashing. Resubmit on failure: multiple submission, up to 30; changed to allow failed jobs to fail.
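The resubmit-on-failure policy noted above (up to 30 submissions, later changed so that failed jobs are allowed to fail) can be sketched as a capped retry loop. This is an illustration only, not the actual grid submission mechanism; `run_with_resubmission` and `make_flaky_job` are hypothetical names.

```python
def run_with_resubmission(job, max_submissions=30):
    """Call job() until it succeeds or the submission cap is reached.

    Returns (succeeded, submissions_used).
    """
    for attempt in range(1, max_submissions + 1):
        try:
            job()
            return True, attempt
        except Exception:
            continue  # treat any exception as a crashed job and resubmit
    return False, max_submissions


def make_flaky_job(failures_before_success):
    """Hypothetical job that crashes a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def job():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("simulated crash")

    return job


# Cap of 30: a job that crashes twice succeeds on its third submission.
print(run_with_resubmission(make_flaky_job(2)))
# Cap of 1: the same job is simply allowed to fail.
print(run_with_resubmission(make_flaky_job(2), max_submissions=1))
```

Setting `max_submissions=1` corresponds to letting a failed job fail rather than resubmitting it.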
Submission node crashed; replaced with a bigger-memory machine. Peaked out at 7k jobs running at once. Other host: user scheduler, maintains a daemon for each job; needed more memory. The srm that receives the results coming back: 20 TB of disk, robust; 100 MB, fills GB pipe.
100 million events, and go back to debug the code.
10% being used right now; only one person.
Archive all files to the JLab tape library: logs, histos, rest.
Distribution: ship all rest files to UConn, access via srm; have all files spinning at JLab.
SURA grid
skims
srm plug-in
grid certificate, collaboration-wide archive
seg faults in hdgeant; jana hangs; relaunch, random seed
--end of note--