MIT/FutureGrid Data Challenge 2 Production
From GlueXWiki
Latest revision as of 11:02, 11 April 2014
Resources
- MIT Reuse Cluster: 17 blades x 8 cores = 136 cores currently.
- FutureGrid project (https://www.futuregrid.org): currently providing us with ~200 cores at various sites:
  - University of Chicago
  - Indiana University
  - UC San Diego
  - University of Texas - Austin
- VMs are launched using tools developed with the FutureGrid project on top of OpenStack, exploring the distributed cloud computing model.
Monitoring
- All VMs monitored via Ganglia at http://reuse37.lns.mit.edu
Update 3/28/14
- Running smoothly for the last week with 300+ cores.
- Only running jobs for run 9001 thus far:
  - ~5K jobs complete with 25K events each, so ~125M events produced in 1 week.
- May get access to ~100 more cores on FutureGrid:
  - Will run jobs for 9002 and 9003 on those and/or reassign some of the nodes currently in use.
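As a rough cross-check of the figures above, the arithmetic can be sketched in a few lines of Python. The inputs are the approximate values quoted on this page; the variable names are illustrative, and the per-core rate assumes (hypothetically) that all MIT and FutureGrid cores were busy for the full week.

```python
# Back-of-the-envelope check of the numbers quoted in this update.
# Inputs are the approximate values from the page ("~5K", "25K", "~200").

MIT_BLADES = 17
CORES_PER_BLADE = 8
FUTUREGRID_CORES = 200    # approximate, from the Resources list
JOBS_COMPLETE = 5_000     # "~5K jobs"
EVENTS_PER_JOB = 25_000   # "25K events each"
DAYS = 7                  # "in 1 week"

mit_cores = MIT_BLADES * CORES_PER_BLADE
# Assumption: every core was in use for the whole week.
total_cores = mit_cores + FUTUREGRID_CORES
total_events = JOBS_COMPLETE * EVENTS_PER_JOB
events_per_core_day = total_events / (total_cores * DAYS)

print(f"MIT cores:        {mit_cores}")
print(f"Total cores:      ~{total_cores}")
print(f"Events produced:  ~{total_events / 1e6:.0f}M")
print(f"Events/core/day:  ~{events_per_core_day:,.0f}")
```

The per-core rate is only indicative, since the update reports "300+" cores rather than an exact count.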
Update 4/4/14
- Some (monthly) maintenance on FutureGrid sites this past week slowed us down a bit.
- We're up to a total of 344 cores now, running both 9001 and 9002.
- Possibility of ~100 more cores next week. We will run 9003 on these, or switch some of the current VMs to 9003.
Update 4/11/14
- Back to running smoothly after maintenance last week
- One FutureGrid site requested we slow production so other users could have some more cycles
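- Slides on cloud development from the all-hands meeting: Jan Balewski @ OSG (https://indico.fnal.gov/getFile.py/access?contribId=13&sessionId=7&resId=0&materialId=slides&confId=7207)
- Data from all three run numbers (9001-9003) is now available on Northwestern's SRM; see Sean's e-mail (https://mailman.jlab.org/pipermail/halld-offline/2014-April/001638.html)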