Watch this page in the wiki to subscribe to automatic updates to this status page. |
Active Issue | Blue Waters Emergency Maintenance HPSS ncsa#Nearline, Mar 3 5PM - Mar 4 1AM. |
One set of Nebula Gluster storage is currently misbehaving; all instances connected to that block are having issues. | |
---|---|
The issue with GlusterFS from earlier today has recurred. A single set of GlusterFS is currently down, causing about 66 instances to be paused. |
Include the keyword "issue" in updates above to trigger actions.
Start | End | What is happening? | What will be affected? |
---|---|---|---|
Start | End | What happened? | What was affected? | Outcome |
---|---|---|---|---|
2017-03-02 1700 | 2017-03-04 0100 | BW HPSS emergency outage to apply patch | ncsa#nearline; stores are failing with cache full | |
2017-02-28 1200 | 2017-02-28 1250 | ICC Resource Manager down | Users couldn't submit or start new jobs | Removed corrupted job file |
2017-02-22 1615 | 2017-02-22 1815 | Nebula Gluster issues | All Nebula instances paused while Gluster was repaired | Nebula is available. |
2017-02-11 1900 | 2017-02-11 2359 | NPCF Power Hit | BW Lustre was down, xdp heat issues. | RTS 2017-02-11 2359 |
2017-02-15 0800 | 2017-02-15 1800 | ICC Scheduled PM | Batch jobs and login node access | |