...
Start | End | What System/Service is affected | What is happening? | What will be affected? | Actions |
---|---|---|---|---|---|
2018-06-12 | | Nebula | A storage node crashed, possibly from the thunderstorms. | Instances may be slow while the filesystem heals. | Once the storage node is back online, the filesystem will heal itself. |
2018-06-19 | 2018-06-19 17:00 | Nebula | Nebula is undergoing a complete reboot. Last week's storms damaged more than the one node initially thought to be affected. | Nebula will be unavailable until 5pm. | Shutting down/rebooting all portions of the Nebula cluster. |
2018-05-03 14:30 | | iForge gpu queue | Both nodes in the general 'gpu' queue are offline due to issues with the GPUs. | The iForge 'gpu' queue cannot be used. | Driver updates, ticket with vendor |
...
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person |
---|---|---|---|---|---|
2018-06-19 10:00 | 2018-06-19 | All Nebula functions | The entire system needs to be rebooted. Last week's storms damaged more than the one node initially thought to be affected. | All Nebula services | nebula@ncsa.illinois.edu |
2018-06-15 13:30 | 2018-06-15 15:30 | Blue Waters Nearline | Replacement of a tape robot transporter | This work is not expected to impact operations. The library system will continue to operate with a single transporter, but mount times may be somewhat longer until the second unit is returned to service. | hpssadmin@ncsa.illinois.edu |
2018-06-07 06:30 | 2018-06-07 14:00 | Blue Waters | The boot node crashed, requiring the system to be rebooted. The file system and ESLogins remained up. | All running jobs were lost, and no new jobs were started until the system was returned to service. Torque was updated to version 6.1.2. | bw-admin@ncsa.illinois.edu |
2018-06-19 08:00 | 2018-06-19 12:00 | LSST L1 Test Stand | Scheduled maintenance | Level One Test Stand | lsst-sysadm@ncsa.illinois.edu |
2018-06-21 08:00 | 2018-06-21 10:00 | LSST | Monthly maintenance (May) | CentOS 6.9 servers; Slurm/verification cluster. No other impact is expected, but unexpected issues could cause connectivity problems for other hosts or downtime for lsst-dev01 or hosted VMs. | lsst-sysadm@ncsa.illinois.edu |
...