status.ncsa.illinois.edu
Watch this page in the wiki to subscribe to automatic updates to this status page.
Please do not refer to any NCSA Industry Partners on this page. Please use the iforge nomenclature for all of the *forge infrastructure.
To see older events, see Archive of NCSA Status Home
Report a problem
Current Status
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status |
---|---|---|---|---|---|---|
Upcoming Scheduled Maintenance
Listed below in chronological order.
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2021-10-19 0700 | 2021-10-19 0730 | IDDS | IDDS maintenance (puppet changes) | All IDDS services | idds-admin@ncsa.illinois.edu | SCHEDULED |
2021-10-20 0800 | 2021-10-20 2000 | ICCP | ICCP Quarterly Maintenance | ICCP Cluster nodes only | help@campuscluster.illinois.edu | SCHEDULED |
2021-12-09 0800 | 2021-12-09 1200 | LSST | LSST Quarterly Maintenance | All LSST services hosted at NCSA | lsst-admin@ncsa.illinois.edu | SCHEDULED |
Previous Outages or Maintenance
Start | End | What System/Service was affected? | What happened? | What was affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2021-10-15 1230 | 2021-10-15 0713 | NCSA GitLab | Server ran out of disk space | All GitLab services were unavailable | help+service@ncsa.illinois.edu | RESOLVED |
2021-10-11 0800 | 2021-10-11 1900 | Nightingale, ACHE | Planned maintenance on the Nightingale cluster and the ache-dist switch | There was an outage for the following services during the maintenance: | help+service@ncsa.illinois.edu | COMPLETE |
2021-10-04 1000 | 2021-10-04 1005 | www.ncsa.illinois.edu per-user web directories | Per-user web directories on the main NCSA website are being redirected to a new website dedicated to per-user web directories. | URLs like www.ncsa.illinois.edu/People/* are redirected to their new home at https://users.ncsa.illinois.edu/*. | help+service@ncsa.illinois.edu | COMPLETE |
2021-09-30 0800 | 2021-09-30 1200 | LSST | LSST Quarterly Maintenance | All LSST services hosted at NCSA | lsst-admin@ncsa.illinois.edu | COMPLETE |
2021-09-29 0800 | 2021-09-29 0900 | cilogon.org | Update to OA4MP v5.2.2 | Update Java database libraries, and address several small issues | help@cilogon.org | COMPLETE |
2021-09-29 0800 | 2021-09-29 0813 | CMDB / openDCIM | Installing/upgrading to CMDB release Sep2021 | The openDCIM front end of CMDB will be down for 15-30 minutes | | COMPLETE |
2021-09-28 0700 | 2021-09-28 1554 | NPCF work on facility power | De-energizing power to transformer TX-4C-1020, pulling and terminating busduct cabling from transformer to room 2020. | One third of Sonexion racks will lose source 1 power (Feed C) and will continue to operate on source 2, degrading reliability by losing power redundancy. | | COMPLETE |
2021-09-28 0700 | 2021-09-28 0900 | Blue Waters | A rack of scratch lost power during the power outage. | Scratch was partially unavailable due to TOR power resiliency issue. | | COMPLETE |
2021-09-28 0800 | 2021-09-28 0900 | idp.ncsa.illinois.edu | Assert eduPersonAssurance Cappuccino profile for NCSA Staff | NCSA Staff logging in with the NCSA Identity Provider will be able to get Silver CA certificates from cilogon.org | help+idp@ncsa.illinois.edu | COMPLETE |
2021-09-21 1450 | 2021-09-21 1502 | vCenter appliance controlling ASD vSphere | vCenter appliance was upgraded | vsphere.ncsa.illinois.edu was offline for 12 minutes. | help+service@ncsa.illinois.edu | COMPLETE |
2021-09-21 0700 | 2021-09-20 1115 | Blue Waters | Power work caused non-redundant switches and misconfigured servers to shut off | Blue Waters Compute, Login and Scheduler | bw-admin@ncsa.illinois.edu | COMPLETE |
2021-09-20 1800 | 2021-09-20 2130 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares were unavailable during maintenance. Users were not able to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing was unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2021-09-14 0000 | 2021-09-14 0600 | Internet2 WAN circuit | Internet2 will be migrating our WAN circuit to new hardware. | Traffic over that path will reroute while the change happens. We anticipate the migration to take less than 30 mins. | help+neteng@ncsa.illinois.edu | SCHEDULED |
0600 | 0900 | Wiki | Upgrade to next version | Wiki will be unavailable | | COMPLETE |
2021-09-09 0600 | 2021-09-09 0700 | NCSA VPN | Software Upgrades | The appliances hosting the NCSA VPN will be patched. Users will experience a brief disconnect as load is failed over between the appliances. | help+neteng@ncsa.illinois.edu | COMPLETE |
2021-09-08 1300 | 2021-09-08 1400 | Group prod_b Bastion hosts | Out of cycle patching | Bastion hosts in group prod_b will be patched and rebooted. (see MOTD for group assignment) | help+security@ncsa.illinois.edu | COMPLETE |
2021-09-08 0900 | 2021-09-08 1000 | Group prod_a Bastion hosts | Out of cycle patching | Bastion hosts in group prod_a will be patched and rebooted. (see MOTD for group assignment) | help+security@ncsa.illinois.edu | COMPLETE |
2021-09-02 0930 | 2021-09-02 1300 | PDU in rack AA81 | We are replacing a PDU in NPCF rack AA81 | All systems in the rack have redundant power connections. No service outages are expected from this work. | help+service@ncsa.illinois.edu | COMPLETE |
2021-09-01 0700 | 2021-09-01 0800 | cilogon.org | Update to OA4MP v5.2.1 | Device Authorization Grant Flow transactions will be stored in database rather than in memory | help@cilogon.org | COMPLETE |
1200 | 1205 | Wiki | Security patch is being applied | Wiki will be down | help+service@ncsa.illinois.edu | SCHEDULED |
2021-08-25 0900 | 2021-08-25 1845 | Blue Waters | System reboot due to blade fallout coinciding with HSN reroute and SMW not recovering. | All jobs interrupted | jenos@illinois.edu | COMPLETE |
2021-08-19 0538 | 2021-08-19 0700 | IRST systems hosted on IRST Node 2 | Storage controller failure, all VMs taken offline | Some prod_b systems and non-redundant services. | eyrich@illinois.edu | RESOLVED |
2021-08-19 0534 | 2021-08-19 0620 | cilogon.org | Storage controller failure in IRST VM farm | cilogon.org was unreachable until we initiated fail-over to our backup servers at NICS. | help@cilogon.org | COMPLETE |
2021-08-18 1136 | 2021-08-18 1156 | NCSA Wiki | Test instance caused interference. | NCSA Wiki | help+service@ncsa.illinois.edu | COMPLETE |
2021-08-17 0500 | 2021-08-17 0700 | NCSA/NPCF Wide Area Network | Between 5:00 AM and 7:00 AM CDT on 08/17/2021, Campus ICCN Engineers will be upgrading firmware on the ICCN router 710rtr at the Starlight facility in Chicago. | Our peerings with MREN and OmniPoP will go down. All traffic destined for those peerings will reroute via other peerings, so no production impact is expected. | help+neteng@ncsa.illinois.edu | COMPLETE |
2021-08-16 1800 | 2021-08-17 0000 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares will be unavailable during maintenance. Users will not be able to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing will be unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2021-08-12 0954 | 2021-08-12 1012 | Jira | Attempted snapshot of Jira in vSphere was too intensive for the system | Jira | help+service@illinois.edu | COMPLETE |
2021-08-10 2000 | 2021-08-11 0000 | Radiant API and Web access | Radiant cluster name change. | During this time access to the API endpoints and the Horizon web dashboard will be intermittently unavailable. Instances will continue to run and be available over the network with no interruptions. | radiant-admin@ncsa.illinois.edu | COMPLETE |
2021-08-10 0700 | 2021-08-10 1710 | iForge | Quarterly Maintenance | All systems unavailable | iforge-admin@lists.ncsa.illinois.edu | COMPLETE |
2021-08-09 1421 | 2021-08-09 1440 | NCSA Wiki | DB configuration conflict between Wiki and Wiki-Test | NCSA Wiki was inaccessible | help+service@ncsa.illinois.edu | COMPLETE |
2021-08-05 1000 | 2021-08-05 1030 | NPCF Core Router - Linecard Reboot | A problem was identified on one of the line cards in our core router requiring a reboot of the linecard. The linecard was successfully rebooted and we will continue monitoring the hardware for further issues. | All connections to this linecard are redundant and there was no impact to users. | neteng@ncsa.illinois.edu | COMPLETE |
2021-08-05 0800 | 2021-08-05 1000 | LSST | LSST Emergency OS Patching | LSST services hosted at NCSA except: | lsst-admin@ncsa.illinois.edu | COMPLETE |
2021-08-04 0800 | 2021-08-04 1700 | Radiant API and Web access | Installation of new Radiant cluster. Cluster name changes start at 1100; this will make the Horizon dashboard unreachable. | During this time access to the API endpoints and the Horizon web dashboard will be intermittently unavailable. Instances will continue to run and be available over the network with no interruptions. | radiant-admin@ncsa.illinois.edu | COMPLETED |
2021-08-04 0700 | 2021-08-04 0800 | cilogon.org | Update to OA4MP v5.2.0 | Added support for Device Authorization Grant Flow (RFC 8628) | help@cilogon.org | COMPLETED |
2021-08-03 0800 | 2021-08-03 1700 | Radiant API and Web access | Installation of new Radiant cluster | During this time access to the API endpoints and the Horizon web dashboard will be intermittently unavailable. Instances will continue to run and be available over the network with no interruptions. | radiant-admin@ncsa.illinois.edu | COMPLETED |
2021-08-03 0900 | 2021-08-03 1130 | Radiant Cluster | A change was made to the firewall that unintentionally restricted access for instances and other internal cluster communication. | Access to instances and workload | radiant-admin@ncsa.illinois.edu | RESOLVED |
2021-07-31 0600 | 2021-07-31 0630 | CILogon hosted services | Infrastructure maintenance | During this time each service hosted by CILogon including COmanage Registry, LDAP, Grouper, SAML proxy, and MDQ will become unavailable for a short time. Each individual service outage will last less than 5 minutes. Services that will not be impacted include: OIDC clients that do not query LDAP for resolving attributes; X.509 certificate issuance and certificate revocation lists; LIGO and GW-Astronomy services. | help@cilogon.org | COMPLETE |
2021-07-29 1300 | 2021-07-29 1400 | IRST-run bastion hosts (pool B) | Security patching | Hosts managed by IRST will be patched and rebooted. Only hosts in pool B will be patched at this time | help+security@ncsa.illinois.edu | COMPLETE |
2021-07-29 0900 | 2021-07-29 1000 | IRST-run bastion hosts (pool A) | Security patching | Hosts managed by IRST will be patched and rebooted. Only hosts in pool A will be patched at this time | help+security@ncsa.illinois.edu | COMPLETE |
2021-07-28 1000 | 2021-07-28 1050 | LSST | OS Updates on only NCSA Test Stand (NTS) | Only the LSST NCSA Test Stand (NTS) services hosted at NCSA | lsst-admin@ncsa.illinois.edu | COMPLETE |
2021-07-27 0600 | 2021-07-27 0900 | Jira | Upgrade | Jira will be unavailable | | COMPLETE |
2021-07-26 1800 | 2021-07-27 0000 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares were unavailable during maintenance. Users were not able to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing was unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2021-07-21 0800 | 2021-07-21 2900 | ICCP | ICCP Quarterly Maintenance | All ICCP services | help@campuscluster.illinois.edu | COMPLETE |
2021-07-21 1524 | 2021-07-21 2150 | ASD vSphere cluster in 3003 | One of the 4 hypervisors in the cluster panicked. Unscheduled preventative maintenance is being performed on it and the other 3 nodes in the cluster. | After the initial outage at 15:24, there should be no additional outages. | help+service@ncsa.illinois.edu | COMPLETE |
2021-07-13 0700 | 2021-07-13 0800 | cilogon.org | Update to OA4MP v5.1.4. | The OAuth2/OIDC backend of the CILogon Service will be updated to OA4MP v5.1.4. | help@cilogon.org | COMPLETE |
2021-07-08 0800 | 2021-07-08 1000 | OpenAFS | The remaining OpenAFS database servers were upgraded. | No service impacts were seen | help+service@ncsa.illinois.edu | COMPLETE |
2021-07-07 0600 | 2021-07-07 0800 | CILogon AWS Hosted Services | Upgrading AWS RDS Aurora MySQL v5.6 to v5.7 | COmanage Registry and Grouper services hosted by CILogon will be unavailable | help@cilogon.org | COMPLETE |
2021-07-01 2140 | 2021-07-01 1430 | Horizon dashboard access was down for the entire period. Cluster networking was down from 1200 to 1430. | Investigations into Horizon dashboard accessibility issues resulted in the application of an incorrect default network gateway for the cluster around noon. This was corrected and networking functionality restored around 1400. Instances began recovering soon thereafter. | Radiant admins believe running instances have recovered on their own, but we advise everyone to check their systems and report any issues they see to the help desk. | help@ncsa.illinois.edu | RESOLVED |
2021-07-01 0247 | 2021-07-01 1300 | Various systems in NPCF, ACB, NCSA | There was a power event in the Champaign-Urbana area at around 2:47AM today. Details about the cause are currently unknown. This event caused disruptions to systems at the NCSA building, NPCF and ACB. Known issues have generally been resolved but there may be unidentified issues lingering. If you encounter any problems, please notify NCSA help desk staff (help@ncsa.illinois.edu). | Multiple systems/services were impacted. All have been recovered and return to normal operations is complete. | NCSA help desk | RESOLVED |