status.ncsa.illinois.edu
Watch this page in the wiki to subscribe to automatic updates to this status page.
Please do not refer to any NCSA Industry Partners on this page. Please use the iforge nomenclature for all of the *forge infrastructure.
To see older events, see Archive of NCSA Status Home
Report a problem
Current Status
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2022-06-02 1800 | | NCSA Wiki Service | Due to a critical security vulnerability announced by Atlassian, we have been forced to restrict access to the NCSA Wiki to NCSA internal networks. This restriction will remain in place until Atlassian is able to provide a patch or mitigation for the vulnerability. | No remote access is allowed to the NCSA Wiki. Use the NCSA VPN for remote access. More information about using the VPN can be found here: https://users.ncsa.illinois.edu/clausen/NCSA_VPN_instructions_202206.pdf | help@ncsa.illinois.edu | ACTIVE |
Upcoming Scheduled Maintenance
Listed below in chronological order.
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2022-07-20 0800 | 2022-07-20 2000 | ICC | ICC Quarterly Maintenance | All ICC services | | SCHEDULED |
2022-10-19 0800 | 2022-10-19 2000 | ICC | ICC Quarterly Maintenance | ICC Cluster nodes only | | SCHEDULED |
Previous Outages or Maintenance
Start | End | What System/Service was affected? | What happened? | What was affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2022-06-02 0600 | 2022-06-02 0615 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable for a few minutes. | help+service@ncsa.illinois.edu | COMPLETE |
1700 | 1900 | Jira | Upgrade | Jira will not be available | help+service@illinois.edu | COMPLETE |
2022-06-01 0900 | 2022-06-01 1015 | Facility UPS | Replace two batteries | All systems with UPS feed. The UPS will stay online supporting loads at reduced capacity; no outage is expected. | rantissi@illinois.edu | COMPLETE |
2022-05-25 2230 | 2022-05-26 1615 | Delta | 3 HSN switches were experiencing problems; the switches were updated and reconfigured | | help@ncsa.illinois.edu | COMPLETE |
2022-05-25 1800 | 2022-05-25 2230 | Taiga - CenterWide FS | Partial outage. Some projects were asked to temporarily unmount /taiga | delta | Christopher Heller | COMPLETE |
2022-05-18 0700 | 2022-05-18 1400 | Nightingale | Nightingale Planned Maintenance | All Nightingale Services | help@ncsa.illinois.edu | COMPLETE |
2022-05-12 1700 | 2022-05-12 1800 | Jira & Wiki | Change to puppet configs | Downtime expected on each system for 1 to 5 minutes | help+service@ncsa.illinois.edu | COMPLETE |
2022-05-10 0700 | 2022-05-10 1900 | iForge / vForge / license servers | Quarterly Planned Maintenance | all nodes and services will be unavailable | help@ncsa.illinois.edu | COMPLETE |
2022-05-10 0800 | 2022-05-10 0815 | cilogon.org | Update to OA4MP v5.2.6 | Improvements in the back-end service | help@cilogon.org | COMPLETE |
2022-05-09 1800 | 2022-05-09 2130 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares were unavailable during maintenance. Users were unable to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing was unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2022-05-04 1000 | 2022-05-04 1015 | IDDS Accounting Services | Planned Maintenance | All IDDS services (APIs, acctd, etc) | help+idds@ncsa.illinois.edu, tolbert@illinois.edu | COMPLETE |
2022-05-04 0600 | 2022-05-04 0622 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable for a few minutes. | help+service@ncsa.illinois.edu | COMPLETE |
2022-04-19 12:00 | 2022-04-19 12:01 | Radiant | Restarted the AMQP service to put in some performance changes | New instance or virtual network changes that were submitted during the five-second restart may have failed | radiant-admin@ncsa.illinois.edu | COMPLETE |
2022-04-16 0600 | 2022-04-16 0630 | CILogon | Several cilogon.org services will be updated | https://cilogon.org , https://crl.cilogon.org , https://demo.cilogon.org , ldaps://ldap.cilogon.org | help@cilogon.org | COMPLETE |
2022-04-14 2100 | 0915 | Jira | New tickets cannot be created due to the user license limit being reached | Creation of new tickets. | https://www.ncsa.illinois.edu/expertise/user-services/user-support/ | RESOLVED |
2022-04-14 0800 | 2022-04-14 0830 | Wifi, VoIP, CCTV and FS networks at NCSA. | Tech services will be replacing their building router at NCSA. They expect a 10-minute outage. | Services may see a temporary interruption as cables are being changed. | help+neteng@ncsa.illinois.edu | SCHEDULED |
2022-04-09 0600 | 2022-04-09 0700 | Internet2 / ESnet WAN connections. | During an outage of a few minutes, some of our WAN circuits will be migrated. Traffic will be automatically re-routed. | | help+neteng@ncsa.illinois.edu | SCHEDULED |
2022-03-17 0900 | 2022-04-12 1030 | Jira | LDAP authentications have been sporadically failing. This service is being monitored to determine a root cause. | Jira logins break | help+service@ncsa.illinois.edu | RESOLVED |
2022-04-12 0900 | 2022-04-12 0930 | vsphere.ncsa.illinois.edu | vcenter security updates are being installed | vm management interface will be unavailable for 15 mins. | help@ncsa.illinois.edu | COMPLETE |
2022-04-07 1900 | 2022-04-07 1950 | NCSA VPN | Software Upgrades / SSL Certificate | The appliances hosting the NCSA VPN were patched and received an updated SSL certificate. Users will experience a brief disconnect as load is failed over between the appliances. | neteng@ncsa.illinois.edu | RESOLVED |
2022-04-06 2200 | 2022-04-07 0000 | Some office ports on the second floor. | One of the switches on the second floor is experiencing a software problem and is currently down. Code updates are being applied. | One of the six switches on the second floor is down. Users connected to ports on this switch might not receive a link. | help+neteng@ncsa.illinois.edu | RESOLVED |
2022-04-06 1530 | 2022-04-07 0630 | All systems which mount/utilize Taiga | A bug involving the multirail functionality caused constant reboots with one of the metadata servers. This resulted in cluster de-stabilization and loss of function. | All lustre/NFS mountpoints to Taiga, Globus to Taiga. | help@ncsa.illinois.edu | RESOLVED |
2022-04-04 0930 | 2022-04-04 1000 | NCSA LDAP | Instantiation of Delta resource OU branch in the NCSA LDAP database with replication testing. | No impact to properly configured systems or searches is expected. | help@ncsa.illinois.edu | COMPLETE |
2022-04-01 0600 | 2022-04-01 0700 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable for a few minutes. | help+service@ncsa.illinois.edu | COMPLETE |
2022-03-23 1000 | 2022-03-23 1600 | Email Lists | Email lists (lists.ncsa.illinois.edu) are not functioning | Ability to send to email lists. Note: Bounced emails will need to be resent. | help+service@ncsa.illinois.edu | COMPLETE |
2022-03-22 0730 | 2022-03-22 0915 | ldap - NCSA primary server | OS updates and replication changes | NCSA LDAP primary server will be unavailable, replicas should remain accessible | Timothy Bouvet | COMPLETE |
2022-03-21 0800 | 2022-03-21 0830 | cilogon.org | Migrate CILogon Services to AWS | cilogon.org , demo.cilogon.org , crl.cilogon.org | help@cilogon.org | COMPLETE |
2022-03-19 0100 | 2022-03-19 1500 | Campus Cluster | Cooling units at ACB stopped functioning, and temperatures in the datacenter rose high enough that machines powered off. By the time ICI was informed, cooling had resumed at ACB. ICI then restored service. | All of Campus Cluster | help@campuscluster.illinois.edu | RESOLVED |
2022-03-17 1100 | 2022-03-17 1123 | ASD and ACHE vsphere clusters and ldap1 and ldap2 | certs on ldap1 and ldap2 were updated | logins to ASD and ACHE vsphere were down for 23 minutes. | help@ncsa.illinois.edu | COMPLETE |
2022-03-17 | 2022-03-17 1001 | Jira | Logins are slow or unsuccessful | Jira login | | RESOLVED |
2022-03-16 1700 | 2022-03-16 1800 | DNS1 | Hardware replacement on DNS1 server. | DNS lookups will be down on the primary DNS server while the hardware is being swapped. DNS2 will remain up. | help+neteng@ncsa.illinois.edu | COMPLETE |
2022-03-14 1800 | 2022-03-15 2345 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares were unavailable during maintenance. Users were unable to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing was unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2022-03-10 0700 | 2022-03-10 1500 | Distribution panel DP-5C-1020. Power feed C to the north east corner power panels | De-energizing electrical distribution panel DP-5C-1020 to tie in power cables to Holl-I system | Known resources impacted: Granite (already planned to be offline for maintenance); iForge (cluster offline for the duration); Radiant (cluster online, without power redundancy) | help@ncsa.illinois.edu | COMPLETE |
2022-03-09 0700 | 2022-03-09 0810 | linux.ncsa.illinois.edu (aka public-linux) | Upgrade server to RHEL 8 and add NCSA Duo 2FA authentication | Server was unavailable during maintenance. | help+service@ncsa.illinois.edu | COMPLETE |
2022-03-02 0930 | 2022-03-07 1715 | ICC | Emergency PM UPDATE: We are currently experiencing unforeseen technical issues with the cluster. We are investigating and expect resolution and restoration of all Campus Cluster services by March 3rd 12PM | ICCP filesystem will be offline. Most projects will be impacted. Special arrangements have been made with some to be able to operate to some degree during the outage. | help@campuscluster.illinois.edu | COMPLETE |
2022-03-02 1237 | 2022-03-02 1715 | iforge (iforge.ncsa.illinois.edu) | GPFS issue with interruption of filesystem leading to scheduler pause | 1 running job was aborted, and any new jobs paused during the interruption | help@ncsa.illinois.edu | COMPLETE |
2022-03-02 0600 | 2022-03-02 0630 | Jira | Adding RAM | Jira will be unavailable during maintenance | | COMPLETE |
2022-03-01 1800 | 2022-03-01 1810 | ldap2 server clients of NCSA LDAP | Online maintenance | Restart rsyslog and LDAP after relocating /var/log; clients should have redundant servers configured | Timothy Bouvet | COMPLETE |
2022-02-28 1800 | 2022-02-28 1830 | ldap1 server clients of NCSA LDAP | Online maintenance. Had to restart rsyslog and LDAP after relocating /var/log | Slow response from ldap1, but clients should have redundant servers configured | Timothy Bouvet | COMPLETE |
2022-02-28 0900 | 2022-02-28 1030 | CMDB | V1.7.20220228 Release | CMDB database will be unavailable. ITSM's openDCIM will be down for a short period (~ 5 minutes) while the data is reloaded. | | COMPLETE |
2022-02-26 0730 | 2022-02-26 0750 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable | help+service@ncsa.illinois.edu | COMPLETE |
2022-02-25 1000 | 2022-02-25 1300 | Taiga - CenterWide FS | Full file system outage | All clients mounting Taiga | | COMPLETE |
2022-02-09 1400 | 2022-02-25 1030 | Jira, Internal/Savannah, LDAP, POP, Hosted web servers, virtual classroom, vcenter | The NCSA VMWare cluster is experiencing storage performance issues. -- Update: Adjustments have been made to storage used by the LDAP servers and other non-essential VM instances have been disabled. Testing is indicating that response times have improved and services are working normally again. | We are monitoring services. Please report any issues to help@ncsa.illinois.edu | Timothy Bouvet | RESOLVED FOR NOW |
2022-02-24 1000 | 2022-02-24 1115 | cerberus2.ncsa.illinois.edu, tg-kdc1.security.ncsa.illinois.edu, bwbh2.ncsa.illinois.edu | One of the IRST ESXi machines unexpectedly shut down. | The listed hosts are currently unavailable | | COMPLETE |
2022-02-23 1700 | 2022-02-23 1900 | DNS2 | DNS2 hardware will be replaced. | There will be a brief outage of DNS2 while IPs are migrated to the new server. | help+neteng@ncsa.illinois.edu | COMPLETE |
2022-02-22 0825 | 2022-02-22 1324 | Slack | Info from Slack (https://status.slack.com/) We've resolved the issue, and all impacted customers should now be able to access Slack. You may need to reload Slack (Cmd/Ctrl + Shift + R) to see the fix on your end. If that doesn't work, try clearing cache (Help > Troubleshooting > Clear Cache and Restart from the app menu). Thanks for bearing with us and we apologize for the disruption to your work day! Feb 22, 1:24 PM CST We're seeing signs of improvement. Please try reloading Slack, and if not a cache reset. We’re still monitoring the situation. We’ll confirm once this issue is fully resolved. Feb 22, 11:07 AM CST Slack is not loading for some users. We are continuing to investigate the cause and will provide more information as soon as it's available. Feb 22, 9:23 AM CST We're still working towards a full resolution. We'll be back with another update soon. Thank you for your patience. Feb 22, 8:44 AM CST We’re investigating the issue where Slack is not loading for some users. We’re looking into the cause and will provide more information as soon as it's available. Feb 22, 8:25 AM CST | Various issues accessing and using Slack | help@ncsa.illinois.edu | COMPLETE |
2022-02-18 12:10PM | 2022-02-18 | Jira | Reboot to add RAM/swap to improve stability | Jira tickets unavailable | Timothy Bouvet | COMPLETE |
2022-02-10 1030 | 2022-02-18 3:55pm | Ngale filesystem | The Lustre filesystem is not loading correctly. The support team has been contacted. Near completion: Working with vendor on additional configuration changes. Hope to complete final validation and return to service by close of business 2022-02-18. | /ngale filesystem is not accessible. | Peter Hartman | COMPLETE |
2022-02-14 1300 | 2022-02-14 1615 | All NCSA LDAP servers | Expanding schema and restarting servers | Systems will reconnect to LDAP server after restart | | COMPLETE |
2022-02-09 1000 | 2022-02-09 1200 | Facility UPS | UPS DC voltage calibration | UPS will be taken to maintenance bypass and all connected systems will be fed from unprotected power source (no power interruption). | rantissi@illinois.edu | COMPLETE |
2022-02-09 0900 | 2022-02-09 0940 | Line card failure in Core-East | Line card failure in Core-East, which is resulting in connectivity issues for some infrastructure in NCSA 3003. | DNS2 and LSST systems in 3003 were down until the uplinks could be migrated to a new port on Cores | help+neteng@ncsa.illinois.edu | COMPLETE |
2022-02-01 0800 | 2022-02-01 1600 | Jira/ldap-auth1 | Login issues | Jira access | | |
2022-02-09 0534 | 2022-02-09 0811 | LDAP (and dependent services, incl. Jira), vSphere/ICI VMware | Authorization timeouts/failures in dependent services. ICI staff are investigating. | LDAP (and dependent services, incl. Jira), vSphere/ICI VMware. The most severe issues were caused by power fluctuations around 0555, but certain LDAP servers showed degradation slightly earlier. | | COMPLETE |
2022-02-09 0600 | 2022-02-09 0645 | NCSA MySQL | MySQL database servers need to be synchronized to bring replicated database servers online. NOTE: The MySQL database is back up, but users may experience issues due to an LDAP issue. | Wiki, JIRA, Savannah/Internal, Identity, and some web sites will stop working. More details are linked here. | help+service@ncsa.illinois.edu | COMPLETE |
2022-02-08 0700 | 2022-02-08 1515 | iforge / vforge / license servers | Regular Maintenance | iforge, vforge, license servers | | COMPLETE |
2022-02-08 1000 | 2022-02-08 1245 | CMDB | V1.6.20220207 Release | CMDB database will be unavailable. ITSM's openDCIM will not be impacted. | kimber7@illinois.edu | COMPLETE |
2022-02-04 0600 | 2022-02-04 0640 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable | help+service@ncsa.illinois.edu | COMPLETE |
2022-02-01 0800 | 2022-02-01 0900 | cilogon.org | Update to OA4MP v5.2.4 | Improvements in the back-end service | help@cilogon.org | COMPLETE |
2022-01-25 | 2022-01-25 | Facility UPS | Replace UPS batteries | All systems with facility UPS feed | rantissi@illinois.edu | COMPLETE |
2022-01-24 1800 | 2022-01-24 2000 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares will be unavailable during maintenance. Users will not be able to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing will be unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2022-01-24 0400 | 2022-01-24 0630 | Failed line card on neo-hpc-1 switch | Line card failure is affecting devices that are plugged into Neo-hpc-1 aggregation switch. We've migrated links off the failed card, to other ports on the same switch. | No services are currently impacted. | help+neteng@ncsa.illinois.edu | IN PROGRESS |
2022-01-19 0800 | 2022-01-19 2000 | ICC | ICC Quarterly Maintenance | All ICC services | | COMPLETE |
2022-01-18 0800 | 2022-01-18 0830 | cilogon.org | Upgrade MyProxy CA servers to CentOS 7 | Upgrade back-end MyProxy CA VMs from CentOS 6 to CentOS 7. No downtime is expected. | help@cilogon.org | COMPLETE |
2022-01-14 0600 | 2022-01-14 1715 | Business IT database had bad data. | A database that NCSA mirrors from campus changed without notice breaking our MIS system. Business IT isolated the issue and corrected the data. | Multiple complex systems have been affected by this data corruption issue. | help+service@ncsa.illinois.edu | RESOLVED |
2022-01-14 0800 | 2022-01-14 1720 | NCSAnet wireless | NCSAnet Wireless was unavailable due to bad data in ldap | Users couldn't connect to the NCSAnet wireless network | help+neteng@ncsa.illinois.edu | RESOLVED |
2022-01-05 1100 | 2022-01-05 1145 | CMDB | Version V1.5.20211223 release | CMDB database will be unavailable for a few moments; openDCIM will be unavailable for a few moments. | kimber7@illinois.edu | COMPLETE |
2021-12-20 1830 | 2021-12-20 2030 | Jira | Version Upgrade to address security issue | Jira will be unavailable | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-17 1300 | 2021-12-17 1340 | CMDB | Version V1.4.20211217 release | CMDB database will be unavailable for a few moments; openDCIM will not be affected. | | COMPLETE |
2021-12-17 0600 | 2021-12-17 0622 | NCSA GitLab | The server was updated with some new Puppet configurations. | GitLab services were unavailable for a few minutes as the SSL certificate for the service was updated. | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-16 1400 | 2021-12-16 1430 | HTTP web proxy: httpproxy.ncsa.illinois.edu | NCSA's general purpose HTTP web proxy server was rebuilt. | HTTP web proxying through httpproxy was unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-10 0700 | 2021-12-10 1345 | iForge | InfiniBand switch maintenance | All systems unavailable | iforge-admin@lists.ncsa.illinois.edu | COMPLETE |
2021-12-10 0900 | 2021-12-10 1000 | Bastion Hosts (Production group B) | Patching out of cycle | Bastion Hosts (Production group B) were individually unavailable during reboot | help+security@ncsa.illinois.edu | COMPLETE |
2021-12-09 0900 | 2021-12-09 0931 | Bastion Hosts (Production group A) | Patching out of cycle | Bastion Hosts (Production group A) were individually unavailable during reboot | | COMPLETE |
2021-12-09 0800 | 2021-12-09 0900 | All IDDS services | IDDS Postgres and Ruby on Rails upgrades | All IDDS services | tolbert@illinois.edu | COMPLETE |
2021-12-09 0600 | 2021-12-09 0613 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable for about 5 minutes | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-07 1400 | 2021-12-07 1443 | LSST | Kubernetes on NTS is not working properly after updates | Kubernetes on NCSA Test Stand | lsst-admin@ncsa.illinois.edu | RESOLVED |
2021-12-07 0800 | 2021-12-07 1400 | LSST | LSST Quarterly Maintenance | All LSST services hosted at NCSA | lsst-admin@ncsa.illinois.edu | COMPLETE |
2021-12-07 0930 | 2021-12-07 1030 | ACHE Firewalls | software maintenance | Firewalls will be upgraded using fail over procedures - no traffic impact expected | James Eyrich - eyrich on slack | COMPLETE |
Legend:
IN PROGRESS
RESOLVED
SCHEDULED
MONITORING