status.ncsa.illinois.edu
Watch this page in the wiki to subscribe to automatic updates to this status page.
Please do not refer to any NCSA Industry Partners on this page. Please use the iforge nomenclature for all of the *forge infrastructure.
To see older events, see Archive of NCSA Status Home
Report a problem
Current Status
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status |
---|---|---|---|---|---|---|
Upcoming Scheduled Maintenance
Listed below in chronological order.
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2022-02-28 1800 | 2022-02-28 1900 | ldap1 server clients of NCSA LDAP | on-line maintenance | slow response from ldap1 but clients should have redundant servers configured | Timothy Bouvet | SCHEDULED |
2022-03-01 1800 | 2022-03-01 1900 | ldap2 server clients of NCSA LDAP | on-line maintenance | slow response from ldap2 but clients should have redundant servers configured | Timothy Bouvet | SCHEDULED |
2022-03-02 0930 | 2022-03-02 1930 | ICC | Emergency PM. We are seeing some network issues on the cluster. To resolve these issues, we need to upgrade code on our InfiniBand infrastructure. | ICCP filesystem will be offline. Most projects will be impacted. Special arrangements have been made with some projects so they can operate to some degree during the outage. | help@campuscluster.illinois.edu | SCHEDULED |
2022-03-14 1800 | 2022-03-15 0700 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares will be unavailable during maintenance. Users will not be able to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing will be unavailable. | help+service@ncsa.illinois.edu | SCHEDULED |
2022-04-20 0800 | 2022-04-20 2000 | ICC | ICC Quarterly Maintenance | ICC Cluster nodes only | | SCHEDULED |
2022-07-20 0800 | 2022-07-20 2000 | ICC | ICC Quarterly Maintenance | All ICC services | | SCHEDULED |
2022-10-19 0800 | 2022-10-19 2000 | ICC | ICC Quarterly Maintenance | ICC Cluster nodes only | | SCHEDULED |
Previous Outages or Maintenance
Start | End | What System/Service was affected? | What happened? | What was affected? | Contact Person | Status |
---|---|---|---|---|---|---|
2022-02-28 1800 | 2022-02-28 1830 | ldap1 server clients of NCSA LDAP | on-line maintenance. Had to restart rsyslog and LDAP after relocating /var/log | slow response from ldap1 but clients should have redundant servers configured | Timothy Bouvet | COMPLETE |
2022-02-28 0900 | 2022-02-28 1030 | CMDB | V1.7.20220228 Release | CMDB database will be unavailable. ITSM's openDCIM will be down for a short period (~5 minutes) while the data is reloaded. | | COMPLETE |
2022-02-26 0730 | 2022-02-26 0750 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable | help+service@ncsa.illinois.edu | COMPLETE |
2022-02-25 1000 | 2022-02-25 1300 | Taiga - CenterWide FS | Full file system outage | All clients mounting Taiga | | COMPLETE |
2022-02-09 1400 | 2022-02-25 1030 | Jira, Internal/Savannah, LDAP, POP, Hosted web servers, virtual classroom, vcenter | The NCSA VMware cluster is experiencing storage performance issues. -- Update: Adjustments have been made to storage used by the LDAP servers, and other non-essential VM instances have been disabled. Testing indicates that response times have improved and services are working normally again. | We are monitoring services. Please report any issues to help@ncsa.illinois.edu | Timothy Bouvet | RESOLVED FOR NOW |
2022-02-24 1000 | 2022-02-24 1115 | cerberus2.ncsa.illinois.edu, tg-kdc1.security.ncsa.illinois.edu, bwbh2.ncsa.illinois.edu | One of the IRST ESXi machines shut down unexpectedly. | The listed hosts are currently unavailable | | COMPLETE |
2022-02-23 1700 | 2022-02-23 1900 | DNS2 | DNS2 hardware will be replaced. | There will be a brief outage of DNS2 while IPs are migrated to the new server. | help+neteng@ncsa.illinois.edu | COMPLETE |
2022-02-22 0825 | 2022-02-22 1324 | Slack | Info from Slack (https://status.slack.com/) We've resolved the issue, and all impacted customers should now be able to access Slack. You may need to reload Slack (Cmd/Ctrl + Shift + R) to see the fix on your end. If that doesn't work, try clearing cache (Help > Troubleshooting > Clear Cache and Restart from the app menu). Thanks for bearing with us and we apologize for the disruption to your work day! Feb 22, 1:24 PM CST We're seeing signs of improvement. Please try reloading Slack, and if not a cache reset. We’re still monitoring the situation. We’ll confirm once this issue is fully resolved. Feb 22, 11:07 AM CST Slack is not loading for some users. We are continuing to investigate the cause and will provide more information as soon as it's available. Feb 22, 9:23 AM CST We're still working towards a full resolution. We'll be back with another update soon. Thank you for your patience. Feb 22, 8:44 AM CST We’re investigating the issue where Slack is not loading for some users. We’re looking into the cause and will provide more information as soon as it's available. Feb 22, 8:25 AM CST | Various issues accessing and using Slack | help@ncsa.illinois.edu | COMPLETE |
2022-02-18 1210 | 2022-02-18 | Jira | Reboot to add RAM/swap to improve stability | Jira tickets unavailable | Timothy Bouvet | COMPLETE |
2022-02-10 1030 | 2022-02-18 1555 | Ngale filesystem | The Lustre filesystem is not loading correctly. The support team has been contacted. Near completion: Working with vendor on additional configuration changes. Hope to complete final validation and return to service by close of business 2022-02-18. | /ngale filesystem is not accessible. | Peter Hartman | COMPLETE |
2022-02-14 1300 | 2022-02-14 1615 | All NCSA LDAP servers | Expanding schema and restarting servers | Systems will reconnect to LDAP server after restart | | COMPLETE |
2022-02-09 1000 | 2022-02-09 1200 | Facility UPS | UPS DC voltage calibration | UPS will be taken to maintenance bypass and all connected systems will be fed from unprotected power source (no power interruption). | rantissi@illinois.edu | COMPLETE |
2022-02-09 0900 | 2022-02-09 0940 | Line card failure in Core-East | Line card failure in Core-East, which is resulting in connectivity issues for some infrastructure in NCSA 3003. | DNS2 and LSST systems in 3003 were down until the uplinks could be migrated to a new port on Cores | help+neteng@ncsa.illinois.edu | COMPLETE |
2022-02-01 0800 | 2022-02-01 1600 | Jira/ldap-auth1 | Login issues | Jira access | | |
2022-02-09 0534 | 2022-02-09 0811 | LDAP (and dependent services, incl. Jira) vSphere/ICI VMware | Authorization timeouts/failures in dependent services. ICI staff are investigating. | LDAP (and dependent services, incl. Jira) vSphere/ICI VMware. The most severe issues were caused by power fluctuations around 0555, but certain LDAP servers showed degradation slightly earlier. | | COMPLETE |
2022-02-09 0600 | 2022-02-09 0645 | NCSA MySQL | MySQL database servers need to be synchronized to bring replicated database servers online. NOTE: The MySQL database is back up, but users may experience issues due to an LDAP issue. | Wiki, JIRA, Savannah/Internal, Identity, and some web sites will stop working. More details are linked here. | help+service@ncsa.illinois.edu | COMPLETE |
2022-02-08 0700 | 2022-02-08 1515 | iforge / vforge / license servers | Regular Maintenance | iforge, vforge, license servers | | COMPLETE |
2022-02-08 1000 | 2022-02-08 1245 | CMDB | V1.6.20220207 Release | CMDB database will be unavailable. ITSM's openDCIM will not be impacted. | kimber7@illinois.edu | COMPLETE |
2022-02-04 0600 | 2022-02-04 0640 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable | help+service@ncsa.illinois.edu | COMPLETE |
2022-02-01 0800 | 2022-02-01 0900 | cilogon.org | Update to OA4MP v5.2.4 | Improvements in the back-end service | help@cilogon.org | COMPLETE |
2022-01-25 | 2022-01-25 | Facility UPS | Replace UPS batteries | All systems with facility UPS feed | rantissi@illinois.edu | COMPLETE |
2022-01-24 1800 | 2022-01-24 2000 | NCSA File & Print Servers | Scheduled Windows Server Maintenance | File & Print Shares will be unavailable during maintenance. Users will not be able to access shares on Fileserver (e.g. home, busnoff, hr, etc.), and printing will be unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2022-01-24 0400 | 2022-01-24 0630 | Failed line card on neo-hpc-1 switch | Line card failure is affecting devices that are plugged into the neo-hpc-1 aggregation switch. We've migrated links off the failed card to other ports on the same switch. | No services are currently impacted. | help+neteng@ncsa.illinois.edu | IN PROGRESS |
2022-01-19 0800 | 2022-01-19 2000 | ICC | ICC Quarterly Maintenance | All ICC services | | COMPLETE |
2022-01-18 0800 | 2022-01-18 0830 | cilogon.org | Upgrade MyProxy CA servers to CentOS 7 | Upgrade back-end MyProxy CA VMs from CentOS 6 to CentOS 7. No downtime is expected. | help@cilogon.org | COMPLETE |
2022-01-14 0600 | 2022-01-14 1715 | Business IT database had bad data. | A database that NCSA mirrors from campus changed without notice, breaking our MIS system. Business IT isolated the issue and corrected the data. | Multiple complex systems have been affected by this data corruption issue. | help+service@ncsa.illinois.edu | RESOLVED |
2022-01-14 0800 | 2022-01-14 1720 | NCSAnet wireless | NCSAnet Wireless was unavailable due to bad data in LDAP | Users couldn't connect to the NCSAnet wireless network | help+neteng@ncsa.illinois.edu | RESOLVED |
2022-01-05 1100 | 2022-01-05 1145 | CMDB | Version V1.5.20211223 release | CMDB database will be unavailable for a few moments; openDCIM will be unavailable for a few moments. | kimber7@illinois.edu | COMPLETE |
2021-12-20 1830 | 2021-12-20 2030 | Jira | Version Upgrade to address security issue | Jira will be unavailable | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-17 1300 | 2021-12-17 1340 | CMDB | Version V1.4.20211217 release | CMDB database will be unavailable for a few moments; openDCIM will not be affected. | | COMPLETE |
2021-12-17 0600 | 2021-12-17 0622 | NCSA GitLab | The server was updated with some new Puppet configurations. | GitLab services were unavailable for a few minutes as the SSL certificate for the service was updated. | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-16 1400 | 2021-12-16 1430 | HTTP web proxy: httpproxy.ncsa.illinois.edu | NCSA's general purpose HTTP web proxy server was rebuilt. | HTTP web proxying through httpproxy was unavailable. | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-10 0700 | 2021-12-10 1345 | iForge | InfiniBand switch maintenance | All systems unavailable | iforge-admin@lists.ncsa.illinois.edu | COMPLETE |
2021-12-10 0900 | 2021-12-10 1000 | Bastion Hosts (Production group B) | Patching out of cycle | Bastion Hosts (Production group B) were individually unavailable during reboot | help+security@ncsa.illinois.edu | COMPLETE |
2021-12-09 0900 | 2021-12-09 0931 | Bastion Hosts (Production group A) | Patching out of cycle | Bastion Hosts (Production group A) were individually unavailable during reboot | | COMPLETE |
2021-12-09 0800 | 2021-12-09 0900 | All IDDS services | IDDS Postgres and Ruby on Rails upgrades | All IDDS services | tolbert@illinois.edu | COMPLETE |
2021-12-09 0600 | 2021-12-09 0613 | NCSA GitLab | GitLab was updated to latest version | All GitLab services were unavailable for about 5 minutes | help+service@ncsa.illinois.edu | COMPLETE |
2021-12-07 1400 | 2021-12-07 1443 | LSST | Kubernetes on NTS is not working properly after updates | Kubernetes on NCSA Test Stand | lsst-admin@ncsa.illinois.edu | RESOLVED |
2021-12-07 0800 | 2021-12-07 1400 | LSST | LSST Quarterly Maintenance | All LSST services hosted at NCSA | lsst-admin@ncsa.illinois.edu | COMPLETE |
2021-12-07 0930 | 2021-12-07 1030 | ACHE Firewalls | software maintenance | Firewalls will be upgraded using fail over procedures - no traffic impact expected | James Eyrich - eyrich on slack | COMPLETE |
Legend:
IN PROGRESS
RESOLVED
SCHEDULED
MONITORING