
This page is now deprecated.  For IT Services status updates, please watch the NCSA Status Home page.


Report a problem

Historical Outages

| Start | End | What happened? | What was affected? | Outcome |
| 8/9/2017 17:00 CST | 8/9/2017 20:00 CST | Windows servers were patched with the latest Windows updates, the "Require LDAP Signing" GPO was enabled, and LdapEnforceChannelBinding registry entries were created on domain controllers to address the CVE-2017-8563 security issue. | Domain controllers, print servers, and file servers may have been inaccessible during this time. | Servers are patched, running the proper security fixes, and functioning as normal. |
| 14:00 CST | 15:45 CST | CrashPlan was upgraded to 5.4.1 to address a critical bug in 5.4. | CrashPlan servers were momentarily unavailable as the service restarted. | CrashPlan will automatically push client updates containing the critical bug fix. |
| 12/2/2016 07:00 CST | 12/2/2016 07:14 CST | NCSA Jabber was upgraded to the latest version. | Jabber was unavailable for a few minutes while it was upgraded. | Jabber is now running the latest version and the OS has been upgraded. |
| 12/1/2016 9:00 AM CST | 12/1/2016 10:20 AM CST | CrashPlan was upgraded to 5.4 to address security issues. | CrashPlan servers were momentarily unavailable as the service restarted. | CrashPlan will automatically push client updates containing the security fixes. |
| 11/30/2016 4:30PM CST | 11/30/2016 8:30PM | RSA Authentication Manager was upgraded to 8.2p3. | RSA services failed over and the OTP self-service site was unavailable during the momentary restart. | Security vulnerabilities in RSA Authentication Manager have been patched. |
| 11/08/2016 03:00 CDT | 11/08/2016 10:30 CDT | One of our AFS file servers (papyrus) ran out of disk space. | Many websites were unavailable until free space was restored in AFS. While almost everything was returned to service by 07:30, two websites were offline until 10:25. | We restored some free space on the AFS server and all websites came back online. |
| 10/20/2016 11:30am | 10/20/16 11:55am | Fileserver experienced very slow speeds and, at times, no connectivity at all. | Users attempting to connect to and use files on fileserver.ncsa.illinois.edu. | Rebooting seems to have fixed the issue. |
| 10/6/16 ????? | 10/6/16 4:30 PM | A certificate in the root keystore expired and Jabber stopped. | Jabber was unavailable for new connections. | System and SSL cert stores were upgraded and the system was restarted. |
| 10/04/16 10:42 CDT | 10/04/16 10:48 CDT | The NCSA MySQL service restarted. | Many websites and other services, such as NCSA Wiki, JIRA, Bluewaters Portal, and Jabber, were unavailable while the service restarted. | MySQL is operating normally and services that use MySQL are working as expected. |
| 9/23/16 | 9/23/16 | RSA SecurID infrastructure was upgraded to include the latest security fixes. | Service was down for about 30 minutes during the update as the system rebuilt the appliance. | RSA is once again secure. |
| 8/29/2016 18:59 CDT | 8/30/2016 11:45 CDT | The SSL certificate used by NCSA LDAP expired. | Most authentication via NCSA LDAP failed. This included popular services such as NCSA Wiki, JIRA, Bluewaters Portal, and the Internal website. | A new certificate was installed and NCSA LDAP authentication via SSL is now working. |
| 8/24/2016 14:34 CDT | 8/24/2016 18:30 CDT | A host routing misconfiguration was triggered on one of the AFS file servers during routine disk maintenance. It took longer than expected to diagnose the issue. | Several websites were unavailable during the outage. | The websites were brought back online without issues. Two of the web servers were also upgraded with the latest packages, kernels, etc. |
| 8/18/2016 11:30 CDT | 8/18/2016 13:00 CDT | Fileserver experienced some connectivity issues. | Users attempting to connect to the fileserver were unable to do so (including HR and the Business Office). | Fileserver was rebooted and all shares except the Business Office share were brought back online. Extra steps were taken to get the Business Office share back online. |
| 7/28/2016 06:00 CDT | 7/28/2016 15:00 CDT | The NCSA POP server's disk filled up overnight. | Email delivery was delayed. | A new disk was added to the NCSA POP server and queued email resumed delivery. |
| 7/19/2016 12:00 CDT | 7/19/2016 12:02 CDT | NCSA's Confluence Wiki was offline for approximately 2 minutes. |  | The Confluence service was restarted and is now running normally. |
| 6/30/2016 10:00 CDT | 6/30/2016 10:07 CDT | NCSA's Confluence Wiki software was offline for approximately 7 minutes. |  | The Confluence service was restarted and is now running normally. |
| 6/29/16 11:15 CDT | 6/29/16 14:00 CDT | CrashPlan software was upgraded to version 5.2.1. | Backups were unavailable for a few minutes while servers were upgraded and clients had software updates pushed to them. | Fixes a critical data loss bug in the previous version. Client version is now 4.6.1. |
| 6/23/2016 | 6/24/2016 | Storage behind ITS vSphere was upgraded. | There should not have been any service impact. | EqualLogic PS arrays are now running version 9.0.0. |
| 6/15/2016 02:36 CDT | 6/15/2016 04:45 CDT | NCSA's MySQL server ran out of disk space. | Most services that use MySQL could not write data, causing outages. This included BlueWaters portal, wiki, JIRA, jabber, RT, and many other services. | Disk space was freed up and services are running normally. |
|  | 12:45 CDT | CrashPlan was upgraded to the latest release. | Backups were unavailable for a few minutes as servers restarted to upgrade their software. | A security-related issue was mitigated. |
| 5/31/2016 10:00 CDT | 5/31/2016 12:00 CDT | NCSA's JIRA was upgraded to a fresh OS and JIRA version. | JIRA was offline intermittently during the upgrade. | JIRA is upgraded and running normally. |
| 5/4/16 17:00 CDT | 5/4/16 19:20 CDT | Upgraded fileserver to Windows Server 2012 R2. | Fileserver was down for the data move. | The new server is in place; fileserver should be online. |
| 4/18/2016 08:00 CDT | 4/18/2016 11:15 CDT | The ITS VM Farm networking switch stack firmware needed to be upgraded and the switches rebooted to recover from a networking issue. | Most ITS servers and services were offline, including: websites, VMs, crashplan, OTP portal, printers, fileserver. BlueWaters portal, ldap, license servers, and XSEDE primary kerberos servers were offline briefly. 3 websites were down until approximately 1:30pm. | All ITS servers and services are now online. |
| 17:00:00 CST | 17:00:15 CST | LDAP service was restarted to add a new plugin (MemberOf). | Authorization services and pass-through to Kerberos were down for fifteen seconds. | memberOf queries now work. |
| 2/16/2016 13:50 CST | 2/17/2016 14:32 CST | NCSA Jabber service failed. | Jabber users could not authenticate due to the LDAP outage. | Jabber's security settings weren't working with the new LDAP service. We ended up upgrading Jabber to its latest version. |
| 2/16/2016 13:50 CST | 2/17/2016 11:15 CST | NCSA's LDAP service failed. | Many services, including RT, Jira, Wiki, Internal, Jabber, and BlueWaters portal, were not able to authenticate. | ITS ended up setting up a brand new LDAP server to replace the old one. |
| 1/18/2016 10:00 CST | 1/18/2016 10:53 CST | The ITS VM Farm networking switch stack was rebooted to recover from an unexpected networking issue preventing new routing tables from being created. | Most ITS servers and services were offline, including: kerberos, ldap, email, RT, Jira, wiki, AFS, websites, mysql, VMs, crashplan, OTP portal, jabber, printers, fileserver. BlueWaters portal, ldap, and license servers were offline. XSEDE primary kerberos servers were offline. | While there were various intermittent outages during the maintenance, all of these servers and services were unavailable during the switch reboot from 10:43-10:48 CST. All services were back online by 10:53 CST. |
| 12/17/2015 | 12/17/2015 | CrashPlan was patched with the latest maintenance release, 5.0.1 to 5.0.2. | Clients restarted their backup service after the servers pushed the client update out. | Upgraded CrashPlan service to 5.0.2. |
| 11/14/2015 08:25 CST | 11/14/2015 12:44 CST | The JIRA server was restarted unexpectedly. | Due to a VM issue it took a while to bring the server back online. | The JIRA server is back online after repairing its VM. |
| 11/4/2015 06:00 CST | 11/4/2015 08:50 CST | The AFS client on CentOS 7 servers broke after a kernel upgrade. | AFS and web services on the following servers broke: public-linux, cybergis, security, farmdoc, midwestbigdatahub, acipartnership, artcaonline, brainstormhpcd, & gecat. | The kernels on these systems were downgraded and the OpenAFS client was reinstalled for that kernel. |
| 9/23/2015 11:06 CDT | 9/23/2015 11:35 CDT | vSphere vCenter Server was upgraded to 6.0U1. | Web interface for NCSA's vSphere was unavailable during the upgrade. | Now running vSphere vCenter Server Appliance version 6.0U1. |
| 9/19/2015 02:54 CDT | 9/19/2015 10:40 CDT | The NCSA building experienced a major power outage, likely due to weather and lightning. | All networking and servers were offline during the outage. | Power was fully restored at approximately 8:40am. Most ITS services were online by 9:15am. A few remaining services (smtp, some websites, and jira) were restored by about 10:40am. |
|  | 15:22 CDT | CrashPlan servers became unresponsive in a way that wasn't detected or reported by the monitoring tools. Restarted the crashplan service on the 6 storage nodes and the master process on crashplan.ncsa.illinois.edu. | All CrashPlan storage nodes were unavailable. System backups were stalled while store points were unavailable. Restore attempts would result in failure. | Backups and restores are continuing to flow into and out of CrashPlan after restarting the services. |
| 6/29/2015 20:00 CDT | 6/29/2015 23:05 CDT | ITS network switch upgrade to add 64 10G baseT ports. | One ITS server was offline. Other ITS servers experienced two momentary outages of about 1 minute each when we moved to/from a temporary networking configuration. | All services have been restored. |
| 6/25/2015 02:37 CDT | 6/25/2015 07:30 CDT | Some power issues affected internal network connectivity and server availability. | Several VMs hosted by IT Services went into read-only mode. Email, kerberos, otp, wiki, and several other services were down. | Servers were rebooted and services restored. |
| 6/11/15 8:25 | 6/11/15 9:10 | CrashPlan was upgraded to 4.2.1. | CrashPlan backups were delayed for 15 minutes while each storage node upgraded to the latest code release. | Now running CrashPlanProE version 4.2.1 |
|  |  | CrashPlan was upgraded to 4.2. | CrashPlan backups were delayed while servers were upgraded to the latest release. Clients were then pushed an update. | Now running CrashPlanProE version 4.2 |
| 3/18/2015 12:15 | 3/18/2015 13:20 | One of the AFS file servers ran out of disk space unexpectedly. | Several websites became unresponsive waiting on AFS. | Space has been restored on the AFS file server and the server was restarted. All services were restored. |
| 2/25/15 6:05 | 2/25/15 8:50 | Storage server papyrus hung. System was restarted when staff arrived at the office. | Some AFS volumes were unavailable from around 6am until 8:50am. | Services were restored. Cause of hang is being investigated but nothing definitive has been found. |
| 1/21/15 13:30 | 1/21/15 17:05 | Some wiki pages were intentionally unavailable due to a critical security issue. | Wiki pages containing parentheses or curly braces in their title were blocked while we patched the security vulnerability. | Patch was applied to the wiki service and all pages are now available. |
| 1/14/15 17:30 | 1/14/15 18:45 | The public-linux VM's settings were edited for a minor change and the VM paused while waiting for a human. | The public-linux server was offline. | Located the hidden confirmation window and the server resumed normal operation. |
| 1/12/15 21:31 | 1/12/15 22:49 | NCSA's primary LDAP service died. | All LDAP services were offline. Users may not have been able to log in to the Wiki, Jira, public-linux, RT, and other services. | The LDAP service was restarted. |
| 12/22/14 08:00 | 12/22/14 10:00 | /var/forward NFS mount removed from pop server. | /var/forward is no longer accessible from servers, e.g. the public-linux server. | Users can no longer edit their .forward or procmail files on NCSA's pop server. Most mail rules should now happen in CITES Exchange. |
| 12/16/14 16:45 | 12/16/14 17:45 | CrashPlan Pro was upgraded to 4.1.6. | CrashPlan web console was unavailable for a few minutes. CrashPlan clients restarted after downloading their updates. | CrashPlan Pro infrastructure servers and clients were upgraded to the latest release. |
| 12/16/14 19:10 | 12/16/14 20:05 | RSA Authentication Manager servers were patched against additional security issues. | The self-service site was offline for about 10 minutes. | RSA system is protected against additional security vulnerabilities. Now running RSA Authentication Manager 8.1 SP1. |
| 12/12/14 11:15 | 12/12/14 12:10 | RSA Authentication Manager servers were patched against a security vulnerability. | The self-service site was offline for about 10 minutes. | RSA system is protected against additional security vulnerabilities. |
| 12/10/14 12:00 | 12/10/14 12:15 | Upgraded public-linux to CentOS EL 7. Reset all public-linux hostnames to be CNAMEs for public-linux. | Interactive server upgraded to the latest CentOS EL 7 and set to use accounts from LDAP. | New host in place. |
| 11/05/14 13:50 | 11/05/14 13:50 | Confluence wiki users that didn't really need access were purged to create more room to grow. | BW users who do not need access to the NCSA wiki have been removed. | NCSA wiki working as expected with fewer users. |
| 10/31/14 | n/a | The services hosted on the server went off-line due to system failure. | SVN & CVS repositories hosted on the server are being migrated to SVN on the server. | If you need access to your repository send email to and we will prioritize the conversion of your repository to |
| 10/27/14 15:24 | 10/27/14 15:32 | Confluence wiki was rebooted to resolve filesystem issues that started around 01:54am this morning. During that time users could not update file attachments on the wiki. | File attachments could not be saved. | Repaired filesystem; working as expected. |
| 10/10/14 | 10/13/14 17:45 | Beta service - ownCloud has been repaired. | Previous users may notice issues with their sync clients. | System is now using MySQL instead of sqlite3 and should be more reliable now. |
| 10/9/14 8:16 | 10/9/14 9:20 | RSA services were upgraded to mitigate the Shellshock security issue. | RSA OTP self-service | All RSA services are working normally again. If you had an issue during that time frame, please try again. |
| 10/7/14 22:03 | 10/7/14 22:14 | The master MySQL server was unavailable for approximately 10 minutes while it was updated to MySQL 5.5.40 and rebooted. |  | MySQL was successfully updated to MySQL 5.5.40. |
|  |  | There was a short unscheduled upgrade to the RSA infrastructure this morning to mitigate a serious security vulnerability. | All RSA OTP services | All RSA services are working normally again. If you had an issue during that time frame, please try again. |