...

We've heard from Internet2 that this is a widespread outage.
"The Internet2 network is experiencing widespread outages of unknown origin. This is causing frequent iBGP and LSP resets at various points across the footprint. Network engineers are engaged and actively investigating to isolate the source of the outage and will provide an update as soon as the issue is better understood. Internet2 is declaring a major incident and will share updates on a regular cadence"
Start | End | What System/Service is affected | What is happening? | What will be affected? | Contact Person | Status
2019-10-16 | ??? | DNS1 IPv6 reachability | Due to some switch issues, DNS1 IPv6 address is not reachable. DNS2 IPv6 address remains online. | | help+neteng@ncsa.illinois.edu |
2019-12-10 13:45 | ? | Internet2 Connectivity | Our connectivity to Internet2 over our 100G I2 circuit is currently intermittent. Networking is investigating. | Many different external resources, data transfers, sessions, etc. to various destinations may be unstable until this problem is resolved. | help+neteng@ncsa.illinois.edu |


Upcoming Scheduled Maintenance

...

Start | End | What System/Service was affected? | What happened? | What was affected? | Contact Person | Status
2019-12-10 13:45 | 2019-12-10 16:55 | Internet2 Connectivity | Internet2 Engineers isolated the issue to a malformed route update coming from an external peer to one of its nodes in Ashburn, VA. As this update was propagated throughout the Internet2 Network, it triggered a bug on the Internet2 routers and caused all internal BGP sessions of each router to rapidly flap, thus causing instability across the footprint. Engineers mitigated the issue by placing a filter on the specific peer to reject the malformed packet. The Major Incident has been resolved at this point. | Many different external resources, data transfers, sessions, etc. to various destinations. | help+neteng@ncsa.illinois.edu | Connectivity has stabilized. Please report any issues should they arise.

2019-12-02 | 2019-12-02 afternoon | Wireless network | Tech Services reports authentication issues affecting Wi-Fi and VPN. Engineers are working on the problem. Tech Services Issue Description. | NCSAnet and IllinoisNet wireless are non-functional at the moment. NCSA wired network remains available. IllinoisNet_guest is also functional. | help+neteng@ncsa.illinois.edu | Troubleshooting in progress
2019-11-14 18:00 | 2019-11-14 19:00 | Exit-West Router | Software upgrades | This should not impact users. All traffic will re-route via the other router. | help+neteng@ncsa.illinois.edu | Complete

2019-11-14 5:00 AM | 2019-11-14 3:30 PM | Nearline Endpoint | Issue with one storage library | Some Globus transfers were stalled for the period of the outage | bw+storage@ncsa.illinois.edu | Complete

Nov 7 10:00 | Nov 7 14:00 | ICCP. All login nodes will be down. | Reroute some IB cables between Core switches and compute nodes. Changing topology on Subnet Manager. | Scheduler will be paused. No user access to login nodes. All running jobs will be killed. | help@campuscluster.illinois.edu | Complete

2019-11-05 07:00 | 2019-11-05 16:53 | iForge | Quarterly Maintenance | All systems will be unavailable during the maintenance | iforge-admin@ncsa.illinois.edu | Complete

2019-10-01 | 2019-11-01 | NCSA Windows Domain Controllers | ITS migrated all Windows systems to the Campus Domain. The existing NCSA Windows Domain has been decommissioned and shut down. | NCSA Windows Systems | help+its@ncsa.illinois.edu | Complete

2019-10-23 8 a.m. | 2019-10-23 12:00 p.m. | Core-West | Code upgrades will be performed on Core-West network switch. | This should not be user impacting. All traffic will flow through the redundant Core. | neteng+help@ncsa.illinois.edu | Complete

2019-10-22 06:12 | 2019-10-22 07:18 | Jira and Wiki | During reboots for system patches, the wiki and Jira got stuck in a state that was not providing data to the users. | Only web access to these tools was impacted. | help+its@ncsa.illinois.edu | Complete

2019-10-16 08:00 | 2019-10-16 20:30 | ICC system wide | Quarterly maintenance | All services on ICC | help@campuscluster.illinois.edu | Complete

2019-10-16 8 a.m. | 2019-10-16 12:00 p.m. | Core-East | Code upgrades will be performed on Core-East network switch. | This should not be user impacting. All traffic will flow through the redundant Core. | neteng+help@ncsa.illinois.edu | Complete

2019-10-15 11:45 AM | 2019-10-15 11:56 AM | npcf-exit-east | BGP peering flapped over I2 AL2S circuit | Traffic got re-routed but some WAN services were impacted as reported by users. | help+neteng@ncsa.illinois.edu | Complete

2019-10-10 07:00 | 2019-10-10 07:30 | mysql.ncsa.illinois.edu | Some table repairs broke replication; this maintenance will update the replicas with newer databases so the service will work as expected again. | Wiki, JIRA, and some web sites will stop working. Email forwarding to user accounts at NCSA will be delayed during the outage. | lindsey@ncsa.illinois.edu | Complete

2019-10-01 | 2019-10-03 | NCSA-Print & Building Printers | Some printers are having issues connecting to the NCSA Print Server. After updating drivers on the print server, public printers are working as expected. | Printing | help+its@ncsa.illinois.edu | Complete

2019-10-03 6 AM | 2019-10-03 7:45 AM | Jira and Wiki | During reboots for system patches, the wiki and Jira got stuck in a state that was not providing data to the users. | Only web access to these tools was impacted. | help+its@ncsa.illinois.edu | Complete

2019-10-01 7 AM | 2019-10-01 8:30 PM | Blue Waters | NGA workload scheduled testing | Scheduler testing for NGA workload | David King | Complete

2019-10-01 10 AM | 2019-10-01 12:04 PM | Blue Waters | EPO: 4 racks lost xdp (cooling). CRAY warm swapped racks back into system successfully. | Scheduler, some computes missing and Gemini was rerouted | | Complete

2019-10-01 07:00 | 2019-10-01 07:30 | mysql.ncsa.illinois.edu | MySQL servers needed to be synchronized to convert the server in NPCF back to a replicated host. | Wiki, JIRA, and some web sites stopped working. Email forwarding to user accounts at NCSA was delayed during the outage. | lindsey@ncsa.illinois.edu | Complete


...