Data Center Power Outage
Incident Report for OSG Consortium
Resolved
Services appear to have run well overnight - closing out the incident.
Posted Jun 27, 2021 - 17:47 UTC
Monitoring
All OSG production services have been restored from backups - we believe there is no data loss for these services.

All services appear to be functional, but we will leave the incident open for a few more hours to monitor the situation.
Posted Jun 27, 2021 - 00:04 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jun 26, 2021 - 21:31 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jun 26, 2021 - 21:06 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jun 26, 2021 - 21:05 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jun 26, 2021 - 21:03 UTC
Update
We are continuing to work on a fix for this issue.
Posted Jun 26, 2021 - 21:01 UTC
Update
The Kubernetes infrastructure has been rebuilt from backups. Systems are restoring and some services are coming back online.
Posted Jun 26, 2021 - 20:23 UTC
Identified
Power has been restored and the machines are back up. Administrative staff are working to bring the Kubernetes infrastructure, which serves the affected OSG central services, back online.
Posted Jun 25, 2021 - 18:51 UTC
Investigating
One of the data centers hosting OSG services experienced a power outage last night, making some services inaccessible. Administrative staff are currently repowering the affected hosts.
Posted Jun 25, 2021 - 15:43 UTC
This incident affected: Software Repositories (Yum Repos), Websites (Display, Topology), Hosted GlideinWMS (IGWN GWMS Frontend, JLAB GWMS Frontend, GLUEX GWMS Frontend, UCSD CMS GWMS Frontend, UCSD CMS VO Collector), and Hosted CEs (Hosted CE Infrastructure).