All Systems Operational
OSG Connect: Operational (99.96% uptime over the past 90 days)
Login 4: Operational (100.0% uptime over the past 90 days)
Login 5: Operational (100.0% uptime over the past 90 days)
Stash Filesystem: Operational (99.84% uptime over the past 90 days)
OSG GlideinWMS Frontend: Operational (100.0% uptime over the past 90 days)
StashCache: Operational (99.76% uptime over the past 90 days)
StashCache Redirector: Operational (100.0% uptime over the past 90 days)
CVMFS Synchronization: Operational (100.0% uptime over the past 90 days)
Data Federation Accounting Service: Operational (99.3% uptime over the past 90 days)
Hosted CEs: Operational (99.35% uptime over the past 90 days)
Hosted CE Infrastructure: Operational (99.35% uptime over the past 90 days)
Message Bus: Operational (100.0% uptime over the past 90 days)
GlideinWMS Factory: Operational (100.0% uptime over the past 90 days)
OASIS: Operational (100.0% uptime over the past 90 days)
Network Monitoring Pipeline: Operational (100.0% uptime over the past 90 days)
Software Repositories: Operational (99.81% uptime over the past 90 days)
Yum Repos: Operational (99.63% uptime over the past 90 days)
GridCF Repo: Operational (100.0% uptime over the past 90 days)
Accounting: Operational (99.3% uptime over the past 90 days)
GRACC Frontend: Operational (99.3% uptime over the past 90 days)
GRACC Backend: Operational (99.3% uptime over the past 90 days)
GRACC APEL Reporting: Operational (99.3% uptime over the past 90 days)
Websites: Operational (99.73% uptime over the past 90 days)
Display: Operational (99.25% uptime over the past 90 days)
Main Website: Operational (100.0% uptime over the past 90 days)
DNS: Operational (100.0% uptime over the past 90 days)
OSGConnect Website: Operational (100.0% uptime over the past 90 days)
Topology: Operational (99.41% uptime over the past 90 days)
Hosted Submit: Operational (100.0% uptime over the past 90 days)
Hosted Submit Infrastructure: Operational (100.0% uptime over the past 90 days)
Hosted GlideinWMS: Operational (99.25% uptime over the past 90 days)
IGWN GWMS Frontend: Operational (99.25% uptime over the past 90 days)
JLAB GWMS Frontend: Operational (99.25% uptime over the past 90 days)
GLUEX GWMS Frontend: Operational (99.25% uptime over the past 90 days)
UCSD CMS GWMS Frontend: Operational (99.25% uptime over the past 90 days)
UCSD CMS VO Collector: Operational (99.25% uptime over the past 90 days)
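To put the uptime figures above in perspective, a percentage over a 90-day window can be converted into an approximate amount of downtime. The following is a minimal sketch, not part of the status page itself: the function name `downtime` and the constant `WINDOW` are hypothetical, and because the published percentages are rounded, the results are estimates only.

```python
from datetime import timedelta

# Assumption: every uptime figure covers the same 90-day window.
WINDOW = timedelta(days=90)


def downtime(uptime_percent: float, window: timedelta = WINDOW) -> timedelta:
    """Estimate the downtime implied by an uptime percentage over a window."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The fraction of the window that was NOT up, scaled to the window length.
    return window * ((100.0 - uptime_percent) / 100.0)


if __name__ == "__main__":
    # Figures copied from the component list above.
    for name, pct in [("OSG Connect", 99.96), ("Hosted GlideinWMS", 99.25)]:
        minutes = downtime(pct).total_seconds() / 60
        print(f"{name}: about {minutes:.0f} minutes of downtime in 90 days")
```

Under these assumptions, 99.96% corresponds to roughly 52 minutes of downtime and 99.25% to roughly 16 hours.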
Status levels: Operational, Degraded Performance, Partial Outage, Major Outage, Maintenance
Past Incidents
Nov 29, 2021

No incidents reported today.

Nov 28, 2021

No incidents reported.

Nov 27, 2021
Resolved - A sufficient number of hosts have been rebooted to restore services; marking incident as closed. We will continue to monitor over the weekend to check for services that are unstable.
Nov 27, 19:07 UTC
Investigating - The datacenter at UW-Madison hosting several OSG services lost cooling capacity overnight (starting at approximately 12:30am), resulting in several hosts going offline at 2:00am.

Services are currently being restarted.
Nov 27, 15:46 UTC
Nov 26, 2021

No incidents reported.

Nov 25, 2021

No incidents reported.

Nov 24, 2021

No incidents reported.

Nov 23, 2021

No incidents reported.

Nov 22, 2021

No incidents reported.

Nov 21, 2021

No incidents reported.

Nov 20, 2021
Resolved - This incident has been resolved.
Nov 20, 03:45 UTC
Monitoring - Cooling returned to the data center at approximately 6:00PM Central. Over the last two hours, we've been inspecting the cluster worker nodes and restarting the infrastructure. As of 8:30PM Central, services are beginning to be restored. We expect the last services to be restored over the next 15 minutes.

The facilities team has notified us there will be another short follow-up cooling outage at 7:00am Central on Monday of up to 2 hours in order to finalize the maintenance performed today.
Nov 20, 03:01 UTC
Identified - We have restored the yum repo mirror list, which will allow yum installations and updates to succeed.
Nov 19, 18:49 UTC
Investigating - During maintenance on the cooling systems in the UW-Madison machine room, the temporary cooling system has failed to provide the expected capacity. This has resulted in several hosts automatically shutting down due to temperature alarms and a number of unplanned service outages.

We are investigating whether any services can be brought back up safely; if not, the maintenance is expected to finish at 5:00PM Central today.
Nov 19, 14:23 UTC
Nov 19, 2021
Nov 18, 2021

No incidents reported.

Nov 17, 2021

No incidents reported.

Nov 16, 2021

No incidents reported.

Nov 15, 2021

No incidents reported.
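
The incident updates above are stamped in UTC, while their text refers to Central time. The following minimal sketch converts a stamp such as "Nov 20, 03:01 UTC" to Central time so the two can be compared. It is illustrative only: the function name `utc_to_central` is hypothetical, the year is assumed to be 2021 (taken from the date headings), and a fixed UTC-6 offset is assumed, which holds for Central Standard Time in late November but not during daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Central Standard Time (UTC-6), valid for late November 2021.
CST = timezone(timedelta(hours=-6), "CST")


def utc_to_central(stamp: str, year: int = 2021) -> datetime:
    """Parse a stamp like 'Nov 20, 03:01 UTC' and return it in CST."""
    # The stamps omit the year, so it is supplied separately.
    parsed = datetime.strptime(f"{year} {stamp}", "%Y %b %d, %H:%M UTC")
    return parsed.replace(tzinfo=timezone.utc).astimezone(CST)


if __name__ == "__main__":
    for stamp in ["Nov 19, 14:23 UTC", "Nov 20, 03:01 UTC", "Nov 20, 03:45 UTC"]:
        local = utc_to_central(stamp)
        print(f"{stamp} -> {local:%b %d, %I:%M %p} {local.tzname()}")
```

Under these assumptions, the update stamped Nov 20, 03:01 UTC was posted at 9:01 PM Central on Nov 19, which is consistent with its reference to services being restored as of 8:30PM Central.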