There was a "major" power failure today, from approximately 11:30 through to 12:45, due to technical problems with Nova Scotia Power.

During this time, all POLAN server resources were offline and inaccessible.

Following power restoration, a number of services were slow to be brought back online, due to the large number of manual kludges and tweaks required, i.e., personal hands-on intervention by TDC.

Services which are known to have been slow in restoration include:

  • MATLAB license server (Jade failed to automatically power on after power was restored)
  • UPS power monitoring (Hydra failed to start this service)
  • PC Cluster nodes - failed to power on after power was restored
  • PC Cluster Sun Grid Engine - failed to start because NFS failed to start on Birkenhead, the cluster controller
  • Shackle, the backup server - required a power-button push to turn back on

As of approximately 3:00pm, I believe everything is "back to normal and working properly".

If you believe anything is still broken at this time, please do let me know ASAP.