There was a "major" power failure today, from approximately 11:30 through to 12:45, due to technical problems with Nova Scotia Power.
During this time, all POLAN server resources were offline and inaccessible.
Following power restoration, a number of services were slow to come back online, due to the large number of manual kludges and tweaks required, i.e., TDC personal hands-on intervention work.
Services which are known to have been slow to be restored include:
- MATLAB license server (Jade failed to automatically power on after power was restored)
- UPS power monitoring (Hydra failed to start this service)
- PC Cluster nodes - failed to power on after power was restored
- PC Cluster SunGrid Engine - failed to start because NFS failed to start on Birkenhead, the cluster controller
- Shackle, the backup server - required a power-button push to turn back on
As of approximately 3:00pm, I believe everything is "back to normal and working properly".
If you believe anything is still broken at this time, please do let me know ASAP.