We have (or I'd better use the past tense, "we had") a master SQL 2012 SP4 Enterprise box replicating to a remote twin SQL box using a WSFC cluster and AlwaysOn functionality. Our SQL consultants set it up several months ago and said it was the best (and most cost-effective) HA scenario for our ERP system, and that our developers could also offload the main system by running reports against the remote read-only DBs. That looked great, yeah.
Last week the remote SQL server had a major crash (100% CPU usage, system barely usable). The master SQL server was bogged down too, and its Cluster service couldn't start at all... we had to reboot the secondary server to get a fully working master SQL server again... until a new crash (freeze) at the remote site several hours later caused the main system to crawl again...
So my question is: how can a passive server/copy take down the master SQL box? How can the Cluster service on the master SQL server be so fragile and unreliable? Is this normal behavior in an AlwaysOn HA scenario? Is it something related to SQL 2012? Is the system just immature?
Thank you!