Sunday, November 14, 2010

Windows Azure Instances Stuck in the Initializing, Busy or Stopping State

Although there could be many reasons for an instance to be stuck in a non-ready (Busy/Initializing/Stopping) state, I have run into one common (and silly) reason: the number of instances that you have configured in the portal (or through the "ServiceConfiguration.cscfg" file).

I spent about an hour uploading (and of course deleting) the exact same Azure package with a single instance each of the web and worker roles. Sometimes it would upload and put the instances in the Ready state right away, and sometimes it would just get stuck in the Initializing or Busy state. I started seeing this behaviour in the last few months, and it seems to have become more frequent as Azure adoption has increased. It also seems to happen more often during the daytime (I am hypothesizing here), when there are probably more people trying to upload their packages to the Azure portal.

It turns out that, as a dialog box warns when you upload a package through the Windows Azure portal, Microsoft doesn't guarantee any SLA if you are running only a single instance of either the web or worker role. Beyond that, I am almost certain (after wasting an hour looking for any other logical reason) that they have put some logic in the load balancer that de-prioritizes single-instance configurations in favour of multi-instance configurations during package upload. So, if you are short on time (or want to save some) during package upload and instantiation on the Azure portal, I would recommend changing the number of instances to at least 2, even during the develop-deploy-test cycle. This costs a little more (two instances versus one), but I would say it's definitely worth it.
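One way to catch a single-instance role before uploading is to scan the configuration file. Below is a minimal sketch in Python, assuming the usual ServiceConfiguration.cscfg layout (a "Role" element containing an "Instances" element with a "count" attribute) and the namespace shown in the code; check both against your own file. The function name, the service name and the role names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for ServiceConfiguration.cscfg; verify against the xmlns in your file.
NS = {"sc": "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"}

def roles_below_minimum(cscfg_text, minimum=2):
    """Return {role name: instance count} for roles with fewer than `minimum` instances."""
    root = ET.fromstring(cscfg_text)
    low = {}
    for role in root.findall("sc:Role", NS):
        instances = role.find("sc:Instances", NS)
        # A role with no Instances element is treated as having zero instances.
        count = int(instances.get("count")) if instances is not None else 0
        if count < minimum:
            low[role.get("name")] = count
    return low

# Hypothetical configuration with one web role and one worker role.
SAMPLE = """<ServiceConfiguration serviceName="MyService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebRole1"><Instances count="1" /></Role>
  <Role name="WorkerRole1"><Instances count="2" /></Role>
</ServiceConfiguration>"""

if __name__ == "__main__":
    for name, count in roles_below_minimum(SAMPLE).items():
        print(f"{name} has only {count} instance(s); consider at least 2")
```

With the sample above, only WebRole1 is flagged, since WorkerRole1 already has two instances.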

Bottom line: since the Microsoft availability SLA (99.95%) does not apply to single-instance deployments, keeping at least two instances of each role will save a lot of time and headache.
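To put the 99.95% figure in perspective, here is a small back-of-the-envelope calculation in Python. It is only a sketch: the 30-day month is my assumption, the function name is made up, and the actual SLA terms define how availability is measured.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted in `days` days at the given availability percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100

if __name__ == "__main__":
    minutes = allowed_downtime_minutes(99.95)
    print(f"99.95% over 30 days allows about {minutes:.1f} minutes of downtime")
```

That works out to roughly 21.6 minutes a month for a covered deployment, whereas a single-instance deployment has no such commitment at all.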

On a side note, there are several other benefits to maintaining at least two instances. For example, a single instance (and therefore the entire role) becomes unavailable whenever that role instance is restarted by Azure. Azure can restart a role instance for many different reasons, including:
a)      The role instance is being recycled by the load balancer
b)      Load balancer issues
c)      The instance is being upgraded to a new OS version
d)      The instance is being rebooted to apply a patch or to resolve some other issue

The following blog post from Toddy has a good list of other issues that may cause your roles to be stuck in the Initializing, Busy or Stopping state: http://blog.toddysm.com/2010/01/windows-azure-deployment-stuck-in-initializing-busy-stopping-why.html

Hope this helps!
Piyush