Sunday, November 14, 2010

Windows Azure Instances Stuck in the Initializing, Busy or Stopping State

Although there can be many reasons for an instance to be stuck in a non-ready (Initializing/Busy/Stopping) state, I have run into one common (and silly) reason: the number of instances that you have configured in the portal (or through the "ServiceConfiguration.cscfg" file).

I spent about an hour uploading (and of course deleting) the exact same Azure package with a single instance of the web and worker role. Sometimes it would upload and put the instances in the Ready state right away, and sometimes it would just get stuck in the Initializing or Busy state. I started experiencing this behaviour in the last few months, and it seems to have become more frequent as Azure adoption has increased. It also seems more frequent during the daytime (I am hypothesizing here), when there are probably more people trying to upload their packages to the Azure portal.

It turns out (as a dialog box warns when you upload the package to the Windows Azure portal) that Microsoft doesn't guarantee any SLA if you are running only a single instance of either the web or the worker role. In addition, I am almost certain (after wasting an hour looking for any other logical reason) that they have put some logic in the load balancer that de-prioritizes single-instance configurations in favour of multi-instance configurations during package upload. So, if you are short on time (or want to save some time) during package upload and instantiation on the Azure portal, I would recommend changing the number of instances to at least 2, even during the develop-deploy-test cycle. This costs a little more (two instances instead of one), but I would say it's definitely worth it.

Bottom line: since the Microsoft availability SLA (99.95%) does not apply to single-instance deployments, keeping at least two instances of your role will save a lot of time and headache.
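
If you keep the instance count in "ServiceConfiguration.cscfg", the change is a one-attribute edit on each role's Instances element. Here is a minimal sketch of a pre-deployment check that raises every role to at least two instances. The sample configuration, the role names and the helper names (ensure_min_instances, instance_counts) are made up for illustration; your file will have more settings.

```python
import xml.etree.ElementTree as ET

# XML namespace used by ServiceConfiguration.cscfg files.
CSCFG_NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"

# Hypothetical sample configuration, for illustration only.
SAMPLE_CSCFG = f"""<ServiceConfiguration serviceName="MyService" xmlns="{CSCFG_NS}">
  <Role name="WebRole1">
    <Instances count="1" />
  </Role>
  <Role name="WorkerRole1">
    <Instances count="3" />
  </Role>
</ServiceConfiguration>"""


def ensure_min_instances(cscfg_text, minimum=2):
    """Return the config text with every role raised to at least `minimum` instances."""
    ET.register_namespace("", CSCFG_NS)  # keep the default namespace on output
    root = ET.fromstring(cscfg_text)
    for instances in root.iter(f"{{{CSCFG_NS}}}Instances"):
        if int(instances.get("count", "0")) < minimum:
            instances.set("count", str(minimum))
    return ET.tostring(root, encoding="unicode")


def instance_counts(cscfg_text):
    """Map each role name to its configured instance count."""
    root = ET.fromstring(cscfg_text)
    counts = {}
    for role in root.iter(f"{{{CSCFG_NS}}}Role"):
        instances = role.find(f"{{{CSCFG_NS}}}Instances")
        counts[role.get("name")] = int(instances.get("count"))
    return counts


if __name__ == "__main__":
    print(instance_counts(ensure_min_instances(SAMPLE_CSCFG)))
```

Roles that already have two or more instances are left alone, so the script is safe to run before every upload.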

On a side note, there are several other benefits to maintaining at least two instances. For example, a single instance (and therefore the entire role) becomes unavailable whenever the role instance is restarted by Azure. Azure can restart a role instance for many different reasons, including:
a)      Role instance is being recycled by the load balancer
b)      Load balancer issues
c)      Being upgraded to a new OS version
d)      Being rebooted to apply a patch or to resolve some other issue

The following blog post from Toddy has a good list of other issues that may be causing your roles to be stuck in the Initializing, Busy or Stopping state: http://blog.toddysm.com/2010/01/windows-azure-deployment-stuck-in-initializing-busy-stopping-why.html

Hope this helps!
Piyush

Sunday, November 07, 2010

SQL Azure Federation: Horizontal Scaling in the Cloud!!

One of the concerns I hear a lot about Azure is the need for users to select the DB size when signing up for a SQL Azure instance. The maximum DB size that you can sign up for is currently 50 GB, and that makes a lot of people worried about the scalability of SQL Azure. While 50 GB may be good enough for most small and medium-sized web applications, it's nowhere near what many large websites, LOB applications and data warehouses need.

Microsoft's solution to this problem was something called "sharding". Sharding is a technique that has been around for a long time and supports horizontal partitioning of databases. Essentially, it requires you to create a bunch of SQL Azure DBs, treat each one as a separate partition ("shard"), and programmatically direct your query to the correct one. If it's a complex query, you have to do the hard work of breaking it up based on your partition key, redirecting each piece to the right partition, and merging the results back.
Here is one article that explains how to scale out SQL Azure using horizontal partitioning. The partitioning logic is implemented in the Data Access Layer using LINQ. Painful!!
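
To give a feel for the hand-written routing described above, here is a minimal sketch in Python. It is not the code from that article: in-memory SQLite databases stand in for separate SQL Azure databases, the partition key is a hypothetical customer_id, and the modulo routing rule and all function names are my own assumptions.

```python
import sqlite3


def make_shards(n):
    """Create n independent databases, each holding one partition of `orders`."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")  # stand-in for one SQL Azure database
        conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
        shards.append(conn)
    return shards


def shard_for(shards, customer_id):
    """Pick the shard that owns this partition key (simple modulo rule)."""
    return shards[customer_id % len(shards)]


def insert_order(shards, customer_id, amount):
    shard_for(shards, customer_id).execute(
        "INSERT INTO orders VALUES (?, ?)", (customer_id, amount)
    )


def orders_for_customer(shards, customer_id):
    """Single-shard query: the partition key tells us where to look."""
    rows = shard_for(shards, customer_id).execute(
        "SELECT amount FROM orders WHERE customer_id = ? ORDER BY amount",
        (customer_id,),
    ).fetchall()
    return [row[0] for row in rows]


def total_sales(shards):
    """Cross-shard query: run it on every shard, then merge the partial results."""
    return sum(
        conn.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()[0]
        for conn in shards
    )
```

Even in this toy version, every query has to know whether it touches one shard or all of them, and the merge step has to be written by hand for each kind of aggregate.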

Although this solution works, expecting developers to write their own logic to manage partitions, redirect queries, etc. is a bit much. Since most on-premises DBs provide this feature out of the box, it was a big hurdle to SQL Azure adoption by large companies. Microsoft unveiled a much more elegant solution during PDC.
It's called "SQL Azure Federation". It is planned for release in early 2011.

SQL Azure will provide support for explicit horizontal partitioning, complete with new T-SQL keywords and commands like CREATE/USE/ALTER FEDERATION and CREATE TABLE...FEDERATE ON. Once you have set up the right federation key, redirecting queries to the correct "shard" is taken care of by the SQL Azure engine.
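
Since the feature is not released yet, I can only guess at the mechanics. The sketch below is a conceptual model of how an engine could map a federation key to a member and split a member when it grows too large. It assumes range-based partitioning on a non-negative integer key; the class, method and member names are hypothetical, and none of this is the actual SQL Azure implementation or its T-SQL syntax.

```python
import bisect


class RangeFederation:
    """Toy model: each member owns a contiguous range of federation key values."""

    def __init__(self):
        # boundaries[i] is the lowest key owned by members[i].
        # Start with a single member that covers every key >= 0.
        self.boundaries = [0]
        self.members = ["member_0"]

    def member_for(self, key):
        """Route a key to the member whose range contains it."""
        if key < 0:
            raise ValueError("keys are assumed to be non-negative")
        index = bisect.bisect_right(self.boundaries, key) - 1
        return self.members[index]

    def split_at(self, key):
        """Split the member that owns `key`; keys >= key move to a new member."""
        if key in self.boundaries:
            raise ValueError(f"{key} is already a boundary")
        index = bisect.bisect_right(self.boundaries, key)
        self.boundaries.insert(index, key)
        self.members.insert(index, f"member_{len(self.members)}")
```

The point is that the lookup table lives in one place: the application only supplies the key value, and routing keeps working after a split without any change to application code.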

This new feature makes the 50 GB size limitation almost irrelevant for most data storage requirements. Coupled with the elastic provisioning nature of the cloud, it will make SQL Azure a compelling alternative for the many organizations out there that are dealing with large datasets and scalability issues. I, for one, cannot wait to try this out.

Here is the actual session by Lev Novik from PDC, titled "Building Scale-Out Database Solutions on SQL Azure":

Cheers!!