
Wednesday, 6 November 2013

Azure SDK 2.2 Features & Migration

 

Brief Synopsis

SDK 2.2 is not a major upgrade, but it brings two features that have been big asks until now: remote debugging of cloud services, and the Windows Azure Management Libraries from a developer's perspective. It also introduces partitioned queues and topics in Windows Azure Service Bus, which helps availability: until now each queue or topic was assigned to a single message broker, which was a single point of failure; with the new feature, multiple message brokers are assigned to a queue or topic. Please read the Q&A at the end for the nuances, and for the approach to moving to Azure SDK 2.2 from a generic project standpoint.

At a high level, these are the new features:

  • Visual Studio 2013 Support
  • Integrated Windows Azure Sign-In support within Visual Studio
  • Remote Debugging Cloud Services with Visual Studio – Very Relevant to Developers
  • Firewall Management support within Visual Studio for SQL Databases
  • Visual Studio 2013 RTM VM Images for MSDN Subscribers
  • Windows Azure Management Libraries for .NET – Very Relevant to Deployment Team
  • Updated Windows Azure PowerShell Cmdlets and ScriptCenter
  • Topology Blast – Relevant to Deployment Team
  • Windows Azure Service Bus – partition queues and topics across multiple message brokers – Relevant to Developers. All Service Bus based projects should move ASAP.

Only the highlighted areas are covered below.

Remote Debugging Cloud Resources within Visual Studio

Today’s Windows Azure SDK 2.2 release adds support for remote debugging many types of Windows Azure resources. With live, remote debugging support from within Visual Studio, you now have more visibility than ever before into how your code operates live in Windows Azure. Let’s walk through how to enable remote debugging for a Cloud Service:

Remote Debugging of Cloud Services

Note: To debug a web or worker role remotely, it must be built against Azure SDK 2.2.

To enable remote debugging for your cloud service, select Debug as the Build Configuration on the Common Settings tab of your Cloud Service’s publish dialog wizard:

[Screenshot: publish wizard – Common Settings tab with the Debug build configuration selected]

Then click the Advanced Settings tab and check the Enable Remote Debugging for all roles checkbox:

[Screenshot: Advanced Settings tab with “Enable Remote Debugging for all roles” checked]

Once your cloud service is published and running live in the cloud, simply set a breakpoint in your local source code:

[Screenshot: a breakpoint set in local source code]

Then use Visual Studio’s Server Explorer to select the Cloud Service instance deployed in the cloud, and use the Attach Debugger context menu on the role or on a specific VM instance of it:

[Screenshot: Server Explorer – Attach Debugger context menu on a role instance]

Once the debugger attaches to the Cloud Service, and a breakpoint is hit, you’ll be able to use the rich debugging capabilities of Visual Studio to debug the cloud instance remotely, in real-time, and see exactly how your app is running in the cloud.

[Screenshot: Visual Studio debugging the cloud instance with a breakpoint hit]

Today’s remote debugging support is super powerful, and makes it much easier to develop and test applications for the cloud.  Support for remote debugging Cloud Services is available as of today, and we’ll also enable support for remote debugging Web Sites shortly.

Windows Azure Management Libraries for .NET (Preview) – Automation beyond PowerShell

Windows Azure Management Libraries are in Preview!

What do the Azure Management Libraries provide? The creation, deployment, and tear-down of resources, which previously could only be scripted at the PowerShell level, can now be driven from code.

Having the ability to automate the creation, deployment, and tear down of resources is a key requirement for applications running in the cloud.  It also helps immensely when running dev/test scenarios and coded UI tests against pre-production environments.

These new libraries make it easy to automate tasks using any .NET language (e.g. C#, VB, F#, etc).  Previously this automation capability was only available through the Windows Azure PowerShell Cmdlets or to developers who were willing to write their own wrappers for the Windows Azure Service Management REST API.

Modern .NET Developer Experience

We’ve worked to design easy-to-understand .NET APIs that still map well to the underlying REST endpoints, making sure to use and expose the modern .NET functionality that developers expect today:

  • Portable Class Library (PCL) support targeting applications built for any .NET Platform (no platform restriction)
  • Shipped as a set of focused NuGet packages with minimal dependencies to simplify versioning
  • Support async/await task based asynchrony (with easy sync overloads)
  • Shared infrastructure for common error handling, tracing, configuration, HTTP pipeline manipulation, etc.
  • Factored for easy testability and mocking
  • Built on top of popular libraries like HttpClient and Json.NET

Below is a list of a few of the management client classes that are shipping with today’s initial preview release:

  • ManagementClient – Locations, Credentials, Subscriptions, Certificates
  • ComputeManagementClient – Hosted Services, Deployments, Virtual Machines, Virtual Machine Images & Disks
  • StorageManagementClient – Storage Accounts
  • WebSiteManagementClient – Web Sites, Web Site Publish Profiles, Usage Metrics, Repositories
  • VirtualNetworkManagementClient – Networks, Gateways

Each class supports operations for the listed assets, and potentially more.

Automating Creating a Virtual Machine using .NET

Let’s walkthrough an example of how we can use the new Windows Azure Management Libraries for .NET to fully automate creating a Virtual Machine. I’m deliberately showing a scenario with a lot of custom options configured – including VHD image gallery enumeration, attaching data drives, network endpoints + firewall rules setup - to show off the full power and richness of what the new library provides.

We’ll begin with some code that demonstrates how to enumerate through the built-in Windows images within the standard Windows Azure VM Gallery.  We’ll search for the first VM image that has the word “Windows” in it and use that as our base image to build the VM from.  We’ll then create a cloud service container in the West US region to host it within:

[Screenshot: code – enumerating the VM gallery for a “Windows” image and creating the cloud service]

We can then customize some options on it such as setting up a computer name, admin username/password, and hostname.  We’ll also open up a remote desktop (RDP) endpoint through its security firewall:

[Screenshot: code – setting the computer name, admin credentials, and opening an RDP endpoint]

We’ll then specify the VHD host and data drives that we want to mount on the Virtual Machine, and specify the size of the VM we want to run it in:

[Screenshot: code – specifying the OS and data VHDs and the VM size]

Once everything has been set up, the call to create the virtual machine is executed asynchronously:

[Screenshot: code – the asynchronous deployment-creation call]

In a few minutes we’ll then have a completely deployed VM running on Windows Azure with all of the settings (hard drives, VM size, machine name, username/password, network endpoints + firewall settings) fully configured and ready for us to use:

[Screenshot: the fully configured VM running in the Windows Azure portal]
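Since the code screenshots may not survive reproduction, here is a hedged C# sketch of the same flow against the 2013 preview of the Management Libraries. Class and member names reflect that preview and may differ in later releases; the subscription ID, certificate, credentials, and storage URLs are placeholders:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.Management.Compute;
using Microsoft.WindowsAzure.Management.Compute.Models;

class CreateVmSample
{
    static async Task RunAsync()
    {
        // Authenticate with a management certificate (placeholder values).
        var credentials = new CertificateCloudCredentials(
            "<subscription-id>", new X509Certificate2("management.cer"));

        using (var compute = new ComputeManagementClient(credentials))
        {
            // 1. Enumerate the VM gallery; take the first "Windows" image.
            var images = await compute.VirtualMachineImages.ListAsync();
            var image = images.First(i => i.Label.Contains("Windows"));

            // 2. Create the cloud service container in West US.
            await compute.HostedServices.CreateAsync(
                new HostedServiceCreateParameters
                {
                    ServiceName = "mycloudsvc",
                    Location = "West US"
                });

            // 3. Provisioning config: machine name + admin credentials.
            var windowsConfig = new ConfigurationSet
            {
                ConfigurationSetType =
                    ConfigurationSetTypes.WindowsProvisioningConfiguration,
                ComputerName = "myvm",
                AdminUserName = "<admin-user>",
                AdminPassword = "<admin-password>"
            };

            // 4. Network config: open an RDP endpoint through the firewall.
            var networkConfig = new ConfigurationSet
            {
                ConfigurationSetType = ConfigurationSetTypes.NetworkConfiguration
            };
            networkConfig.InputEndpoints.Add(new InputEndpoint
            {
                Name = "RDP", Protocol = "tcp", Port = 3389, LocalPort = 3389
            });

            // 5. OS disk built from the gallery image, plus one data disk.
            var role = new Role
            {
                RoleName = "myvm",
                RoleSize = "Small",
                RoleType = "PersistentVMRole",
                OSVirtualHardDisk = new OSVirtualHardDisk
                {
                    SourceImageName = image.Name,
                    MediaLink = new Uri(
                        "http://<storage-account>.blob.core.windows.net/vhds/myvm-os.vhd")
                }
            };
            role.ConfigurationSets.Add(windowsConfig);
            role.ConfigurationSets.Add(networkConfig);
            role.DataVirtualHardDisks.Add(new DataVirtualHardDisk
            {
                LogicalDiskSizeInGB = 100,
                MediaLink = new Uri(
                    "http://<storage-account>.blob.core.windows.net/vhds/myvm-data.vhd")
            });

            // 6. Kick off the VM creation asynchronously.
            var deployment = new VirtualMachineCreateDeploymentParameters
            {
                Name = "mydeployment",
                Label = "mydeployment",
                DeploymentSlot = DeploymentSlot.Production
            };
            deployment.Roles.Add(role);
            await compute.VirtualMachines.CreateDeploymentAsync(
                "mycloudsvc", deployment);
        }
    }
}
```

No test is attached: the sketch requires a live subscription and the preview NuGet packages, so treat it as a map of the walkthrough rather than verified code.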

Topology Blast

This new functionality will allow Windows Azure to communicate topology changes to all instances of a service at one time instead of walking upgrade domains. This feature is exposed via the topologyChangeDiscovery setting in the Service Definition (.csdef) file and the Simultaneous* events and classes in the Service Runtime library.
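The setting lives on the root element of the service definition. A sketch of a .csdef enabling it – the schema version shown is the one I believe shipped with SDK 2.2, and `UpgradeDomainWalk` is the value that restores the old behaviour:

```xml
<ServiceDefinition name="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"
    topologyChangeDiscovery="Blast"
    schemaVersion="2013-10.2.2">
  <WebRole name="MyWebRole" vmsize="Small">
    <!-- With Blast, all instances receive the Simultaneous* topology-change
         events at once instead of being walked upgrade domain by domain. -->
  </WebRole>
</ServiceDefinition>
```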

Windows Azure Service Bus – partition queues and topics across multiple message brokers

Service Bus employs multiple message brokers to process and store messages, but until now each queue or topic has been assigned to exactly one message broker. This mapping has the following drawbacks:

  • The message throughput of a queue or topic is limited to the messaging load a single message broker can handle.
  • If a message broker becomes temporarily unavailable or overloaded, all entities assigned to that broker are unavailable or experience low throughput.

Partitioned queues and topics remove both limitations by spreading a single entity across multiple message brokers.
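With the 2.2-era Service Bus client library, partitioning is opted into when the entity is created. A minimal sketch, assuming the `EnablePartitioning` property on `QueueDescription` (the connection string is a placeholder):

```csharp
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class PartitionedQueueSetup
{
    static void Main()
    {
        var ns = NamespaceManager.CreateFromConnectionString(
            "<service-bus-connection-string>"); // placeholder

        var qd = new QueueDescription("orders")
        {
            // Spreads the queue across multiple message brokers, so a single
            // overloaded or unavailable broker no longer caps throughput or
            // takes the whole entity down.
            EnablePartitioning = true
        };

        if (!ns.QueueExists(qd.Path))
            ns.CreateQueue(qd);
    }
}
```

Note that partitioning must be chosen at creation time; it cannot be switched on for an existing queue, which is why the post says Service Bus projects should plan the move early. No test is attached, as the sketch needs a live Service Bus namespace.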

Q&A

Q. Can I use Azure SDK 2.2 to debug a web role or worker role built with an earlier SDK?

A. No, you need to migrate your roles to SDK 2.2. For older roles, Visual Studio with SDK 2.2 installed can only surface diagnostic information.

[Screenshot: Server Explorer showing only diagnostics options for a pre-2.2 role]

Q. What are typical issues while migrating from 1.8 to 2.2?

Worker Roles and Web Role Recycling

I have 3 worker roles and a web role in my project and I upgraded it to the new 2.2 SDK (required in VS2013). Ever since the upgrade, all of the worker roles are failing and they instantly recycle as soon as they're started.

Post can be found here- http://stackoverflow.com/questions/19717215/upgrade-to-azure-2-2-sdk-is-causing-roles-to-fail

Not able to update the Role after upgrading

I recently worked on an issue where an error was thrown while deploying an upgraded role to Windows Azure: you upgrade the SDK to 2.1 or 2.2 and then start getting a deployment error for the role.

Link to the post http://blogs.msdn.com/b/cie/archive/2013/10/31/not-able-to-upload-role-after-upgrading-the-sdk.aspx

Q. What are the steps to migrate to Azure SDK 2.2?

A. Open the Azure project in Visual Studio 2012, then:

  1. Start the upgrade via the Properties window of the Cloud project; you will see the following screenshot.
  2. [Screenshot: Cloud project properties – SDK upgrade prompt]
  3. Follow through the upgrade process and fix any build errors.
  4. Run the project locally and fix any errors you see.
  5. Check in the code once all fixes are done.
  6. Test in the dev environment to see if anything is breaking; there is a real chance of breakage due to dependencies.

Note: Web and worker roles tend to go into an inconsistent state due to library dependency mismatches. These will have to be fixed.

Generic Migration to Azure SDK 2.2- High Level Approach

The suggested approach is to start with one component, one web role, and one WCF REST worker role, see the impact in terms of issues, and then decide the timelines for the others. The POC will be done in one sprint; the candidates are the following:

  • Component – Reusable Component
  • Web Role – Portal Web
  • Worker Role – Portal Worker

 

Links

  • Installation of Azure SDK 2.2 – http://www.windowsazure.com/en-us/downloads/archive-net-downloads/

Sunday, 7 April 2013

Large-scale Implementation on Azure Platform

 

I have been closely following the Azure CAT (Customer Advisory Team), which helps a lot of customers deliver large, complex projects. The Azure CAT site can be found here, with guidance on how to use different architectural artefacts in Azure. After some digging, below is what I have come to understand are some of the largest implementations on the Azure platform. This is, however, based on some amount of reverse engineering and research. Hope this is helpful to the readers.

So what has Azure handled so far, from a numbers standpoint?

There are various architectures which Azure addresses, but in a nutshell it is enterprise-ready. From a numbers standpoint, this is what Azure is handling for applications, where each line item below is an individual application:

  • Largest sharded SQL database – 20 TB. SQL Azure has a maximum container size of 150 GB, so multiple containers were used to work around the size limit. Of course, the querying strategy around this has to be well defined.
  • Largest number of databases – 11,000
  • Most worker instances – 24,000. This is an on-demand application which spins up 24k worker role instances to perform a complex calculation and then shuts down. This is more like HPC for complex algorithms.
  • Largest customer application – 50 PB
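The 150 GB container limit means shard-routing logic has to live in the application. A minimal, hypothetical sketch of a static routing scheme – the connection strings, the FNV-1a hash, and the modulo mapping are illustrative choices, not the actual implementation:

```csharp
using System;

static class ShardRouter
{
    // Hypothetical shard connection strings, one per 150 GB container.
    static readonly string[] Shards =
    {
        "Server=shard0.database.windows.net;Database=AppShard0;",
        "Server=shard1.database.windows.net;Database=AppShard1;",
        "Server=shard2.database.windows.net;Database=AppShard2;"
    };

    // Stable hash of the sharding key -> shard index.
    public static int ShardIndex(string customerId) =>
        (int)(Fnv1a(customerId) % (uint)Shards.Length);

    public static string ConnectionFor(string customerId) =>
        Shards[ShardIndex(customerId)];

    // FNV-1a: deterministic across processes, unlike string.GetHashCode().
    static uint Fnv1a(string s)
    {
        uint h = 2166136261;
        foreach (char c in s) { h ^= c; h *= 16777619; }
        return h;
    }
}
```

The point of the static scheme is that every reader and writer computes the same shard for a key with no lookup table; the cost is that rebalancing when a shard fills up requires data movement, which is exactly why the querying strategy has to be defined up front.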

So what are the largest case studies in Azure?

Florida Presidential Election 2012:

This is not the largest but it was highly mission critical. Find the complete write up here.

This was mission critical for a couple of days, with some very high volume numbers to manage.

The Florida Presidential Election 2012 had the following metrics:

  • Max peak of 40k page hits/second
  • 6 million hits peaked in 1 hour
  • Caching in front of the DB was a big architectural success. The application being short-lived by nature, it made sense to push as much data as possible into the cache and have a separate update strategy:
    • 3-minute TTL
    • A separate worker role to refresh the cache
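The pattern – reads go to the cache, a TTL bounds staleness, and a separate worker refreshes entries – can be sketched generically. This is an illustration of the pattern, not the election site's code:

```csharp
using System;
using System.Collections.Concurrent;

// Read-through cache with a fixed TTL; a background refresh worker (not
// shown) can call Refresh() ahead of expiry so readers rarely miss.
class TtlCache<TKey, TValue>
{
    readonly ConcurrentDictionary<TKey, (TValue Value, DateTime Expires)> map = new();
    readonly TimeSpan ttl;
    readonly Func<TKey, TValue> load;   // e.g. a database query

    public TtlCache(TimeSpan ttl, Func<TKey, TValue> load)
    {
        this.ttl = ttl;
        this.load = load;
    }

    public TValue Get(TKey key)
    {
        if (map.TryGetValue(key, out var e) && e.Expires > DateTime.UtcNow)
            return e.Value;            // cache hit, still within TTL
        return Refresh(key);           // miss or stale: go to the database
    }

    public TValue Refresh(TKey key)    // also called by the refresh worker
    {
        var value = load(key);
        map[key] = (value, DateTime.UtcNow + ttl);
        return value;
    }
}
```

With a 3-minute TTL and a worker refreshing hot keys, the database sees one write path and almost no read traffic at peak, which is what made 40k page hits/second survivable.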

What does the architecture look like?

 

[Diagram: Enight Florida election solution architecture]

Enight Florida Presidential Election – below is the layered architectural decomposition.

Presentation Layer:

Main URL – https://enlight.elections.myflorida.com, using Azure Traffic Manager with availability/performance failover between the primary site on the US East Coast and US North Central.

The Primary site of the application was hosted on US East Coast.

Enight Public Site: a set of web roles hosting voter turnout, state-wide election results, and other real-time election data. The application had a lot of real-time data expected to come in from various data sources. The web role reads data from the cache if valid; if not, it gets the data from the database.

Enight SOE – at a very high level, the Enight SOE acts as a data synchronizer between the primary and the secondary. Additionally, it reads data from the blobs and pushes it to the secondary.

Caching Layer

Azure dedicated cache roles were used. Given that the frequency of reads and writes was very high over a short period of time, it was advisable to go with an in-memory store, i.e. the cache. Entities remain valid in the cache for the 3-minute TTL, with a separate worker role to refresh it. The SQL Azure database reads and writes probably ended up using a CQRS-based pattern – this is a guess.

Database Layer

The storage layer comprises SQL Azure and blob storage (data coming in from various counties).

Identity & Access Management

The IDM used here is the standard Azure Access Control Service. Out-of-box ACS features, like integration with Facebook and other social media, are very typically used.

Azure Auto Scaling Application

Given that peak traffic could reach 40k page hits/second, a good auto-scaling application must have been required to get the elastic advantage.

* The monitoring tool most probably used was Cerebrata.

Bing Games

Bing Games – MSFT has finally got into the culture of eating its own dog food; Bing Games is one such example, where the score tracking and ranking system ran entirely on Azure. Find the articles here.

For some reason the case study has been removed from the MSFT site. Thanks to Google, I managed to get the cached page here.

Below are the high level metrics

  • 1,900 instances – app servers (various roles)
  • 398 SQL Azure databases – scale out
  • 30 million unique users/month
  • 200k concurrent users
  • 3 months, 7 developers
  • 99.975% uptime in the past 12 months

Why did they pick SQL Azure as the database?

Given the database access patterns, it was better to pick SQL Azure over table storage. This is, however, not a de facto standard; it depends on the requirement. The choice of SQL Azure was based on the fact that they had an existing application already on SQL, plus testability – it is easy to pre-populate millions of records quickly.

Partition Strategy

All of a user's data remains in the same database, so scaling out based on users was easier and faster. The partitioning strategy is static by nature.

Production Statistics

  • 1,200 Azure database requests/second spread across all partitions during peak loads
  • 18k connections in the connection pool, which could grow with traffic

Database

  • 90/10 reads vs. writes

Glassboard.com

Glassboard.com, similar to Facebook, is a private social network for groups. It is a mobile-only application.

The Initial Architecture

[Diagram: Glassboard initial architecture]

The initial architecture is pretty straightforward: a REST-based API over table, queue, and blob storage, plus additional components such as social collaboration and analytics.

My personal opinion is that this architecture is not quite right; in the section below I explain why, and the Glassboard folks found and corrected the same.

The Wrong Architecture for Devices – Which Otherwise Would Be Correct in Most Cases

What started as REST-API-based programming turned out to be a disaster – not because table storage itself was a disaster. It is just that when we design the very academic way, with an API for every call, we soon realize that the table-storage cost of each read starts hitting your pocket. A little Fiddler exercise makes this plain.

<excerpt from Glassboard site>

To demonstrate the problem I have set up my Glassboard API instance to do no caching whatsoever. I then ran a unit test which simulates a user getting his Newsfeed, then posting a status. I highlighted each set of repeated calls in a different color.

[Screenshot: Fiddler trace showing repeated storage requests]

</excerpt from Glassboard site> Find the complete commentary here

Not to be alarmed: the architectural change of going the feed way – i.e. one call, like a feed, which returns pretty much all the data at startup – combined with caching, helped.

Especially when getting into device-based programming, we need to bring in some of the client-server concepts; this helps a ton.
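The fix can be illustrated generically: instead of one billable storage read per entity on every device startup, the server assembles a single "feed" payload once and caches it. The types below are hypothetical, not Glassboard's actual API:

```csharp
using System;

// Hypothetical storage client that bills per read transaction.
class CountingStore
{
    public int Reads { get; private set; }
    public string Read(string key) { Reads++; return "data:" + key; }
}

class FeedDemo
{
    // Chatty style: the device triggers one storage call per entity.
    public static int ChattyStartup(CountingStore store)
    {
        foreach (var k in new[] { "boards", "posts", "unread", "profile" })
            store.Read(k);
        return store.Reads;
    }

    // Feed style: the server reads each entity once, assembles a single
    // payload, and caches it; repeat startups cost zero storage reads.
    static string cachedFeed;
    public static string FeedStartup(CountingStore store)
    {
        if (cachedFeed == null)
            cachedFeed = store.Read("boards") + store.Read("posts")
                       + store.Read("unread") + store.Read("profile");
        return cachedFeed;
    }
}
```

Two chatty startups cost eight billed reads; two feed startups cost four, and the gap widens with every additional entity and every additional startup.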

The solution is here.

Samsung – Worldwide TV Management

Samsung SMART TV started as a concept and is now a reality; the device has base software which needs to be updated from time to time. The cloud was the perfect solution, assuming these TVs are connected to the internet. Azure or AWS became the obvious choices; not knowing which way sales would go for Samsung, betting the update and management solution on Platform as a Service was a good decision. Some key features:

  • Frequent updates with new applications and software changes for better support and compatibility.
  • Due to high sales, a scalable and elastic system was required.
  • Utilized 20 large-size web roles with ASP.NET.
  • There is a good Web API layer for the same functionality; find the developer documentation here.
  • Caching seems to be the one candidate which all Azure applications need for rescue.

Solution Architecture

[Diagram: Samsung Smart TV firmware update solution architecture]

The Architecture is fairly straightforward.

  • Firmware Download Website – a set of web roles which connect with the Smart TV via a set of REST-based APIs; post authentication they check for updates and push them to the device.
  • Administration Firmware Upload Website – a set of web roles which provide basic administration and reporting functionality.
  • Worker role for task automation – firmware encryption and batch updates. Additionally, it pushes logs to the SQL Azure database. The updates go to blob storage and use Azure CDN functionality to push them to edge servers.

MYOB

MYOB (Mind Your Own Business) is a large Australian ISV developing accounting software for small business. The new release, AccountRight Live, lets users run their accounts on a PC or in the cloud – or both at once, depending on their preference. The hybrid arrangement was not, however, a cunning innovation designed to catapult the company and its customers into a bold and cloudy future; MYOB decided on rolling out a hybrid strategy based on surveys run with customers. The user count is about 150k. Find the link to the site here.

Based on the multi-tenancy requirement, each customer required a separate database. Caching is used as a standard feature, with reserved CPU per database.

MYOB Solution Architecture

  • Each user installs the client software via a boxed offering.
  • Choice of running the business and data tiers either on Azure or on premises.
  • The application is developed in C#/.NET using LINQ to SQL and Entity Framework – which is a concern: LINQ is a fairly single-threaded process that works very well on an Intel processor with a high clock rate; on web/worker role AMD processors with lower clock rates, performance on Azure will be slower. The workaround is to run LINQ on a small CPU with a single core.
  • Databases on premises and on Azure are kept in sync via the Sync Framework.
  • Each customer has their own Azure SQL database per business entity.
  • DBs are backed up nightly using the DAC Import/Export service, keeping 2 days of rolling backup files in blob storage.

[Diagram: MYOB AccountRight Live solution architecture]

The identity management used is Azure ACS (with its built-in STS), most probably with a custom identity provider plus the option of Windows Live. With some guesswork, I see MYOB's identity management using ADFS to integrate with corporate Active Directory as well.

A simple layered architecture.

  • User interfaces can be a browser or a desktop thick client.
  • Services layer – this includes the Collaboration Services, Authentication Service, Customer File Service, and Huxley Services (transactional services), all exposed as REST APIs. The desktop client authenticates via Azure ACS and receives an ACS token, then connects to the Collaboration Service, which validates with the billing system and connects to the correct user database, depicted as User DB1 … DBn.
  • Storage services: basically a set of SQL Azure databases.

What I love about this architecture is the simplicity. Keeping tenancy at the database level is perhaps not the most economical solution, but it is simple.

Caching has been used at every layer. Judging from the speeds, they seem to have a set of dedicated caching servers; how many is guesswork again.

The MYOB implementation had some key lessons to learn – what are they?

  • Cloud platforms
    • Enable massive scalability
    • HA at lower costs
    • Expose rich cloud-based APIs
  • Identity foundation
    • Well integrated with WCF and highly customisable
  • Scaling the database
    • Sharding is the foundation

Issues with WCF throttling have been handled with different architectural solutions; some of the solutions are here.

Key Takeaways 

A couple of key areas to watch out for in an Azure application:

  • If mobile is a part of the overall architecture, special considerations are required; I will have a separate post on the same.
  • Study the storage choice properly – tables vs. SQL Azure; there is no straightforward answer, it varies. Every SQL Azure database is allocated a maximum of 180 concurrent threads.
  • Use caching wherever possible. This is an architectural decision, not a developer's.
  • Code right – profile the code as much as possible.
  • SQL Azure is a relational database as a service: SQL Server is the core engine and SQL Azure is a logical abstraction over it, exposing a subset of SQL Server features while providing tremendous scale-out capability.