Sunday, 23 September 2012

Hadoop, SQL Azure: the BI and Analytics Dilemma

The MSFT teams in Redmond haven’t made any official statement on Analysis Services (SSAS) for SQL Azure. With Azure making a big commitment to Hadoop, the dilemma becomes twofold: building BI and analytical capabilities for both SQL Azure and Hadoop. I’m sure countless discussions have happened in those Redmond buildings on exactly this. My guess is that “MSFT will release Analysis Services for SQL Azure early next year”. That Analysis Services offering should support both SQL and NoSQL.
I started with a simple application for the financial industry vertical in the cloud. I soon reached a point where I was pushed into a corner and had to make some hard decisions about the cloud, because besides its transactional features the application required tons of BI and analytics spread across both the structured and unstructured worlds.
SQL Azure unwillingly became the choice for the structured database, and Hadoop on Azure became the willing choice for unstructured data. Hadoop provides very quick search across terabytes of data, and SQL Azure, with its limited cloud offering, gave me the transactional side. The BI and analytical capability became a nightmare.
The MSFT teams on SQL Azure are tight-lipped about an “analysis framework in SQL Azure which can do both structured and unstructured”. For right now I have SQL Server 2012 Analysis Services running on Azure virtual machines, which does pretty much what I want. But then, unwillingly, I have to say that “SSAS of SQL 2012 is coming to SQL Azure” is pure guesswork.
The application currently looks like this
What follows is my thought process, and it is pure guesswork. The shift to NoSQL is evident, and MSFT’s Azure platform has to embrace NoSQL completely, which involves extending BI and analytics to it. What seems to be emerging is an Analysis Services architecture that is expected to support both SQL and NoSQL, running out of the SQL Azure platform. xVelocity, the in-memory analytics engine in SQL Server 2012 Analysis Services, is a step towards the PaaS model for SQL Azure. This new engine is delivered in the following modules:
  • xVelocity for Data Warehousing: is a memory optimized columnstore index for high speed data querying (relational queries).
  • xVelocity for Business Intelligence: is the in-memory analytics engine for Analysis Services (Tabular Model) and PowerPivot.
As the name in-memory engine implies, all the data is stored in memory. Although today’s computer systems are equipped with gigabytes of memory, memory is still an expensive resource. We therefore need to be able to analyze the memory usage of the Analysis Services in-memory engine to understand how much memory the different applications consume.
The SQL Azure Analysis Services framework should deliver a massively parallel processing infrastructure with a software solution that embeds both SQL and MapReduce analytic processing, for deeper analytic insights on multi-structured data and new analytic capabilities driven by data science. Analysis Services is most likely to use an integrated MapReduce analytics engine for embedded analytic processing, simplifying enterprise access to big data analytics. SQL Server Analysis Services has so far supported only SQL; keeping SQL as the interface means that any business intelligence tool that generates standard SQL, or any business analyst who knows SQL, can immediately invoke the power of data science without having to learn programming languages or new interfaces.
This is what I expect the SQL Analysis Services framework to look like: (image)
While writing my applications I have found pushing data into Hadoop on Azure cumbersome. I have resorted to writing a data poll mechanism which pulls data from Azure Blob storage and pushes it to the Hadoop head node. This is not a documented way, but it works. Find the code here.
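The poll mechanism described above can be sketched roughly as follows. This is not the code the post links to, only a minimal illustration of the idea, and it assumes the blob store and the head node can each be wrapped in a small callable. The names `poll_once`, `list_blobs`, `download` and `push` are hypothetical, and the in-memory dictionaries stand in for Azure Blob storage and the Hadoop head node so the sketch runs without a cloud account.

```python
from typing import Callable, Dict, Iterable, Set


def poll_once(
    list_blobs: Callable[[], Iterable[str]],
    download: Callable[[str], bytes],
    push: Callable[[str, bytes], None],
    seen: Set[str],
) -> int:
    """Copy every blob not seen before to the sink; return how many were copied."""
    copied = 0
    for name in sorted(list_blobs()):
        if name in seen:
            continue  # already pushed on an earlier poll
        push(name, download(name))
        seen.add(name)  # mark only after the push succeeded
        copied += 1
    return copied


# Hypothetical stand-ins: a dict as the blob container, a dict as the head node.
blob_store: Dict[str, bytes] = {"trades/day1.csv": b"id,amount\n1,100\n"}
hadoop_files: Dict[str, bytes] = {}
seen_blobs: Set[str] = set()

first = poll_once(blob_store.keys, blob_store.__getitem__,
                  hadoop_files.__setitem__, seen_blobs)

# A new blob arrives between polls; only that one is copied next time.
blob_store["trades/day2.csv"] = b"id,amount\n2,250\n"
second = poll_once(blob_store.keys, blob_store.__getitem__,
                   hadoop_files.__setitem__, seen_blobs)

print(first, second, sorted(hadoop_files))
```

In a real deployment the three callables would wrap whatever storage client and upload path are actually in use, and `poll_once` would run on a timer. Tracking `seen` names is what keeps repeated polls from re-sending data.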

Tuesday, 4 September 2012

Real-World Windows Azure Case Study: Hearst Newspapers - Getting Critical About It


I’ve been following the real-world examples where Azure has been deployed at scale, and with absolute regret I must state there are very few. I started reading the Hearst Newspapers case study of a deployment on Azure and dug around for some answers. These are my findings.

For the complete case study find it here -

The excerpts from the case study mentioned above are in blue, and my comments are in the professional red.

Setting the background: Hearst Newspapers owns 15 newspapers, including the Houston Chronicle, San Francisco Chronicle, Albany Times Union and San Antonio Express-News.

Find the FY 11 Annual Review here

A digital news service on Windows Azure is what Hearst was trying to implement. On the internet, Hearst Newspapers pretty much lived off no-fee websites, and the business goal was to test subscription-based premium content delivered to mobile apps.

Apparently the sentence which nails the debate is “We also planned to start a new initiative with an app for the Apple iPad and subscription through iTunes store”.

The following points seem to have escaped the architect’s eye in this solution.

  • “Azure has come out with Mobile Services recently.” I’m sure an architect would have had some foresight into Azure’s roadmap.
  • To support the argument further, Azure Mobile Services supports iPad applications, thriving on the MVC architecture to support iOS applications written in Objective-C. Mobile Services endpoints are just looking for data to come across in JSON format.
  • Customer value prop missed out: “Needed a solution that would tie these systems together that would orchestrate which offers were available to which users, based on their subscription status or purchase”. I think this requirement is not addressed, or was missed out.

The Content Management Story

At Hearst, multiple content management systems serve multiple existing news websites. I don’t know the intricate details, but “The database of print subscriber was on another system” sounds like an organization which has grown its IT in a very unorganized manner. I could be wrong. Content management systems need a central master repository of generic information and governance rules, considering the risks media companies are up against, and localized repositories can replicate those rules and that generic information. Content can then be published to the various websites from either the localized or the centralized repository.

Dissecting the Requirement Statement

The requirement as stated by Hearst: “Needed a solution that would tie these systems together that would orchestrate which offers were available to which users, based on their subscription status or purchase.” <Response> This sounds like a CRM requirement. </Response>

“The system needed to ensure that the CMS made the right content available to each user.” <Response> This sounds like an audience targeting requirement. </Response>

“And it would support new content systems and content delivery networks as we decided to include them.” <Response> This requirement is about 1 million feet high. </Response>

The Business Requirement

Scalable solution to meet the increasing demands of a phased rollout across our news properties and could handle traffic spikes due to news events. Given the competitive pressures, we also decided that fast time to market was crucial. And we wanted our solution to be cost-effective, with low capital and operating costs consistent with the razor-thin margins of the media business.


  • Scalable Solution: This depends a lot on how your systems are architected. With multiple CMSs, multiple news sites and multiple other applications, one cannot scale to infinity. Scalability largely depends on which applications have moved to the cloud. With a lot of applications built with third-party controls and software, PaaS is not a direct fit, and a hybrid looks obvious.
  • Fast Time to Market: This is a misnomer; your ability to get to market fast depends on
    • Ability to integrate all the CMSs and use intelligent CRM to deliver value-added services to your customers
    • Premium content – cross-sell, up-sell; this is a derived benefit.
  • Cost Effective: This depends on how the Azure components have been structured and whether the Calculator has been used well enough to include all seasonality.
  • Low Capital and Operating Cost: Don’t know, but from what I am seeing Azure has a lot of hidden costs in a real deployment. Budgets generally end up 30% higher than the estimated cost.

The Azure Value Proposition as stated

To avoid the delays inherent in deploying infrastructure, we focused on cloud-based solutions. When I think of the cloud, I really think of abstracting the management of the infrastructure, so we can focus on our application, our added value.  We looked at Amazon, but we would still have had to manage our servers, install updates, and do everything we wanted to avoid. Windows Azure is more than infrastructure-as-a-service; it’s a fully managed platform-as-a-service. That’s the way we wanted to go.

“When I think of the cloud, I really think of abstracting the management of the infrastructure, so we can focus on our application, our added value.”

<Response> What I’m seeing is an increasing trend of adding monitoring functionality to applications on the cloud, for example SignalR and New Relic, and there is an additional compute cost attached to this. Abstracting the management of the infrastructure cannot be achieved completely if there is a hybrid cloud model, i.e. on-premise, cloud (PaaS) and Azure VMs. </Response>

Windows Azure is more than infrastructure-as-a-service; it’s a fully managed platform-as-a-service. That’s the way we wanted to go.

<Response> Windows Azure has a lot of good features and a lot is evolving; a classic example is Mobile Services. A roadmap for Azure is required, and MSFT needs to be more transparent with customers if it wants more of them moving to Azure. </Response>

Existing Technology Stack

“Much of the technology we use runs on the LAMP stack: Linux, Apache, MySQL, and PHP. Much of the rest runs on Microsoft technologies. For the integration solution, we chose .NET, and Windows Azure SQL Database . For a solution that runs on Windows Azure, it only makes sense to use a consistent, end-to-end technology stack.”

<Response> The integration story is far more than .NET and Windows Azure SQL Database. It is likely to be Windows Azure Service Bus, as that is the single most important component for integration. In addition, Service Bus queues, topics and subscriptions, and transactions will be candidates. SQL Azure may perhaps work as storage, but that’s kind of expensive. </Response>


Time to get the Solution Out

Built in a month, the solution is a token-based entitlement database (EDB) that interacts with the mobile app, the subscription database, and the iTunes Store.

Figure 1. The entitlement database hosted in Windows Azure mediates between Hearst’s subscription and content systems and Apple iTunes.

When the Hearst mobile app connects to the CMS to download premium content, the CMS first determines if the app is entitled to that content. If the app cannot present a token confirming a consumer’s existing print or digital subscription, or content purchase, the CMS redirects the app to the EDB to obtain one. The EDB checks the subscription system. If the consumer has a subscription that includes digital content, the EDB provides a token that the app then uses to obtain that content from the CMS. The token is secured with common secret key and hash procedures, and is completely compatible across platforms including the Microsoft .NET Framework, .PHP and Perl, as well as Hearst’s content delivery network.
If the consumer does not have a subscription, the EDB offers the options of obtaining a trial subscription through an “in-app” transaction, or purchasing content from iTunes. If the consumer chooses the former, the EDB updates the subscription system and issues the token. If the consumer chooses the latter, the EDB redirects the app to the iTunes Store to make a purchase. The app then presents the iTunes receipt to the EDB; the EDB confirms the transaction with iTunes and issues the token. Once the app has the token, it presents it to the CMS to obtain content. This process makes it possible for the EDB to incorporate the security policies of external subscription systems, should we wish to add them.

<Response> The token-based entitlement database is an extension of Identity Management and Federation Services. It allows users to access premium content via mobile based on the token. The token acts as a form of authorization, i.e. what content the user is authorized to access. A lot of identity and access management is available out of the box in Azure. It would be more palatable with Service Bus integration; the value prop of cross-sell and up-sell via intelligent CRM would help realize the goal of targeted content for the audience.

What has been achieved in 1 month is more of a prototype of mobile services. This is very typical of the DPE division in MSFT. The true value add is in doing something bigger. </Response>
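The token step quoted above from the case study ("secured with common secret key and hash procedures") can be illustrated with a short sketch. The case study does not say which scheme Hearst used, so HMAC-SHA256 over a subscriber id and expiry time is an assumption here, and `issue_token`, `verify_token` and the `SECRET` value are hypothetical names chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical shared secret known to both the EDB and the CMS.
SECRET = b"demo-shared-secret"


def _sign(payload: str, secret: bytes) -> str:
    """Hex HMAC-SHA256 of the payload under the shared secret."""
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(subscriber_id: str, expires_at: int, secret: bytes = SECRET) -> str:
    """EDB side: bind a subscriber to an expiry time and sign the pair."""
    payload = f"{subscriber_id}|{expires_at}"
    return f"{payload}|{_sign(payload, secret)}"


def verify_token(token: str, now: int, secret: bytes = SECRET) -> bool:
    """CMS side: accept only an unexpired token with a valid signature."""
    try:
        subscriber_id, expires_raw, signature = token.rsplit("|", 2)
        expires_at = int(expires_raw)
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{subscriber_id}|{expires_raw}", secret)
    valid = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return valid and now < expires_at


token = issue_token("sub-42", expires_at=2000)
print(verify_token(token, now=1000))                              # valid and unexpired
print(verify_token(token, now=3000))                              # expired
print(verify_token(token.replace("sub-42", "sub-43"), now=1000))  # tampered
```

Because verification needs only the shared secret and a hash function, the same check is easy to reproduce in .NET, PHP or Perl, which fits the cross-platform compatibility the case study claims. It also shows how little of this is Azure-specific.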



The technical and business benefits you’ve seen from using Windows Azure as stated by Hearst

First, Windows Azure enabled us to meet our goals to quickly create a high-quality solution to support our first mobile app, while preserving our flexibility to include additional devices, apps, and storefronts to expand our digital market share. Instead of the time—anywhere from a week to a month—that it would have taken to deploy servers at a traditional hosting provider, we deployed its Windows Azure instances in less than a day, a time savings of between 80 and 95 percent.


<Response> Given this is a prototype, a day looks good. I have personally tried deploying 20 HPC instances to Azure over the internet, and it takes some time. Ideally there should be a staging environment, with a swap to move into production. </Response>


Faster testing enabled by Windows Azure helped the quality of the solution.

The tight interoperability between Windows Azure and the development process enabled us to do more testing than we normally do and a more thoroughly tested application is a more robust application. We also benefitted from the Pariveda Continuous Integration Server, which resulted in faster development and more stable code. This enables us to maintain a stronger focus on the application and on maximizing business value. You can maintain a greater development velocity. That’s what we gained by using Windows Azure.


<Response> Greater development velocity is something which needs more information. From live experience, testing and debugging on Windows Azure is a new learning curve and slightly complex. The on-premise emulator is good for unit testing only; any other form of testing is better done on a staging cloud, as performance and other elements vary. </Response>

Financial benefits you have seen with Windows Azure

Hosting our EDB in Windows Azure, we avoided the hundreds of thousands of dollars in capital costs and operating expenses associated with building an on-premise solution and operating it over a three-year period. We also avoided the high fees associated with a traditional hosting provider. I estimate those charges would likely have come to $10,000 per month—compared to the $2,000 per month that we incur for Windows Azure—a savings of 80 percent and, over three years, of $288,000. For Hearst subscribers, the news has to flow as readily as water or electricity. Anything less could cripple adoption and our move into mobile apps. That puts a heavy responsibility on the EDB—one that it is meeting successfully.

<Response> The EDB is a very small part of the solution. Hosting the EDB alone in the cloud doesn’t bring benefits in the long run. A long-term strategy to move bigger parts to the cloud would be a benchmark statement for Microsoft. </Response>
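The savings figures quoted from the case study can be checked with a few lines of arithmetic. The first call below uses only the quoted numbers ($10,000 versus $2,000 per month over three years). The second applies the 30% budget overrun from the cost bullet earlier in this post, which is a rule of thumb and an assumption, not a figure from Hearst. The function name `hosting_savings` is made up for the example.

```python
def hosting_savings(baseline_monthly: int, actual_monthly: int, months: int):
    """Return (percent saved per month, total dollars saved over the period)."""
    monthly_saving = baseline_monthly - actual_monthly
    percent = monthly_saving * 100 / baseline_monthly
    return percent, monthly_saving * months


# Figures quoted in the case study: $10,000 vs $2,000 per month, 36 months.
quoted = hosting_savings(10_000, 2_000, 36)
print(quoted)  # (80.0, 288000)

# Assumption, not from Hearst: the Azure bill lands 30% above the estimate.
overrun_monthly = 2_000 * 130 // 100  # 2,600 per month
with_overrun = hosting_savings(10_000, overrun_monthly, 36)
print(with_overrun)  # (74.0, 266400)
```

The quoted 80 percent and $288,000 are consistent with each other. Even with the assumed overrun the saving on the EDB stays large, so the open question is less the arithmetic and more how much of the overall solution those numbers actually cover.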


What’s next?

Our strategy is to preserve our options so we can continually explore what works best in the digital marketplace. We’re using Windows Azure as part of that strategy, and we expect to move more of our systems to it. We can grow our environment quickly and at low cost. We can swap in various subscription offers without having to put new apps through iTunes. We can add apps for other mobile devices and add storefronts—including our own websites—and just plug them all into Windows Azure. We can’t know where the future is going, but we’re sure we’re ready for it.


<Response> The customer is not too sure about Azure. While they have placed their initial bets on mobile services and the EDB in the cloud, they are not sure whether they plan to use the Media Services provided by Azure, and moreover the architecture doesn’t look well thought out. At the end of the day it looks like a good prototype… </Response>