Niraj Bhatt – Architect's Blog

Ruminations on .NET, Architecture & Design

Client Profitability vs. Practice Profitability

This post is a beginner’s guide to a few business terms I have been dabbling with these days. The thoughts below relate primarily to software services, but I think they would be of help to any service industry.

Having run a startup earlier, I have always cared about margins, which are necessary for the healthy growth of a business. Before getting into a customer engagement, getting your margins (simply put, profits) right is very important, for both fixed-bid and T&M (Time and Materials) projects. Apart from resource costs, you also need to take other costs like T&E (Travel and Expenses) into consideration and call them out separately.

Keeping the above in mind, the profit you derive from a given customer project is called Customer or Client Profitability (CP), usually measured as a percentage. So, is a good CP all that a company should care about? The answer is, of course, no. While you might have a high CP, it’s still possible that the overall company or practice is making a loss. Let’s see how.

The common reason for the discrepancy here is overlooking fixed costs. For instance, you are going to incur salary costs irrespective of whether your resources are allocated (billable) to a project or not (e.g. the project you signed up for got over in 5 months), and you will still have to pay rent, infrastructure bills, etc. All of these expenses fall under the larger category called SG&A (Selling, General and Administrative Expenses), which includes advertisement, sales, taxes, training, corporate functions, etc. In short, Practice Profitability (PP) is not the sum of the various client profits; rather, it’s the sum of client profits minus SG&A.
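To make the arithmetic concrete, here is a minimal sketch in Python. All figures, client names and function names are hypothetical and chosen only for illustration; the point is that every client can look profitable while the practice as a whole makes a loss.

```python
# Hypothetical figures, for illustration only.
projects = {
    "client_a": {"revenue": 500_000, "direct_cost": 350_000},
    "client_b": {"revenue": 300_000, "direct_cost": 240_000},
}
sga = 250_000  # assumed SG&A: bench salaries, rent, sales, training, etc.


def client_profit(project):
    """Profit from one engagement: revenue minus its direct (resource, T&E) costs."""
    return project["revenue"] - project["direct_cost"]


def client_profitability_pct(project):
    """CP expressed as a percentage of the engagement's revenue."""
    return 100 * client_profit(project) / project["revenue"]


def practice_profit(projects, sga):
    """PP: sum of all client profits minus SG&A."""
    return sum(client_profit(p) for p in projects.values()) - sga


cp_a = client_profitability_pct(projects["client_a"])  # 30% CP
cp_b = client_profitability_pct(projects["client_b"])  # 20% CP
pp = practice_profit(projects, sga)                    # negative: the practice loses money
```

With these made-up numbers both clients show a healthy CP, yet the combined client profit of 210,000 does not cover the 250,000 of SG&A.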

It should be clear by now that the only way you can grow your business is to increase CP without proportionally increasing SG&A, i.e. do more with less. Most budget planning exercises in corporate companies are built around this agenda. One way to achieve this is to move away from RFR (Resource following Revenue) to non-linear revenue models, shifting the focus from services to products.

Hope this was useful in putting these terms into the right perspective.

Overview of Office 365

Office 365 is a suite of Microsoft products delivered as software as a service from the cloud. For consumers it represents a simplified pay-as-you-go model, helping them use Office products across multiple devices, while for enterprises the value proposition is workplace transformation by driving Enterprise Mobility.

Consumers can now pay a monthly subscription fee and have Word, Excel and other Office tools installed across 5 PCs and Macs. Users also get 5 more mobile Office installs for Android and iOS platforms, and there is a feature called Office on Demand which allows users to temporarily stream Office 2013 applications on a Windows 7 / 8 PC. In addition, one gets 20 GB of SkyDrive integrated with Office Web Apps (a subset of the desktop version) and 60 Skype world minutes to make calls in over 60 countries.


Enterprises, on the other hand, are being disrupted by the various needs of geographically distributed teams, decentralized work locations, BYOD and data security, social engagement platforms, etc. Office 365 for enterprise adds hosted services like Exchange, Lync, SharePoint, Yammer, SkyDrive Pro, etc. to cater to these needs. These services can be accessed using Single Sign On with an on-premises AD / ADFS. What’s more, with the SaaS model you take the entire IT complexity and management out of the equation.

Office 365 also has something for developers. The developer subscription, which is bundled free with an MSDN subscription or otherwise costs 99 USD, allows developers to build applications for Office 365, including SharePoint Online. These applications typically enhance Office tools; for instance, an enterprise can develop a set of applications for its employees and make them available under the My Organization section of the portal. Developers can do application development using familiar development tools. For small enterprises that want an easy way to augment the OOB Office functionality, the Office team offers “NAPA”, Office 365 development tools right in your browser. In addition to this, enterprise developers can also use Visual Studio. ISVs planning to develop commercial applications can publish their applications to the Office Store.

Using a Single Windows Azure Active Directory tenant for All EA Azure Subscriptions

As you know by now, Windows Azure Active Directory (WAAD) is at the root of every Azure subscription.


But in an EA setup you typically have multiple subscriptions, and you definitely don’t want to create a separate WAAD tenant for every subscription. So here’s what you can do (there might be other ways of achieving this too). First create a Shared account and, under that, a Shared Subscription. Also create the WAAD tenant you want to use and ensure your shared subscription is under that WAAD tenant. In that WAAD tenant, create all the account administrators.


Now go to your EA portal and add new accounts, specifying the account administrators you just created. That’s it. The next time you create subscriptions for those newly created accounts, these subscriptions will by default be part of the same WAAD tenant under which you created your shared subscription.


It can’t get any easier, can it? :)

Windows Azure Portals and Access Levels

When you sign up for Windows Azure you get a subscription and you are made the Service administrator of that subscription.


While this creates a simple access model, things do get a little complicated in an enterprise where users need various levels of access. This blog post will help you understand these access levels.

Enterprise Administrator
Enterprise Administrator has the ability to add or associate Accounts to the Enrollment and can view usage data across all Accounts. There is no limit to the number of Enterprise Administrators on an Enrollment.
Typical Audience: CIO, CTO, IT Director
URL to GO:

Account Owner
Account Owner can add Subscriptions for their Account, update the Service Administrator and Co-Administrator for an individual Subscription, and can view usage data for their Account. By default all subscriptions are named ‘Enterprise’ on creation. You can edit the name after creation in the account portal. Under EA usage, only Account Administrators can sign up for Preview features. The recommendation is to create accounts along functional, business or geographic divisions, though creating a hierarchy of accounts would help larger organizations.
Typical Audience: Business Heads, IT Divisional Heads
URL to GO:

Service Administrator
Service Administrator and up to nine Co-Administrators per Subscription have the ability to access and manage Subscriptions and development projects within the Azure Management Portal. The Service Administrator does not have access to the Enterprise Portal unless they also have one of the other two roles. It’s recommended to create separate subscriptions for Development and Production, with Production having strictly restricted access.
Typical Audience: Project Manager, IT Operations
URL to GO:

Co-Administrator
Subscription co-administrators can perform all tasks that the service administrator for the subscription can perform. A co-administrator cannot remove the service administrator from a subscription. The service administrator and co-administrators for a subscription can add or remove co-administrators from the subscription.
Typical Audience: Test Manager, Technical Architect, Build Manager
URL to GO:

That’s it! With the above know-how you can create your own EA setup.


Hope this helps :)

Azure Benefits for MSDN subscribers

Friends, hope you are aware of this great offer. Click on the image below to sign up :)


Windows Azure vs. Force.com vs. Cloud Foundry

Below is a brief write up of some personal views. Let me know your thoughts.

Windows Azure is the premier cloud offering from Microsoft. It has a comprehensive set of platform services ranging from IaaS to PaaS to SaaS. This is a great value proposition for many enterprises looking to migrate to the cloud in a phased manner: first move as-is with IaaS and then evolve to PaaS. In addition, Azure has deep integration across Microsoft products, including SharePoint, SQL Server, Dynamics CRM, TFS, etc. This translates to an aligned cloud roadmap, committed product support and license portability. Though .NET is the primary development environment for the Azure platform, most Azure services are exposed as REST APIs. There are Java, Ruby and other SDKs available, which allow a variety of developers to easily leverage the Azure platform. Azure also allows customers to spawn Linux VMs, though that’s limited to IaaS offerings.

Force.com allows enterprises to extend the CRM from Salesforce. Instead of just providing SDKs and APIs, Salesforce has created Force.com as a PaaS platform, so that you focus only on building extensions; the rest is managed by Salesforce. Salesforce also provides a marketplace, ‘AppExchange’, where companies can sell these extensions to potential customers. Though Force.com offers an accelerated development platform (abstracting many programming aspects), programmers still need to learn the APEX programming language and related constructs. Some enterprises are considering Force.com as their de facto programming platform, taking it beyond the world of CRM. It’s important to understand that the applicability of Force.com for such scenarios would typically be limited to transactional business applications. So, where should enterprises go when they need to develop custom applications with different programming stacks and custom frameworks? Salesforce’s answer is Heroku. Heroku supports all the major programming platforms including Ruby, Node.js, Java, etc., with the exception of .NET. Heroku uses Debian and Ubuntu as the base operating system.

Many enterprises today are hesitating over their move to a PaaS cloud, citing vendor lock-in. For instance, if they move to the Azure PaaS platform their applications would run only on Azure, and they would have to remediate them to port to AWS. It would definitely be great to have a PaaS platform that is vendor agnostic. This is the idea behind the open source PaaS platform Cloud Foundry. It’s an effort co-funded by VMware and EMC. VMware offers a hosted Cloud Foundry solution, with the underlying infrastructure being vCloud. Cloud Foundry supports various programming languages like Java, Ruby, Node.js, etc. and services like MySQL, MongoDB and RabbitMQ, among others. VMware also offers vFabric, a PaaS platform focused on the Java Spring Framework. vFabric is an integrated product with VMware infrastructure, providing a suite of offerings around Runtime, Data Management and Operations. I feel the future of vFabric is likely to depend on the industry adoption of Cloud Foundry (there is also another open source PaaS effort being carried out by Red Hat, called OpenShift).

Overview of VMware Cloud Platform

Continuing my discussion on major Cloud Platforms, in this post I will talk about VMware (a subsidiary of EMC), one of the companies that pioneered the era of virtualization. The flagship product of VMware is ESX, a hypervisor that runs directly on the hardware (bare metal); vSphere is the product that bundles ESX with vCenter. As you would expect, VMware is a major player in the private cloud and data center space. It also has a public IaaS (Infrastructure as a Service) cloud offering and supports an open source PaaS platform (understandably, no SaaS offerings). Below is a quick overview of VMware offerings.

Private Cloud – vCloud Suite is an end-to-end solution from VMware for creating and managing your own private cloud. The solution has two major components: Cloud Infrastructure and Cloud Management. Cloud Infrastructure components include VMware products like vSphere (the cloud OS controlling the underlying infrastructure) and vCloud Director (a multitenant self-service portal for provisioning VM instances based on vApp Templates), while Cloud Management consists of operational products like vCenter (a centralized extensible platform for managing infrastructure), among others. There are also vCloud SDKs available which you can use to customize the platform to specific business requirements. Also, with last year’s acquisition of DynamicOps (now called vCloud Automation Center), VMware is extending its product support to other hypervisors in the market. Other vendors like Microsoft are also evolving similar offerings with Hyper-V, System Center, SPF and Windows Azure Services. It’s important to note, though, that quite a few enterprises operate a private-cloud-like setup using vSphere alone and build a custom periphery around it as necessary.

Public Cloud – In case you don’t have the budget to set up your own datacenter, or are looking to build a hybrid approach which helps you do a cloud burst for specific use cases, you can leverage VMware’s vCloud Hybrid Service (AKA vCHS). The benefit here is that migration and operation remain seamless, as you would use the same tools (and extend the same processes) that were being used for in-house private clouds.

PaaS Cloud – VMware has a PaaS offering for private clouds called vFabric. The vFabric application platform contains various products focused on the Java Spring Framework stack. Architects can create a deployment topology for their multi-tier applications using drag and drop. Not only can they automate the provisioning, but they can also scale their applications in accordance with business demand. In addition, VMware is funding an open source PaaS platform called Cloud Foundry (CF). The value proposition here is that you can move this platform to any IaaS vendor (vCloud, OpenStack, etc.), so when you switch between cloud vendors you don’t have to modify your applications. This is in contrast to other PaaS offerings, which are tied to the underlying infrastructure; e.g. an application built for Azure PaaS would have to undergo remediation to be hosted on Google PaaS. Also, as it is open source, you can customize the CF platform to suit your needs (there is a similar effort being carried out by Red Hat, called OpenShift).

Finally, you might hear the term vBlock (or vBlock Systems) in the context of VMware. VCE (Virtual Computing Environment), the company which manufactures vBlock Systems, was formed as a collaboration of Cisco, EMC and VMware. These vBlock racks contain Cisco’s servers and switches, EMC’s storage and VMware virtualization. There are quite a few service providers using vBlock to create their own set of cloud offerings and services.

Hope this helps!

Overview of Google Cloud Platform

In the next few posts, I will try to give a brief overview of major Cloud Computing platforms. As I started writing this post, it reminded me of an incident. A few years back I was chatting with a Microsoft Architect. He proudly told me that if Google were to shut down tomorrow, none of the enterprises would care. Well, since then things have changed. From a provider of a search engine, email and a mobile platform (Android), Google has made its way into enterprises. To add another experience, recently I was visiting a fortune customer and saw one of the account managers using Gmail. While my first reaction was that he shouldn’t be checking his personal emails at work (we were discussing something important), he was, in fact, replying to an official email. I learned from him that they were among the early adopters of Google Apps. With those interesting anecdotes, below is a quick overview of the Google cloud platform.

Google Apps – You can think of Google Apps as a SaaS offering along the lines of Microsoft Office 365. It includes Gmail, Google Calendar, Docs, Sites, Videos, etc. The value proposition is that you can customize these services under your own domain name (i.e. white label). Google charges a monthly per-user fee for these services (this fee is applicable to Google Apps for Business; Google also offers a free version for educational institutions under the brand Google Apps for Education). In addition, Google has created a marketplace (Google Apps Marketplace), where organizations can buy third party software (partner ecosystem) which further extends Google Apps. As you would expect, Google also provides infrastructure and APIs for third party software developers.

Google Compute Engine – GCE is the IaaS offering of Google. Interestingly, it offers sub-hour billing calculated at the minute level, with a minimum of 10 minutes. For now only Linux images / VMs are supported. Here’s a Hello World to get started with GCE. Note that you need to set up your billing profile to get started with GCE.
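The billing rule above can be sketched as a small calculation in Python. The 10-minute minimum comes from the description above; rounding partial minutes up and the hourly rate used here are my assumptions for illustration, not actual GCE pricing.

```python
import math

MIN_BILLED_MINUTES = 10  # minimum charge mentioned above


def billed_minutes(runtime_seconds):
    """Round usage up to whole minutes (assumption), then apply the 10-minute minimum."""
    minutes = math.ceil(runtime_seconds / 60)
    return max(minutes, MIN_BILLED_MINUTES)


def cost(runtime_seconds, hourly_rate):
    """Charge for one VM run; hourly_rate is a hypothetical price, not a real GCE rate."""
    return billed_minutes(runtime_seconds) * hourly_rate / 60


short_run = billed_minutes(2 * 60)        # a 2-minute run is billed as 10 minutes
long_run = billed_minutes(61 * 60 + 1)    # 61 minutes and 1 second is billed as 62 minutes
half_hour_cost = cost(30 * 60, 0.12)      # half an hour at a made-up 0.12 per hour
```

Compared with whole-hour billing, a 61-minute job is charged for 62 minutes here rather than 2 full hours.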

Google App Engine – GAE is an ideal platform to create applications for Google Apps Marketplace. A PaaS offering from Google, it is easy to scale as your traffic and data grow. Like Microsoft’s Windows Azure Web Sites, you can serve your app from a custom domain or use a free name on the appspot.com domain. You can write your applications using Java, Python, PHP or Go. You can download respective SDKs from here along with a plugin for Eclipse (SDKs come with an emulator to simplify the development experience). With App Engine you are allowed to register up to 10 applications per account, and all applications can use up to 1 GB of storage and enough CPU and bandwidth to support an application serving around 5 million page views a month at no cost. Developers can also use NoSQL (App Engine Datastore) and relational (Google Cloud SQL) stores for storing their application data. Google Cloud Storage, a similar offering to Windows Azure Blob Storage, allows you to store files and objects up to terabytes in size. App Engine also provides additional services such as URL Fetch, Mail, Memcache, Image Manipulation, etc. to help perform common application tasks.

Google BigQuery – BigQuery is an analytics tool for querying massive datasets. All you need to do is move your dataset to Google’s infrastructure. After that, you can query the data using SQL-like queries. These queries can be executed from a browser, from the command line, or even from your application by making calls to the BigQuery REST API (client libraries are available for Java, PHP and Python).

So, in a nutshell these are the major offerings of Google Cloud platform encompassing SaaS, PaaS and IaaS. Google Apps appears to be the most widely used of all offerings, with Google claiming more than 5 million businesses running on it.

Hope you found this overview useful.


When talking of IT Service Continuity planning, the IT aspect of Business Continuity planning, the terms RTO and RPO have become commonplace. While both terms can have different meanings depending on the context, for IT they largely represent the acceptable downtime and the acceptable data loss while recovering IT operations to normal. Below is a brief overview.

RTO – Recovery Time Objective is the permissible system downtime after a breakdown event. If downtime exceeds this limit, it’s bound to impact the business (most likely financially).

RPO – Recovery Point Objective is the permissible window of data loss during a failover. Though RPO is an involved term, as a simple example consider an RPO limit of 2 hours set by company X. This could mean that during a disaster event, when the secondary site is activated, the data loss (sync window) between primary and secondary shouldn’t be more than 2 hours.
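As a rough illustration of the two definitions, the sketch below checks a single incident against an RTO and an RPO in Python. The function name, the timestamps and the 4-hour RTO are hypothetical; only the 2-hour RPO echoes the example above.

```python
from datetime import datetime, timedelta


def meets_objectives(failure_at, last_sync_at, restored_at, rto, rpo):
    """Compare one incident against the RTO (downtime) and RPO (data loss window)."""
    downtime = restored_at - failure_at           # how long the system was down
    data_loss_window = failure_at - last_sync_at  # data written since the last sync is lost
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical incident: last sync to the secondary at 07:00, failure at 10:00,
# service restored on the secondary site at 13:00.
result = meets_objectives(
    failure_at=datetime(2013, 6, 1, 10, 0),
    last_sync_at=datetime(2013, 6, 1, 7, 0),
    restored_at=datetime(2013, 6, 1, 13, 0),
    rto=timedelta(hours=4),
    rpo=timedelta(hours=2),
)
```

In this made-up incident the 3 hours of downtime are within the RTO, but the 3-hour gap since the last sync breaches the 2-hour RPO.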

Normally, there isn’t one RTO and RPO for a given organization; rather, they differ by the service / system in context. Systems with aggressive RTO / RPO are costlier to run compared to the ones with relaxed guidelines. Most enterprises mandate SLAs around RTO / RPO from their service providers. Also, if your primary focus is just around databases you can pick up one of these approaches. Please leave your comments below with additional thoughts on this topic.

Big Data, NoSQL and MapReduce

Consider a hypothetical scenario. Your company has won the project to design a new website for the channel airing IPL (if you haven’t heard of IPL, just pick any sport you love). The channel wants to create this new website where users can register and create their own discussion rooms for discussing a specific aspect of a match, a specific player, or anything else. You have been assigned as the lead architect on this project. Among other challenges, you are having nightmares thinking about the non-functional requirements (NFRs) that are to be met for this project (your competition was fired, as their traditional 3-tier architecture wasn’t holding up). You know you have got to do something different, but are not sure exactly what and how. If this resonates with you, keep reading.

Big Data – As the name suggests, Big Data is about huge and fast-growing data, though how huge and how fast is left to one’s discretion. Big Data, initially attributed to search engines and social networks, is now making its way into enterprises. The primary challenges while working with Big Data are how to store it and how to process it. There are other challenges too, like visualization and data capture itself, but for this post I will omit them. Let’s start with storage first, by understanding NoSQL.

NoSQL is an umbrella term for non-relational databases which don’t use SQL (Structured Query Language). NoSQL databases, unlike relational databases, are designed to scale horizontally and can be hosted on a cluster. Most of these databases are key-value stores (Riak), where each row is a key-value pair. The important thing to note here is that the value doesn’t have a fixed schema; it can be anything: a user, a user profile, or an entire discussion. There are two major variants of key-value databases: document databases (MongoDB) and column-family databases (Cassandra). Both of them extend the basic premise of a key-value store to allow easy search on data contained inside the value object. A document database imposes a structure on the value stored, allowing queries on internal fields. A column-family database, on the other hand, stores the value across multiple column-value pairs (you can also think of it as a second level of key-value pairs) and then groups them into a coherent unit called a column family.
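The three data models can be pictured with plain Python dictionaries. This is only a mental model, not how Riak, MongoDB or Cassandra are actually accessed; all keys, names and values are made up.

```python
# Key-value store: the database sees only an opaque value per key.
kv_store = {
    "user:42": b'{"name": "Asha", "team": "Mumbai"}',
}

# Document store: the value has a structure, so internal fields can be queried.
documents = {
    "user:42": {"name": "Asha", "team": "Mumbai"},
    "user:43": {"name": "Vikram", "team": "Chennai"},
}
mumbai_fans = [doc["name"] for doc in documents.values() if doc["team"] == "Mumbai"]

# Column-family store: row key -> column family -> column -> value
# (a second level of key-value pairs grouped into families).
column_families = {
    "user:42": {
        "profile": {"name": "Asha", "team": "Mumbai"},
        "activity": {"rooms_created": 3, "posts": 120},
    },
}
posts = column_families["user:42"]["activity"]["posts"]
```

Note how the key-value entry can only be fetched as a whole by its key, whereas the other two shapes let you reach inside the value.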

Before you think you have found the storage panacea and are ready to go, you need to take care of a few important aspects related to distributed databases: scalability, availability, and consistency.

First is scaling via sharding to meet your data volume. The good part is that most NoSQL databases support auto-sharding, which means shards are automatically balanced across the nodes of a cluster. You can also add nodes to your cluster as necessary, to keep up with data volume. But what if a node goes down? How can we still make the shards available? We need to mitigate these failures by making our system highly available.
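The idea of spreading keys across nodes can be sketched in a few lines of Python. This is a deliberately naive hash-modulo scheme with hypothetical node and key names; it is my simplification, and real NoSQL databases typically use more elaborate schemes (consistent hashing or range-based shards) so that adding a node moves only a fraction of the keys.

```python
import hashlib


def shard_for(key, nodes):
    """Pick a node for a key using a stable hash modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


nodes = ["node-1", "node-2", "node-3"]  # hypothetical cluster
keys = ["room:%d" % i for i in range(1, 10)]

# Where each discussion room would live on the 3-node cluster.
placement = {key: shard_for(key, nodes) for key in keys}

# Adding a node changes the modulo, so some keys map to a different node.
bigger_cluster = nodes + ["node-4"]
new_placement = {key: shard_for(key, bigger_cluster) for key in keys}
moved = [key for key in keys if placement[key] != new_placement[key]]
```

The `moved` list shows which keys would have to be rebalanced after the cluster grows, which is the work an auto-sharding database does for you.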

Availability can be achieved via replication. You can set up master-slave replication or peer-to-peer replication. With master-slave replication you should typically set up three nodes including the master, and all the writes go to the master node. Reads, though, can happen from any node, either the master or a slave. If the master node goes down, a slave gets promoted to master and continues to replicate to the third node. When the failed master node comes back up, it joins the cluster as a slave. In contrast, peer-to-peer replication is slightly more complex. Here, unlike master-slave, all the nodes receive read / write requests, and the shards are replicated bidirectionally. While this looks good, just remember that when we use replication we will run into consistency issues due to latency.

There are two major types of inconsistencies: read and write. Read inconsistencies arise in master-slave replication when you try to read off a slave before changes have propagated from the master. In peer-to-peer replication you will run into both read and write inconsistencies, as writes (updates) are allowed on multiple nodes (think of two people trying to book the same movie tickets at the same time). As you will have observed, availability and consistency pull against each other (check out the CAP theorem for more details). What the right balance is depends purely on context. For instance, you can avoid read and write inconsistencies altogether: just keep the slaves as hot standbys and don’t read off them.
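A toy in-memory simulation in Python makes the read inconsistency visible. The classes and the explicit `replicate()` step are hypothetical stand-ins for asynchronous replication; no real database works exactly like this.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    """Toy model: writes go to the master and reach the slaves only on replicate()."""

    def __init__(self):
        self.master = Node("master")
        self.slaves = [Node("slave-1"), Node("slave-2")]
        self.pending = []  # writes not yet propagated to the slaves

    def write(self, key, value):
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Stands in for replication catching up after some latency.
        for key, value in self.pending:
            for slave in self.slaves:
                slave.data[key] = value
        self.pending.clear()

    def read(self, key, from_slave=None):
        # from_slave=None reads the master; otherwise it is an index into the slaves.
        node = self.master if from_slave is None else self.slaves[from_slave]
        return node.data.get(key)


cluster = MasterSlaveCluster()
cluster.write("seat:A1", "alice")
stale = cluster.read("seat:A1", from_slave=0)   # slave has not seen the write yet
on_master = cluster.read("seat:A1")             # master always has the latest write
cluster.replicate()
fresh = cluster.read("seat:A1", from_slave=0)   # slave has caught up
```

Reading only from the master, the hot standby option mentioned above, avoids the stale read at the cost of not spreading read load across slaves.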

Let’s now see how you can process Big Data, i.e. the compute aspect. Processing massive amounts of data needs a shift from the client-server model of data processing, wherein the client pulls the data from the server. Instead, the emphasis is on running the processing on the cluster nodes where the data is present, by pushing the code to them. In addition, this processing can be carried out independently and in parallel, as the underlying data is already partitioned across nodes. This way of processing is referred to as the MapReduce pattern and, interestingly, it also uses key-value pairs.

Extending our IPL example, consider that you want to list the top players being discussed across all the forums. This would mean you need to iterate through each discussion in your NoSQL store and count each player’s occurrences. Applying MapReduce here, we start with the map function. A single discussion (key-value pair) is the input to the map function, which outputs key-value pairs with the key being the player name and the value indicating the number of occurrences. All the occurrences (values) for a given player (key) across nodes are then passed to a reduce function for aggregation.
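The flow described above can be sketched in plain Python, with everything running in a single process. The player names, discussion texts and function names are made up, and the grouping step in between (often called shuffle) is something a real MapReduce framework would do for you across nodes.

```python
from collections import defaultdict

PLAYERS = ["Arjun", "Ravi", "Kabir"]  # fictional player names

# Fictional discussions: key = room id, value = discussion text.
discussions = {
    "room-1": "Arjun finished the chase again. Arjun and Ravi were superb.",
    "room-2": "Kabir hit three sixes while Ravi anchored the innings.",
    "room-3": "Ravi for captain!",
}


def map_discussion(room_id, text):
    """Map: one discussion in, (player, occurrences) pairs out."""
    for player in PLAYERS:
        count = text.count(player)
        if count:
            yield player, count


def shuffle(mapped_pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for player, count in mapped_pairs:
        grouped[player].append(count)
    return grouped


def reduce_player(player, counts):
    """Reduce: aggregate all occurrences for a single player."""
    return player, sum(counts)


mapped = [
    pair
    for room_id, text in discussions.items()
    for pair in map_discussion(room_id, text)
]
totals = dict(reduce_player(p, c) for p, c in shuffle(mapped).items())
top_players = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Because each call to `map_discussion` looks at one discussion only, the map step can run in parallel on whichever node holds that discussion.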

Most MapReduce frameworks allow you to control the number of mapper and reducer instances through configuration. While a reduce function normally operates on a single key, there is also the concept of a partition function, which allows you to send multiple keys to a single reducer, helping you distribute the load evenly across reducers. Finally, as you will have guessed, mappers and reducers could be running on different nodes, which means map output has to be moved across to the reducers. To minimize this data movement, you can introduce combiners, which perform a local reduce job; in our case, all the player occurrences can be aggregated at the node level before being passed on to the reducer. Most NoSQL databases have their own way of abstracting / implementing MapReduce, via queries and other means. You can also use Hadoop and related technologies like HDFS for your MapReduce workload without using NoSQL databases.
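Here is a minimal single-process sketch in Python of where a combiner and a partition function fit. The per-node map output, the player names and the choice of CRC32 as the partitioning hash are all assumptions made for illustration; real frameworks ship their own partitioners.

```python
import zlib
from collections import Counter, defaultdict

# Hypothetical map output sitting on two nodes: (player, occurrences) pairs.
map_output_by_node = {
    "node-1": [("Ravi", 1), ("Arjun", 2), ("Ravi", 1)],
    "node-2": [("Ravi", 1), ("Kabir", 1), ("Kabir", 2)],
}
NUM_REDUCERS = 2


def combine(pairs):
    """Combiner: local reduce on one node, so fewer pairs travel over the network."""
    local = Counter()
    for player, count in pairs:
        local[player] += count
    return list(local.items())


def partition(player, num_reducers):
    """Partition function: decides which reducer receives a given key.

    CRC32 is used because it is stable across runs, unlike Python's built-in
    hash() for strings.
    """
    return zlib.crc32(player.encode("utf-8")) % num_reducers


# reducer index -> player -> list of partial counts received from the nodes
reducer_inputs = defaultdict(lambda: defaultdict(list))
for node, pairs in map_output_by_node.items():
    for player, count in combine(pairs):
        reducer_inputs[partition(player, NUM_REDUCERS)][player].append(count)

totals = {
    player: sum(counts)
    for per_reducer in reducer_inputs.values()
    for player, counts in per_reducer.items()
}

pairs_before = sum(len(p) for p in map_output_by_node.values())
pairs_after = sum(len(combine(p)) for p in map_output_by_node.values())
```

In this toy data the combiners cut the pairs shipped to reducers from 6 to 4, and every occurrence of a given player still lands on the same reducer.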


Hope this overview has helped you understand the big picture of how these technologies fit together.

