Niraj Bhatt – Architect's Blog

Ruminations on .NET, Architecture & Design


Evolution of software architecture

Software architecture has been an evolutionary discipline, starting with monolithic mainframes and arriving at today's microservices. These architectures are easier to understand as points along that evolution than as independent ideas. This post is about that evolution. Let's start with mainframes.

The mainframe era was one of expensive hardware: a powerful server capable of processing large volumes of instructions, with clients connecting to it via dumb terminals. This evolved as hardware became cheaper and dumb terminals gave way to smart terminals. These smart terminals had reasonable processing power, leading to client-server models. There are many variations of the client-server model around how much processing the client should do versus the server. For instance, the client could do all the processing, with the server acting only as a centralized data repository. The primary challenge with that approach was maintenance: pushing client-side updates to all the users. This led to browser clients, where the UI is essentially rendered by the server in response to an HTTP request from the browser.

Mainframe Client Server

As the server took on multiple responsibilities in this new world – serving UI, processing transactions, storing data, and more – architects broke down the complexity by grouping these responsibilities into logical layers: UI Layer, Business Layer, Data Layer, etc. Specific products emerged to support these layers, like web servers and database servers. Depending on the complexity, these layers were also physically separated into tiers. The word tier indicates a physical separation, where the web server, database server, and business processing components run on their own machines.

3 Tier Architecture

With layers and tiers around, the next big question was how to structure them: what are the ideal dependencies across these layers, so that we can manage change better? Many architecture styles showed up as recommended practice, most notably Hexagonal (ports and adapters) architecture and Onion architecture. These styles were aimed at supporting development approaches like Domain Driven Design (DDD), Test Driven Development (TDD), and Behavior Driven Development (BDD). The theme behind these styles and approaches is to isolate the business logic – the core of your system – from everything else. Not having your business logic depend on UI, database, web services, etc. allows for more participation from business teams, simplifies change management, minimizes dependencies, and makes the software easily testable.

Hexagonal / Onion
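To make that isolation concrete, here is a minimal C# sketch of the ports-and-adapters idea. The names (Order, IOrderRepository, PlaceOrderService, SqlOrderRepository) are made up for illustration, not taken from any framework:

public class Order { /* domain entity – details omitted */ }

// Port: the core defines what it needs, with no reference to any database or UI.
public interface IOrderRepository
{
    void Save(Order order);
}

// Core business logic depends only on the port (the interface).
public class PlaceOrderService
{
    private readonly IOrderRepository _repository;

    public PlaceOrderService(IOrderRepository repository)
    {
        _repository = repository;
    }

    public void PlaceOrder(Order order)
    {
        // Domain rules would run here – testable without UI or database.
        _repository.Save(order);
    }
}

// Adapter: infrastructure implements the port at the edge of the system.
public class SqlOrderRepository : IOrderRepository
{
    public void Save(Order order)
    {
        // ADO.NET / ORM code would go here.
    }
}

Swapping SqlOrderRepository for an in-memory fake is all it takes to unit test the core – which is exactly the testability payoff these styles promise.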

The next challenge was scale. As compute became cheaper, technology became a way of life, causing disruption and challenging the status quo of established players across industries. The problems are different now: we are no longer talking about apps internal to an organization, or mainframes where users are OK with longer wait times. We are talking about a global user base expecting sub-second response times. The simpler approaches to scale were better hardware (scale up) or more hardware (scale out). Better hardware is simple but expensive; more hardware is affordable but complex. More hardware means your app runs on multiple machines, with user data distributed across those machines. This leads us to the famous CAP (Consistency, Availability and Partition tolerance) theorem. While there are many articles on CAP, it essentially boils down to this: network partitions are unavoidable and we have to accept them, which forces us to choose between availability and consistency. You can choose to be available and return stale data, becoming eventually consistent, or you can choose to be strongly consistent and give up availability (i.e. return an error for missing data – e.g. when you read from a different node than the one you wrote to). Traditional database servers are consistent and available (CA) with no tolerance for partition (a single active DB server catering to all requests). Then there are NoSQL databases with master-slave replication, configurable to support strong consistency or eventual consistency.

CAP Theorem
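A toy C# sketch of that choice (purely illustrative – no real database involved): a replica that hasn't caught up with the primary can either serve the stale value (favoring availability) or refuse the read (favoring consistency).

using System;

// A replica node that lags behind the primary's latest write.
public class ReplicaNode
{
    private string _value = "old value";  // the primary already holds "new value"
    private bool _caughtUp = false;       // replication hasn't completed yet

    public string Read(bool requireStrongConsistency)
    {
        if (requireStrongConsistency && !_caughtUp)
            throw new InvalidOperationException("Replica not consistent yet – sacrificing availability (CP).");

        return _value;  // possibly stale – sacrificing consistency (AP)
    }
}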

Apart from scale, today's technology systems often have to deal with contention – e.g. selecting an airline seat, or a deeply discounted product that everyone wants to buy on Black Friday. As a multitude of users try to access the same piece of data, contention builds. Scaling can't solve contention; it can only make it worse (imagine having multiple records of the same product inventory within your system). This led to specific architecture styles like CQRS (Command Query Responsibility Segregation) and Event Sourcing. CQRS, in simple terms, is about separating writes (commands) from reads (queries). With writes and reads having separate stores and models, both can be optimally designed. Write stores in such scenarios typically use Event Sourcing to capture entity state for each transaction. Those transactions are then played back to the read store, making the write and read stores eventually consistent. Being eventually consistent has implications, and these need to be worked through with the business to keep the customer experience intact. E.g. Banana Republic recently allowed me to order an item. They took my money, and later, during fulfillment, they realized they were out of stock (that is when things became eventually consistent). They refunded my money, sent me an apology email, and gave me a 10% discount on my next purchase to keep me as a valued customer. As you can see, CQRS and Event Sourcing come with their own set of tradeoffs. They should be used wisely for specific scenarios rather than as an overarching style.

CQRS
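A bare-bones C# sketch of the write/read split described above. The event type and class names (SeatReserved, ReservationCommandHandler, SeatMapProjection) are hypothetical, and a real system would persist the events rather than hold them in memory:

using System.Collections.Generic;

// Command side: every accepted write is captured as an event (Event Sourcing).
public class SeatReserved
{
    public readonly string FlightId;
    public readonly string Seat;
    public SeatReserved(string flightId, string seat) { FlightId = flightId; Seat = seat; }
}

public class ReservationCommandHandler
{
    private readonly List<object> _eventStore = new List<object>();

    public void ReserveSeat(string flightId, string seat)
    {
        // Validate the command against current state, then append an event.
        _eventStore.Add(new SeatReserved(flightId, seat));
    }

    public IEnumerable<object> Events { get { return _eventStore; } }
}

// Query side: events are played back into a read-optimized model (eventually consistent).
public class SeatMapProjection
{
    private readonly Dictionary<string, string> _seatByFlight = new Dictionary<string, string>();

    public void Apply(object evt)
    {
        var reserved = evt as SeatReserved;
        if (reserved != null)
            _seatByFlight[reserved.FlightId] = reserved.Seat;
    }
}

The gap between appending the event and applying it to the projection is exactly the window of eventual consistency discussed above.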

Armed with the above knowledge, you are probably wondering: can we use these different architecture styles within a single system? For instance, have parts of your system use 2-tier, others CQRS, and others Hexagonal architecture. While this might sound counterproductive, it actually isn't. I remember building a system for healthcare providers where every use case was different – appointments, patient registration, health monitoring, etc. Using a single architecture style across the system was definitely not helping us. Enter Microservices. The microservices architecture style recommends breaking your system into a set of services. Each service can then be architected independently, scaled independently, and deployed independently. In a way, you are now dealing with vertical slices of your layers. Having these slices evolve independently allows you to adopt a style that best fits the slice in context. You might ask: while this makes sense for the architecture, won't you just have more infrastructure to provision, more units to deploy, and more stuff to manage? You are right, and what really makes microservices feasible is the agility ecosystem of cloud, DevOps, and continuous delivery, which brings automation and sophistication to your development processes.

Microservices

So does this evolution make sense? Are there any gaps here that I could fill? As always, I look forward to your comments.

Image Credits: Hexagonal Architecture, Microservices Architecture, 3-Tier architecture, client server architecture, CQRS architecture, Onion architecture, CAP Theorem


Dependency Inversion, Dependency Injection, DI Containers and Service Locator

Here is a post I have been planning to write for years 🙂 . Yes, I am talking about Dependency Injection, which by now has most likely made its way into your code base. A recent discussion made me revisit a few thoughts around it, so I am writing them down.

Dependencies are the norm in object oriented programming, helping us adhere to software principles like Single Responsibility, Encapsulation, etc. Instead of establishing dependencies through direct references between classes or components, they are better managed through an abstraction. Most of the GoF patterns are based on this principle, which is commonly referred to as 'Dependency Inversion'. Using the Dependency Inversion principle we can create flexible object oriented designs, making our code base reusable and maintainable.

To further enhance the value proposition of dependency inversion, you can pass dependencies in via a constructor or a property from the root of your application, instead of instantiating them within your class. This allows you to mock / stub your dependencies and makes your code easily testable (read: unit testing).

E.g.

IA ob = new A(); //direct instantiation
public D(IA ob) { ... } //constructor injection
public IA Ob { get; set; } //property injection – create the object and set the property
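Here is a slightly fuller sketch of the testability benefit, with hypothetical names (INotifier, OrderProcessor, StubNotifier) invented for illustration:

public interface INotifier
{
    void Send(string message);
}

public class OrderProcessor
{
    private readonly INotifier _notifier;

    public OrderProcessor(INotifier notifier) // the dependency is injected, not instantiated
    {
        _notifier = notifier;
    }

    public void Process()
    {
        // ... business logic ...
        _notifier.Send("Order processed");
    }
}

// In a unit test, a stub stands in for the real notifier – no mail server required.
public class StubNotifier : INotifier
{
    public string LastMessage;
    public void Send(string message) { LastMessage = message; }
}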

The entire software industry sees value in the above approach, and most people refer to this overall structure (pattern) as Dependency Injection. But beyond this, things get a little murkier.

The confusion starts with programmers who see the above simply as a standard way of programming. To them, Dependency Injection (DI) is the automated way of injecting dependencies using a DI container. (A DI container is at times also referred to as an IoC (Inversion of Control) container. As Martin Fowler points out in his classic article, IoC is a generic term used in quite a few contexts – e.g. the callback approach of the Win32 programming model. The term DI was coined to avoid confusion and keep this approach discernible. Here, IoC refers to inverting the creation of dependencies: they are created outside of the dependent class and then injected.) If you are surprised by this statement, let me tell you that 90% of the people who have walked up to me to ask whether I was using DI really wanted to know about the DI container and my experience with it.

So what the heck is this 'DI container'? A DI container is a framework which can identify constructor arguments or properties on the objects being created and automatically inject them as part of object creation. Coming from the .NET world, the containers I use include StructureMap, Autofac and Unity. DI containers can be wired up with a few lines of code at the start of your program, or you can even specify the configuration in an XML file. Beyond that, containers are transparent to the rest of your code base. Most containers also provide AOP (Aspect Oriented Programming) functionality and its variants. This allows you to bundle cross-cutting concerns like database transactions, logging, caching, etc. as aspects and avoid boilerplate code throughout the system (I have written a CodeProject article along those lines). Before you feel I am oversimplifying things, let me state that if you haven't worked with DI containers before, you are likely to face a learning curve. As with most other frameworks, a pilot is strongly recommended. As a side note, the preferred injection rule is: unless your constructor requires too many parameters (ensure you haven't violated SRP), resort to constructor injection and avoid property injection (see Fowler's article for a detailed comparison between the two injection types).

Finally, let's talk about Service Locator, an alternative to DI. A Service Locator holds all the services (dependencies) required by your system (registered in code or via a configuration file) and returns a specific service instance on request. A Service Locator can come in handy where a DI container is not compatible with a given framework (e.g. WCF, ASP.NET Web API) or where you want more control over object creation (e.g. creating the object late in the cycle). While you can mock a service locator, doing so is a little more cumbersome compared to DI, and Service Locator is generally seen as an anti-pattern in the DI world. Interestingly, most DI containers offer APIs which allow you to use them as a Service Locator (look for Resolve / GetInstance methods on the container).
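For illustration, a hand-rolled locator might look like the minimal sketch below (a simplification of what real containers expose through Resolve / GetInstance):

using System;
using System.Collections.Generic;

public static class ServiceLocator
{
    private static readonly Dictionary<Type, Func<object>> Registrations = new Dictionary<Type, Func<object>>();

    public static void Register<T>(Func<T> factory) where T : class
    {
        Registrations[typeof(T)] = () => factory();
    }

    public static T GetService<T>() where T : class
    {
        return (T)Registrations[typeof(T)]();  // the caller pulls its dependency on demand
    }
}

// Usage – note the hidden dependency on the locator itself, which is what makes testing cumbersome:
// ServiceLocator.Register<IDependency>(() => new Dependency());
// var dep = ServiceLocator.GetService<IDependency>();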

The sample below shows StructureMap – a DI container – in action (use NuGet to add the StructureMap dependency).

using System;
using StructureMap;

class Program
{
    static void Main(string[] args)
    {
        // Wire up the container once, at the start of the program.
        var container = new Container(registry =>
        {
            registry.Scan(x =>
            {
                x.TheCallingAssembly();
                x.WithDefaultConventions(); // maps IDependency -> Dependency by naming convention
            });
        });

        var ob = container.GetInstance<DependentClass>(); // container injects IDependency
        ob.CallDummy();
    }
}

public interface IDependency
{
    void Dummy();
}

public class Dependency : IDependency
{
    public void Dummy()
    {
        Console.WriteLine("Hi There!");
    }
}

public class DependentClass
{
    private readonly IDependency _dep;

    public DependentClass(IDependency ob)
    {
        _dep = ob;
    }

    public void CallDummy()
    {
        _dep.Dummy();
    }
}

I will try to post about some subtle issues around DI containers in the future. I hope the above helps anyone looking for a quick start on DI and its associated terms 🙂 .

Pivot, PivotTable and PowerPivot

OK, I am not a data guy. Or let me say I am not an Excel (MS Excel) guy. I have always believed that if you are good at PowerPoint (along with Visio) you are probably an architect, and if you are good at Excel you are most likely a manager 🙂 . But not knowing Excel well has always given me a feeling that something is missing. After all, it's the same data, but the way some people present it, they just make it look so good. While it takes time to develop those skills, understanding Pivot and related features is usually a good step in that direction.

Excel is primarily a spreadsheet tool into which you can dump some data. At times, though, raw data is not everything. For example, your retail transactional system captures sales across product lines, but if a manager needs to predict his stock for the next fiscal year based on historic sales, he has to make some sense out of that data. That's where you pivot the data. The word 'pivot', as Wikipedia generally describes it, is the center point of any rotational system; you have seen one in action while helping your kids on a seesaw. A pivot table, on the other hand, is a data summarization tool found in spreadsheet software like Excel. To bridge the tool and the word: you are bending your data to a specific viewpoint using functions like sort, total, average, count, etc. Let's see it with the help of a trivial example. Consider the sample data in Excel below. To bend it into a monthly stock view, click on the Insert tab and insert a Pivot Table. This gives you an empty pivot table and a PivotTable field list. Drag and drop your fields to specific areas to bend your data to a viewpoint (N.B. you can have multiple fields for each area, control them via expand / collapse buttons, and all your charting knowledge works as-is with pivot tables).

PowerPivot, as the name suggests, elevates the pivot table feature to the next level, making it a full-fledged self-service BI tool. An Excel add-in, PowerPivot connects to almost any external data source, including SQL Server Analysis Services. Once fetched, the data stays inside the workbook, and thanks to excellent compression techniques the file size on disk remains relatively small. You can share your PowerPivot workbooks via SharePoint and configure refresh cycles to ensure your team always makes decisions based on recent data. The pivoting experience itself remains the same as shown earlier.

Hope that helps 🙂 .

What is DMZ?

A very brief introduction. The DMZ is an element which most architects miss out on in their deployment architectures (except the few who run their designs through IT pros). The term stands for "Demilitarized Zone", an area often found at the perimeter of a country's border, typically left unguarded under treaties between two or more countries.

In the IT domain, a DMZ is a separate network. So why create separate networks, or DMZs? The simplest answer is to enhance security. For example, consider hosting the public-facing websites of your business. You might want to host these sites inside a DMZ, separate from your corporate network. If the security of your site is compromised, your corporate network is still safe. Many architects also prefer hosting web and database servers in different DMZs to ensure there is no compromise of data in case a hacker breaks into the web servers. As elsewhere, you can use routers to transfer data to DMZ networks. While a DMZ is a separate network, you must still pack enough defense into it: an enterprise firewall (with redundancy, of course) is the minimum recommendation, and enterprises are known to deploy a separate firewall for each of their DMZs / networks. Below is a simplistic diagram of a DMZ deployment.

Hope this helps 🙂 !

ConnectionTimeout vs CommandTimeout

ConnectionTimeout, a property of the Connection class in ADO.NET, is the time you are willing to wait for a connection to a given database before flagging a connection failure. The default value is 30 seconds. Note that SqlConnection.ConnectionTimeout is read-only; you set it via the connection string.

var connection = new SqlConnection("Data Source=.;Connect Timeout=10"); //ConnectionTimeout now returns 10

CommandTimeout, a property of the Command class in ADO.NET, is the time you are willing to wait for a command (query, stored procedure, etc.) to return a result set before flagging an execution failure. The default value for CommandTimeout is also 30 seconds.

var command = connection.CreateCommand();
command.CommandTimeout = 10;

Unlike ConnectionTimeout, which is part of the connection string, command timeouts are defined separately (hardcoded or in appSettings). Setting either of them to 0 (zero) results in an indefinite wait, which is generally considered bad practice. Ideally you set both in accordance with your performance SLAs, though at times, for data-heavy background jobs / workflows, you may want to set CommandTimeout to a larger value.
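Putting the two together, a sketch like the one below shows where each timeout lives (the database and table names are made up, and the values are illustrative):

using System.Data.SqlClient;

using (var connection = new SqlConnection(
    "Data Source=.;Initial Catalog=Orders;Integrated Security=true;Connect Timeout=10"))
{
    connection.Open(); // fails after ~10 seconds if the server can't be reached

    using (var command = connection.CreateCommand())
    {
        command.CommandText = "SELECT COUNT(*) FROM dbo.Orders";
        command.CommandTimeout = 60; // a data-heavy query gets a larger window
        var count = command.ExecuteScalar();
    }
}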

Association vs. Dependency vs. Aggregation vs. Composition

This might sound like a preliminary topic, but I recently had a healthy discussion around these terms and thought of sharing a few thoughts here. The description below is in the context of class diagrams, and this post uses UML to express each relationship in the form of a diagram.

Association is a reference-based relationship between two classes. Here class A holds a class-level reference to class B. Association is represented by a line between the classes, with an arrow indicating the navigation direction. If the arrow is on both sides, the association has bidirectional navigation.

class Asset { ... }
class Player
{
    Asset asset;
    public Player(Asset purchasedAsset) { ... } /*Set the asset via constructor or a setter*/
}

Dependency is often confused with Association. A dependency is normally created when you receive a reference to a class as part of a particular operation / method. It indicates that you may invoke one of the APIs of the received class reference, and that any modification to that class may break your class as well. Dependency is represented by a dashed arrow starting from the dependent class and pointing to its dependency. Multiplicity normally doesn't make sense on a dependency.

class Die { public void Roll() { ... } }
class Player
{
    public void TakeTurn(Die die) /*Look ma, I depend on Die and its Roll method to do my work*/
    {
        die.Roll(); ...
    }
}

Aggregation is essentially the same as association and is often seen as a redundant relationship. A common perception is that aggregation represents one-to-many / many-to-many / part-whole relationships (i.e. higher multiplicity), but these can of course be represented via plain association too (hence the redundancy). As aggregation doesn't convey anything more effective about a software design than an association, many developers skip its UML notation (a hollow diamond at the "whole" end) and just draw an association. You can give aggregation a miss unless you use it to convey something special.

class Asset { ... }
class Player
{
    List<Asset> assets;
    public void AddAsset(Asset newlyPurchasedAsset) { assets.Add(newlyPurchasedAsset); ... }
    ...
}

Composition relates to instance-creation responsibility. When class B is composed by class A, a class A instance owns the creation of, or controls the lifetime of, the class B instance. Needless to say, when the class A instance is destructed (garbage collected), the class B instance meets the same fate. Composition is usually indicated by a line connecting the two classes, with a solid diamond at the end of the class that owns the creational responsibility. It's also a wrong (though common) notion that composition must be implemented as nested classes. Composition binds the lifetime of a specific instance, while the class itself may be accessible to other parts of the system.

public class Piece { ... }
public class Player
{
    Piece piece = new Piece(); /*Player owns the responsibility of creating the Piece*/
    ...
}

Though elementary, I hope the above distills your thinking 🙂

Setting Auto Commit Off SQL Server

In this post I will share a useful tip for SQL Server. While working with a live database, turning auto commit off (implicit transactions on) can be quite handy, especially for mitigating human errors. This matters most when you make an unintentional change to your database (I have seen this even in tightly controlled environments) and your only options are to restore a database backup or roll back your changes. Rollback, being the relatively easy option, can be put in place via implicit transactions (though by no means does it mitigate all human errors). Implicit transactions can be turned on via a simple T-SQL statement, or across all sessions by modifying the SQL Server Management Studio options (Tools -> Options) as shown below.

T-SQL:
SET IMPLICIT_TRANSACTIONS ON;

SSMS:

Best Practices for Versioning Builds

Build numbers are always a topic of discussion. The standard convention for a four-part build number is Major, Minor, Build and Revision. It's quite common for the build number of an internal release to QA to vary from the one released to the field. For example, the internal release number could be 3.5.20.2, while the version number of the final build (often called the release build) given to the field would be 3.5.0.0. Many teams start with the field version when they branch the baseline for a new release, increment the build number for every QA release, and increment the revision number for every nightly build. An interesting alternative I have come across is to start from the previous release's version number when branching for a new release and increment from there – e.g. while working on the 3.5 release, start your QA builds at 3.4.xx.xx and increment until you produce the final 3.5 build. The latter eases communication with internal stakeholders.

Coming to the numbers themselves, Major and Minor are best left to the marketing folks / stakeholders – after all, somebody is responsible for getting funds for your next release, and it's these folks who get them. As for the remaining two numbers, I don't see why the field would be interested in how many QA releases or nightly builds you did. So here is your opportunity to do something creative with them. One of the innovations I have come across is to use the third number to indicate changes in the public APIs the field depends upon (though by all means you should avoid changing a published API); if this number changes, it's a signal to field developers to rebuild their assemblies. The fourth number can indicate the hotfixes made to the release. To summarize: when you release 3.5.1.3, that release modifies the public APIs and carries the third hotfix since version 3.5.0.0 was released.
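In .NET, such a scheme maps naturally onto the assembly attributes; a sketch with illustrative values:

using System.Reflection;

// AssemblyInfo.cs
[assembly: AssemblyVersion("3.5.1.3")]     // Major.Minor.{public API change}.{hotfix}
[assembly: AssemblyFileVersion("3.5.1.3")] // the version shown in the file's Properties dialog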

So, how do you deal with your build numbers?

Food For Thought – Do you take your bugs to stretcher directly?

A critical bug comes to your support team. They try reproducing it and are unable to. The business folks get impatient and escalate the issue to the group President. You get a call at midnight from your boss, who tells you he is on a fire call with his boss and his boss's boss. Your credibility is at stake.

You try to understand the error. It turns out to be the ubiquitous one – "Object reference not set to an instance of an object". You pull out the source code but don't find any valid reason for the error. Debug builds are rolled out, PDBs are shipped, logging is added all over the place. A team member leaves for a site visit hoping to find something relevant. You start flipping through Release It!, start browsing Tess's blog entries – hoping for a breakthrough. You start preparing your arsenal – WinDbg + SOS, memory profilers, Fiddler, Wireshark, CLR Profiler, etc. Somehow you want to get through this one, hoping it might result in a promotion. Suddenly a thought pops into your mind – could this be caused by missing service packs? Unfortunately, not all .NET service packs are as popular as .NET 3.5 SP1. You hit a bullseye: production confirms they don't have .NET 2.0 Service Pack 2 on their machines (.NET seemed to have an innate problem handling circular references there). Plans are chalked out, approvals are sought. Things start falling back to normal. Of course, the promotion you dreamed of never comes through 😀 .
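In hindsight, a quick check of the installed runtimes would have ruled this out on day one. A rough C# sketch (the NDP registry key below is Microsoft's documented detection point for pre-4.0 frameworks; error handling omitted):

using System;
using Microsoft.Win32;

// Lists installed .NET Framework versions and their service pack levels.
using (var ndpKey = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Microsoft\NET Framework Setup\NDP"))
{
    foreach (var versionKeyName in ndpKey.GetSubKeyNames())
    {
        if (!versionKeyName.StartsWith("v")) continue;

        using (var versionKey = ndpKey.OpenSubKey(versionKeyName))
        {
            var servicePack = versionKey.GetValue("SP"); // null if no SP value is present
            Console.WriteLine("{0} SP{1}", versionKeyName, servicePack ?? "0");
        }
    }
}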

This calls for an interesting analogy. When you visit a doctor, he first checks you with a stethoscope; he doesn't take you straight to the operating theater. We as developers are technically motivated to do the latter – we want to analyze every bug by taking dumps or firing up profilers. Alas, that mostly benefits the airlines we use for emergency travel.

(P.S. A framework detector is a good tool for troubleshooting such issues. The word "you" is used here to create thrill and to hide my own not-so-infallible mindset 🙂 ).

Food For Thought – Do you accept password as clear text?

There are some subtle bugs we developers invite due to our backgrounds. I recently saw a configuration screen in an application (normally used only by an admin) asking for a connection string:

The preferred approach would have been:

Of course, the connection timeout is missing in both 🙂 .