paul vidal - pragmatic big data nerd

Data As A Microservice: the future of data architecture

by paul

Let me preface this article with an understatement: sometimes, enterprise architecture can be complicated. Large companies run thousands of applications, multiplied by dozens of environments replicated for testing, user testing, and sandboxing, accumulated over years of acquisitions, re-architecturing (yes, it is a word I made up), and experiments, all with the purpose of driving business forward. As with any complex system, people have been trying to make sense of it by conceptualizing models and architectures aimed at simplifying the system, thus making it more efficient, robust, scalable, secure, and spiritually virtuous (OK, maybe not the last part, although can a piece of software be inherently virtuous? A question for another day). With all this in mind, I would like to take some time to reflect on one of these concepts, microservices, and how it can apply in the realm of data management.

Microservices VS Enterprise Service Bus

First discussed at a workshop of software architects held near Venice in May 2011, Microservice Architecture is defined by James Lewis and Martin Fowler as follows:

The term “Microservice Architecture” has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics around organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data.

Microservice architecture is a subset of Service-Oriented Architecture (SOA). It aims at distributing microcomponents to deploy applications, as opposed to relying on a centralized application integration layer, often called Enterprise Application Integration (EAI) or Enterprise Service Bus (ESB). Leaving aside the obvious angry-developer argument that all of this is marketing jargon and a rebranding of the same products, it’s interesting to take note of a fundamental trend I covered before in this blog: enterprises are looking to implement agile environments, with extremely granular elements, in order to ensure business reactivity. The dream of the all-integrated, all-consolidated enterprise layer is fading.

Data As A Microservice

In a very similar manner, the idea of a single source of truth containing all the enterprise data is coming to an end. And, unlike what some data lake proponents would like to make you believe, it is not because of the pitfalls of traditional technologies that can’t handle large volumes of data or distribute them efficiently. Building a single centralized source of data is a utopia. Instead, companies are now shifting their focus towards platforms enabling rapid agnostic data integration, agile data schema modification, and complete distribution. These platforms can then be used in a microservice architecture, making them Data As A Microservice platforms. I’ll admit, I may have made that term up too because it sounds cool, but it is very important to note, whether you are a data vendor, a data scientist or a data consumer (CIO and CTO organizations). The future of data is microservice-like agility, not monolithic unification.

References:

http://martinfowler.com/articles/microservices.html
http://stackoverflow.com/questions/25501098/difference-between-microservices-architecture-and-soa
https://www.voxxed.com/blog/2015/01/good-microservices-architectures-death-enterprise-service-bus-part-one/
https://en.wikipedia.org/wiki/Microservices

What is the most underrated aspect of software development and why is it measurability?

by paul

Designing and developing software is complicated. I have heard there might even be a full industry gathering experts in this domain, and that it could be doing well. Not sure if it will ever be a thing. All joking aside, theories about the optimum way to approach software development are numerous and constantly evolving, which is excellent. Today, however, I want to talk about an underrated concept, especially within the realm of software development: measurability. Despite what online dictionaries say, I’m pretty sure I just made up that word, or at least the concept attached to it vis-a-vis software development, so let me define it.

What do you mean by measurability and why should I care about it?

Within the realm of software development, measurability belongs in the same category as other transversal high-level concepts that must be considered at each and every step of the development process, such as user experience, performance, scalability, re-usability and security. Measurability in this sense is the idea that each and every feature you develop for your software can be measured for popularity and efficacy in order to ultimately evaluate its necessity. That is a lot of y-ending words, which should have convinced you already. Hoping it didn’t, let me explain why it is important to consider.

First, I believe that the importance of these types of high-level concepts does not need further justification: we have all witnessed software failures when development ignored one of these key concepts, security being the one making the front page most often. The impact of measurability is more subtle but nonetheless crucial. Without measurability, the decisions you make about feature prioritization or design become irrational. For instance, if you are developing an API that offers multiple methods of access and you are unable to measure their popularity or efficacy, you will end up with either features that are costly to maintain for no benefit to your end user, or features that are massively used out of necessity but incrementally build up your end user’s frustration.

This is a very simple example, but it illustrates an underlying notion that we rarely see in the world of zeros and ones: irrationality. Indeed, a piece of software is usually extremely rational and quantifiable, which makes evaluating performance, scalability, security or even re-usability a relatively easy mathematical problem. With software becoming mainstream, user experience has moved to the forefront of Agile development, making customer feedback a key piece of feature release. What I am proposing here is to go one step further. Whenever you develop a feature for your software, you should ask yourself: how will I know if this feature is necessary or not? How will I test for it?

Implementing measurability

Implementing measurability acknowledges the fact that you are operating in an uncertain environment, which inherently makes its implementation uncertain. That being said, a good starting point is to measure each feature’s use and performance and then compare them to those of the other features you develop. This measurement and analysis can be done using trace or audit mechanisms, which, bonus, you should implement anyway to cater to security. A more robust approach would be to first select the metrics you want to measure for each software feature and have a dedicated module to implement measurability over those metrics. You may think it’s overkill, but with the advent of scalable and cheap storage, why not do it?
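
To make the idea of a dedicated measurability module a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `measured` decorator, the in-memory `_metrics` store, the `report` function and the two example features (`search_by_id`, `search_by_name`) are hypothetical names, and the three metrics chosen (call count, error rate, average latency) are just one possible reading of "popularity and efficacy". A real system would likely persist these numbers rather than keep them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: feature name -> raw counters.
_metrics = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def measured(feature):
    """Record usage, failures and elapsed time for the decorated function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                # Count the failure, then let the caller handle it as usual.
                _metrics[feature]["errors"] += 1
                raise
            finally:
                # Runs on success and on failure: every call is counted.
                stats = _metrics[feature]
                stats["calls"] += 1
                stats["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


def report():
    """Return one row per feature, most used first."""
    rows = []
    for feature, s in _metrics.items():
        calls = s["calls"]
        rows.append({
            "feature": feature,
            "calls": calls,
            "error_rate": s["errors"] / calls if calls else 0.0,
            "avg_seconds": s["total_seconds"] / calls if calls else 0.0,
        })
    return sorted(rows, key=lambda r: r["calls"], reverse=True)


# Two hypothetical access methods of the same API, both measured.
@measured("search_by_id")
def search_by_id(item_id):
    return {"id": item_id}


@measured("search_by_name")
def search_by_name(name):
    if not name:
        raise ValueError("name is required")
    return {"name": name}
```

Calling `report()` after some traffic gives a side-by-side view of the features, which is the comparison described above: a rarely called feature is a candidate for removal, while a heavily used one with a high error rate is a candidate for redesign.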

Beyond software development

Big Data, monitoring, analysis, data science: all of these concepts are designed to increase the world’s measurability, and they are definitely what everyone talks about now. And while the idea of being data driven has spread to every aspect of our lives, from corporate management to personal fitness, it has yet to really make an impact within the realm of software development, or at least the tools dedicated to measurability alone are scarce. That being said, making rational decisions does not seem to be as appealing to the rest of the world as it is to me, which could explain this scarcity.