Do Azure SQL Database External Tables have a place in a micro-service?
I was recently in discussions on using External Tables to link Azure SQL Databases across micro-service boundaries. This has led to some challenging discussions with a client and unexpected opinions internally here at sabin.io.
My simple view of a micro-service is of a data store fronted by code, which is in turn behind an API or message subscriber. Importantly, only this code accesses the store. I have arrived at this opinion through many (often heated) discussions with developers implementing services, and through working with teams breaking large services into micro-services to clarify ownership and responsibility, remove dependencies and simplify development and release. At that time, the data store was a hefty on-premises SQL Server instance and the prevailing discussions were around multi-schema vs multi-database, and the dependency concerns and side-effects of cross-database calls. When transitioning to Azure, I initially struggled with the isolation enforced by Azure SQL Database; however, I eventually realised this instilled good micro-service practice by design.
And so it is through this lens that I view External Tables. It seems I stand alone, notwithstanding the fact that all the documentation I can find describes a very specific use case, namely to support Elastic Database query (also see Querying remote databases in azure sql db and CREATE EXTERNAL TABLE (Transact-SQL)).
What makes a good micro-service
The following image illustrates my view of a good micro-service design.
What is good data design
First, let's leave aside performance and normalisation. In this context, good data design is about the service having to hand the data required to achieve its business function. One should be able to rely on the data; it should be dependable.
This is relatively straightforward when the service is the data creator or golden source. When it is not, as is often the case, things get tricky. Assuming eventual consistency will save the day fails if there is no mechanism in place to know the service has actually become consistent; some form of continuous reconciliation coupled with appropriate monitoring is therefore required.
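To make "continuous reconciliation" a little more concrete, here is a minimal sketch of one way a service could compare its local copy against the golden source. It assumes both sides can be read as snapshots keyed by primary key; the function names, the sample rows and the fingerprinting approach are all illustrative assumptions, not something the client or any Azure feature prescribes.

```python
import hashlib


def row_fingerprint(row: dict) -> str:
    """Hash a row's values in a stable column order so copies can be compared cheaply."""
    canonical = "|".join(f"{key}={row[key]!r}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(source: dict, local: dict) -> dict:
    """Compare two {primary_key: row} snapshots and classify the drift."""
    source_prints = {key: row_fingerprint(row) for key, row in source.items()}
    local_prints = {key: row_fingerprint(row) for key, row in local.items()}
    return {
        # Rows the golden source has but the local copy lacks.
        "missing": sorted(source_prints.keys() - local_prints.keys()),
        # Rows the local copy holds that the golden source does not know about.
        "unexpected": sorted(local_prints.keys() - source_prints.keys()),
        # Rows present on both sides whose contents differ.
        "changed": sorted(
            key
            for key in source_prints.keys() & local_prints.keys()
            if source_prints[key] != local_prints[key]
        ),
    }


# Hypothetical sample data standing in for the golden source and the local store.
source_rows = {
    1: {"name": "Ada", "tier": "gold"},
    2: {"name": "Bob", "tier": "silver"},
    3: {"name": "Cy", "tier": "bronze"},
}
local_rows = {
    1: {"name": "Ada", "tier": "gold"},
    2: {"name": "Bob", "tier": "gold"},
    4: {"name": "Dee", "tier": "silver"},
}

drift = reconcile(source_rows, local_rows)
print(drift)
```

Run on a schedule, the size of each bucket is exactly the kind of number that monitoring can alert on.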
Another regular requirement for a new service is data seeding, again requiring an initial reconciliation and then most likely some form of continuous reconciliation.
When there is drift, how should data be resent? Are different strategies required depending on the size of the drift? And for large data volumes, how can this be done efficiently with zero or next to zero downtime?
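I do not have a definitive answer to these questions, but one way to frame them is as a decision based on the proportion of rows that have drifted. The sketch below is purely illustrative: the strategy names and both thresholds are assumptions I have made up for the example, and real values would have to come from measuring the services involved.

```python
def choose_resend_strategy(
    drifted_rows: int,
    total_rows: int,
    row_threshold: float = 0.01,     # assumed: up to 1% drift, resend individual rows
    reload_threshold: float = 0.25,  # assumed: above 25% drift, reload everything
) -> str:
    """Pick a (hypothetical) resend strategy from the proportion of drifted rows."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    if drifted_rows == 0:
        return "none"
    ratio = drifted_rows / total_rows
    if ratio <= row_threshold:
        return "resend_rows"
    if ratio <= reload_threshold:
        return "resend_batches"
    return "full_reload"


for drifted in (0, 5, 100, 600):
    print(drifted, choose_resend_strategy(drifted, total_rows=1000))
```

The interesting engineering sits behind each label, particularly how a full reload is carried out without taking the service offline.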
What has this to do with external tables
The client's proposal was to implement an external table to seed, reconcile and refresh data from a data creator within a set of logically related micro-services, illustrated below.
My immediate pushback centred on the following concerns:
- Tightly coupled back-end dependencies
- Lack of a strong contract, and brittleness due to lack of ownership of the shared data
- Performance impact on the source from badly sized data operations
- Managing security
The real surprise for me was the response from my colleagues here at sabin.io when I reached out for support. Having put this out there expecting a "you're right and they're wrong", done and dusted, I got quite the opposite, for example:
- If this is part of a set of logically related services implemented by closely aligned teams, then what is the concern?
- Select * is not an API; however, select col1, col2, col3, etc. can be. Of course this needs to be correctly coded.
- The use case is bounded / constrained. It is easy to test and failing tests can be implemented on the shared data owner to enforce the API and constrain change.
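The last two points can be sketched in code. The example below treats an explicit column list as the contract and adds a check that fails when the owner's table stops exposing one of those columns. It is a minimal sketch under stated assumptions: an in-memory SQLite database stands in for the shared Azure SQL table, and the table names, the column names and both helper functions are hypothetical.

```python
import sqlite3

# Hypothetical contract: the only columns the consuming service may rely on.
CONTRACT_COLUMNS = ("col1", "col2", "col3")


def contract_select(table: str) -> str:
    """Build the explicit-column query that acts as the 'API' (never SELECT *)."""
    return f"SELECT {', '.join(CONTRACT_COLUMNS)} FROM {table}"


def missing_contract_columns(conn: sqlite3.Connection, table: str) -> set:
    """Return the contract columns that the owner's table no longer exposes."""
    # PRAGMA table_info is SQLite-specific; against Azure SQL one would query
    # INFORMATION_SCHEMA.COLUMNS instead.
    actual = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    return set(CONTRACT_COLUMNS) - actual


conn = sqlite3.connect(":memory:")

# The owner's table may carry extra columns; the contract ignores them.
conn.execute(
    "CREATE TABLE shared_data (col1 INTEGER, col2 TEXT, col3 TEXT, internal_note TEXT)"
)
conn.execute("INSERT INTO shared_data VALUES (1, 'a', 'b', 'private')")

# A later version of the table that has dropped a contract column.
conn.execute("CREATE TABLE shared_data_v2 (col1 INTEGER, col2 TEXT)")

rows = conn.execute(contract_select("shared_data")).fetchall()
ok = missing_contract_columns(conn, "shared_data")
broken = missing_contract_columns(conn, "shared_data_v2")
print(rows, ok, broken)
```

Placed in the data owner's test suite, a non-empty result from the check is the failing test that constrains change.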
To wrap up
I am still not convinced External Tables should be implemented in the back end of a micro-service; however, I will admit they can save on engineering time around the challenges of data seeding, data refresh and reconciliation.
Interestingly, this post is not really about whether external tables are a good thing or a bad thing. For me, the discussions around this topic really brought to light the fact that our opinions are formed by our experiences, which in turn shape our response long before the discussion has even begun.
To be of value to a client, I have to approach every problem with an open mind and accept that a variety of solutions will be put forward to achieve simplification or overcome architectural oversights. Based on my experience, I should present my point of view, pointing out what I see as the pros and cons. Importantly, regardless of the solution chosen, it is incumbent upon me to help ensure the engineering rigour needed to make it a success.