Let’s face it: micro-services are old, they are not innovative, and they were not designed with cloud native applications in mind. The concept of micro-services must be generalized to encourage the appropriate use of complementary patterns, such as serverless functions, actors and batch tasks, within a context of reactive programming.
Most of us know the development of distributed computing through a business context in which even OSS projects have been guided by the interests of technology providers. Evolution has been slow and repetitive, in an effort to recover the huge investments made in different solutions. Historically, the concept of well-defined, idempotent and immutable services is at the heart of HTTP and of the programming that preceded Object Orientation. The start of SOA and the ESBs was promising: distributed and service oriented. The result, after passing through the providers, was a forced adaptation to take advantage of the investment in enterprise message brokers and in proprietary solutions justified under the WS-* umbrella, tying services to specific technologies that are, most of the time, not necessary. The boom in the adoption of REST and micro-services has only brought us back to the promising state before the proliferation of SOA.
There are exceptions, such as Erlang, which was designed with distributed computing in mind and which inspired actor-based solutions like Akka and the Orleans project. In the academic and scientific world, the vision was different: concepts prior to the Cloud, under the name of Grid computing, encouraged the ubiquity of computing and the use of autonomous agents for processing. This set of tools was again adopted by suppliers under the Cloud concept, relegating the beauty of distribution and autonomy to a set of centralized but automated data center providers. In this case, the investment to be protected is not only the investment in data centers, but also the investment in virtualization technologies. The promise of managed systems is marred by the fact that, until the Cloud Native Computing Foundation, there was no real collaborative effort to provide common APIs and allow true distributed computing.
During the adoption of micro-services, the providers have again conveniently concentrated the discussion on the “where” and not the “how”, protecting investments in platforms that were not created to support real micro-services and creating the artificial discussion about the size of a micro-service (nano, micro or macro?). In practice, a micro-service is exactly large enough to contain the resources that share the same life cycle (creation, destruction, scaling) and that share the same external components (backing services, other services) from the point of view of that life cycle. This has a side effect: the generation of a significant number of micro-services that scale horizontally and use few resources, leading to a very high cardinality of instances and high dynamism in their states. Most current platforms were not designed for this scale, where instances are managed as groups and it is not possible to control a particular instance. Kubernetes stands out as a platform that supports this kind of behavior very well; conversely, it works badly with centralized and distributed monoliths.
Even assuming a suitable platform for micro-services and a correct implementation, their administration is still costly and complex, and is justified only for resources that need immediate responses, with latencies on the order of tenths of a second. In a cloud native application, most resources tolerate some delay. It is essential that requests be released as quickly as possible, using a communication mechanism where the flow always follows a single direction. An example of this type of design is Facebook’s Flux. To encourage this type of design, communication must be event-based, and even communication with the client should be asynchronous whenever possible. In the case of HTTP, a solution that fits the one-way flow model very well is to use REST for the request and SSE (Server-Sent Events) to receive events. WebSockets, which implement a full-duplex flow, should be relegated to situations where real-time bi-directional communication is needed, such as a massive online game.
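The REST-request-plus-SSE-response side of this one-way flow can be sketched minimally. The helper below only serializes a payload into the `text/event-stream` wire format that SSE clients consume; the event name and payload are illustrative assumptions, not part of any specific framework.

```python
import json

def format_sse(data, event=None):
    """Serialize a payload into the SSE wire format (text/event-stream)."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates each event

# The server accepts the REST request, releases it immediately, and later
# pushes the outcome to the client through the open SSE stream:
msg = format_sse({"orderId": 42, "status": "confirmed"}, event="order-confirmed")
```

A server would write `msg` to any client holding an open `text/event-stream` connection, keeping the request/response and event channels strictly one-way.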
When flow follows a single direction, it is not necessary to wait for the response before releasing the request. For this event-based communication model, keeping micro-services permanently active is not efficient, and serverless technologies based on functions, such as AWS Lambda, OpenFaaS or Kubeless, are more suitable. This type of technology is restricted to extremely simple and lightweight functions and is justified for responses with latencies from seconds up to a couple of minutes.
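A function in this style is deliberately tiny: it wakes up for one event, does one small piece of work, and disappears. The sketch below assumes the `handle(req)` entry point used by the OpenFaaS Python template; the event shape is an illustrative assumption.

```python
import json

def handle(req):
    """Minimal event-driven serverless function (sketch, assuming the
    handle(req) signature of the OpenFaaS Python template)."""
    event = json.loads(req)
    # One small, fast unit of work; the platform creates and destroys
    # the function instance on demand, so no state is kept here.
    return json.dumps({"received": event.get("type"), "ok": True})

# Simulate the platform invoking the function with an event payload:
result = json.loads(handle('{"type": "order-created"}'))
```

The same body would map with minor changes to an AWS Lambda or Kubeless handler; what matters is that the function holds no state between invocations.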
When tasks exceed several minutes, it is no longer necessary to have resources waiting, because the cost of initializing resources becomes insignificant. Under this model, the best solution is simply to create and destroy resources on demand, without the restrictions that serverless function systems impose to guarantee bounded start times. Today, several technologies allow this: AWS Fargate, Azure Container Instances, or simply lightweight containers such as Google Distroless and gVisor, Firecracker, or even unikernels. The question remains whether orchestration is necessary. Orchestrators like Kubernetes stand out for service discovery, health control and resource administration. The truth is that when you create and destroy resources on demand, paying the high cost of an orchestrator delivers no benefit; you only need a scheduler that understands events.
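Such an event-aware scheduler can be sketched in a few lines: on each event it creates a short-lived worker, runs the batch task to completion, and lets it die. The `run_container` callable is a stand-in assumption for whatever actually launches the workload (Fargate, ACI, a container runtime); the fake runtime below only illustrates the shape.

```python
# Sketch of a scheduler that understands events: create on demand,
# run to completion, destroy. No orchestrator, no standing resources.

def make_scheduler(run_container):
    def on_event(event):
        # Create the resource only when the event arrives.
        task = run_container(image=event["image"], args=event["args"])
        return task()  # blocks until the batch task finishes, then nothing survives
    return on_event

# Fake runtime for illustration: "runs" the task in-process.
def fake_run_container(image, args):
    return lambda: f"{image} processed {len(args)} items"

schedule = make_scheduler(fake_run_container)
out = schedule({"image": "batch-job:1.0", "args": [1, 2, 3]})
```

Swapping `fake_run_container` for a real launcher is the only change needed; the scheduling logic itself stays this small precisely because nothing is kept alive between events.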
Finally, we have the case where data latency matters more than response time, as with IoT, shopping carts, session state and, in general, everything that involves working with individual entities. Micro-services are, in general, implicitly designed for groups of entities and are stateless. This favors operations on groups of entities but is inefficient when working with individual entities. The actor model responds better to individual entities, and in this case it is better practice for each actor to control its own state. Azure Service Fabric takes this to the extreme by managing state at the platform level, through partitions, not only for actors but also for micro-services. Stateless services have been considered a good practice; however, if the platform is able to guarantee the state of the micro-service or actor in a reliable and transparent way, the resulting reduction in latency and gain in scalability is unbeatable.
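The essence of an actor owning its own state fits in a short sketch: one mailbox, one processing loop, state that only the actor touches. This is a minimal illustration in the spirit of Akka or Orleans grains, not the API of any real framework; the shopping-cart messages are illustrative assumptions.

```python
import queue
import threading

class CartActor:
    """Minimal actor sketch: a single mailbox and a single processing
    thread, so the state (the cart items) is never accessed concurrently."""
    def __init__(self):
        self.items = []                 # state lives inside the actor
        self.mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            kind, payload, reply = self.mailbox.get()
            if kind == "add":
                self.items.append(payload)
            elif kind == "get":
                reply.put(list(self.items))  # hand out a copy, never the state

    def tell(self, kind, payload=None):      # fire-and-forget message
        self.mailbox.put((kind, payload, None))

    def ask(self, kind):                     # message with a reply channel
        reply = queue.Queue()
        self.mailbox.put((kind, None, reply))
        return reply.get(timeout=1)

cart = CartActor()
cart.tell("add", "book")
cart.tell("add", "pen")
items = cart.ask("get")
```

Because every message for one entity is serialized through one mailbox, the actor can keep its state in memory without locks, which is where the latency win over a stateless service plus external store comes from.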
An important point when using micro-services, agents or batch events without falling into the anti-pattern of the centralized messaging bus is to use a cloud native messaging system. Systems like RabbitMQ or Kafka were not designed with the cloud in mind and are heavy and difficult to manage and scale. To preserve distribution, it is necessary to favor peer-to-peer messaging and use complex messaging systems only at control points. An example of a cloud native messaging system is NATS. Under an orchestrator, it is supported in Kubernetes through a Custom Resource Definition; without an orchestrator, it supports auto-discovery to create a mesh network. It is even capable of supporting multiple clusters without federation, as well as dynamic topologies.
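The subject-based publish/subscribe style that makes such a system lightweight can be illustrated with an in-memory stand-in. This is not the `nats.py` client API; it is a sketch of the interaction pattern, with illustrative subject names.

```python
from collections import defaultdict

class Bus:
    """In-memory stand-in for a lightweight subject-based messaging
    system in the style of NATS (illustrative sketch, not a real client)."""
    def __init__(self):
        self.subs = defaultdict(list)

    def subscribe(self, subject, handler):
        self.subs[subject].append(handler)

    def publish(self, subject, msg):
        for handler in self.subs[subject]:
            handler(msg)

bus = Bus()
seen = []
bus.subscribe("orders.created", seen.append)  # each service subscribes only
bus.publish("orders.created", {"id": 7})      # to the subjects it needs
```

Because services address subjects rather than a central broker topology, the messaging layer stays thin and can be distributed peer-to-peer, reserving heavier brokers for control points only.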
Once we have autonomous services communicating by events, we must implement persistence that supports this scheme. Atomic transactions are neither viable nor desirable under these conditions, so event sourcing, which is natural for distributed systems that communicate by events, should be adopted for data persistence. Under this scheme, it is interesting to make each event absolutely immutable and to converge the problems of messaging and storage toward the use of blockchain, a concept that can potentially scale massively and securely without a single point of failure.
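Event sourcing in its simplest form is an append-only log plus a fold: current state is never stored as truth, only derived by replaying immutable events. The account events below are illustrative assumptions.

```python
# Minimal event-sourcing sketch: the append-only log is the source of
# truth; state is derived by replaying (folding over) the events.

def apply(state, event):
    kind, amount = event
    if kind == "deposited":
        return state + amount
    if kind == "withdrawn":
        return state - amount
    return state

log = []                      # immutable, append-only history
log.append(("deposited", 100))
log.append(("withdrawn", 30))
log.append(("deposited", 5))

balance = 0
for e in log:
    balance = apply(balance, e)   # replay to rebuild current state
```

Because events are only ever appended and never mutated, the same log can simultaneously serve as the message stream other services consume, which is what lets messaging and storage converge.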
If we require all the patterns discussed to be autonomous, that is, not only requiring no human intervention for their creation and configuration (automation, 12factor) but also requiring no human intervention for their operation, being capable of self-healing, and continuing to work even when isolated, we get agents.
If we define micro-services as autonomous and communicating by events, and we generalize to cover the patterns of stateful/stateless micro-services, serverless functions, actors and serverless batch tasks, we obtain an architecture composed of bounded agents: micro-agents. This concept is a generalization of what is proposed in the Agent-Oriented Micro-Services article.
It is important to note that single points of failure should be avoided. Avoiding centralized messaging systems has already been discussed, but some practices encourage a single point of failure, such as the BFF pattern. In that case, the first thing is to evaluate whether a single service that centralizes the transformation is really necessary. An API gateway for services, or pub/sub hubs for events, decouples the implementation from the API/event, making it completely feasible to distribute the transformations among the micro-services. Moreover, technologies such as GraphQL decouple transformation and aggregation from the service and use a standard, preventing the bad practice of implementing business logic in the BFF and increasing stability.
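The GraphQL-style decoupling can be illustrated without the library itself: each field is answered by its own resolver, backed by its own micro-service, so aggregation happens declaratively rather than inside a hand-written BFF. The services and field names below are illustrative assumptions, and this plain-dictionary sketch stands in for a real GraphQL schema.

```python
# Sketch of resolver-based aggregation: each field resolves against its
# own backing service, so no single service centralizes transformation.

def resolve(query, resolvers):
    """Answer only the fields the client actually asked for."""
    return {field: resolvers[field]() for field in query}

resolvers = {
    "profile": lambda: {"name": "Ada"},          # user micro-service
    "orders":  lambda: [{"id": 1}, {"id": 2}],   # order micro-service
}

result = resolve(["profile", "orders"], resolvers)
```

Because the client's query drives which resolvers run, adding a new field means adding a resolver next to its owning service, not more logic in a central aggregation point.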