The challenges of software development at a rapidly growing fintech — Part II of II
In the first post of this series, I discussed the birth of Creditas’ Servicing Tribe and what it was like to develop a new context in an application that was already in production.
Reading the first post is essential to understanding the context of this post, so if you haven’t read it yet, you can start here.
I mentioned in my first post that, after the Calculator had been created, we faced the challenge of extracting it to a new service in order to pass the torch on to the Pricing team, whose scope is to unify all the calculations which emerged throughout the company into a single place.
I split this series into two posts:

Part I:
- Historic context
- Servicing Map context
- Challenges and opportunities of a new context
- The development process
- Other sailors on the horizon

Part II (this post):
- Integration with the Pricing team
- Success factors for extraction
- Extraction strategy (Branch by Abstraction)
- SOLID applied at the architectural level
Integration with the Pricing team
In the first part of this series, I talked about the possibility that we would adopt a new language to solve our domain problems that arose due to the new context of the Calculator born in Servicing.
The main languages we considered were Scala (for its Functional First nature), Python (for its mathematical libs), and Kotlin (for its ecosystem and compatibility with other movements that already existed inside Creditas).
Fortunately, we decided to be pragmatic. We postponed the decision to leave Ruby until we had more business certainty and a better idea of the direction the company would take with these new stacks.
At the time, the cost of creating a new context within the existing application was low, so we stuck with Ruby.
The other teams at Creditas shared this mindset, making their decisions cautiously and systematically, given how many new initiatives were appearing within the company. Thus, for the moment, the Pricing team was also conceived in Ruby, which helped reduce costs for everyone.
This gave us several advantages: there would be no Ruby learning curve across teams; we gained agility, since we wouldn't have to "translate" one language into another; and we avoided infrastructure issues, since we already had expertise in provisioning a stable Ruby production environment.
All that remained was aligning the business and the structure of our project in order to pass the (context) torch to Pricing.
Success factors for extraction
To what do we owe the success of this complex demand, delivering both the behavioral and structural value of the application, all without losing business opportunities?
This quote from Rafael Manzoni, the Tech Leader who led our team's deliveries, sums it up well:
I think that, yes, SOLID has given us dynamism and decoupling in the calculations, making it easier to configure each component. Yet the success of the extraction, in my opinion, was due to the joint use of strategic and tactical DDD patterns, such as the Context Map and the Integration between contexts using ACLs (Anti-Corruption Layers), which facilitated everything.
Note that if we wanted to change the Inter-Process Communication strategy, whether via internal controllers, HTTP requests, or any other protocol, we could simply inject a different implementation of the Gateway without much effort.
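As a rough illustration (all class names here are hypothetical, not our actual code), the application layer can depend only on an object that responds to perform, so swapping the IPC strategy means injecting a different gateway:

```ruby
# Hypothetical sketch: the use case depends only on an abstraction that
# responds to #perform, so the IPC strategy can be swapped via injection.
class HttpCalculatorGateway
  def perform(dto)
    # here we would issue an HTTP request to the Pricing service
    { source: :http, result: dto }
  end
end

class InternalCalculatorGateway
  def perform(dto)
    # here we would call the local copy of the calculator via its controller
    { source: :internal, result: dto }
  end
end

class InstallmentService
  def initialize(gateway:)
    @gateway = gateway # any object responding to #perform(dto)
  end

  def calculate(dto)
    @gateway.perform(dto)
  end
end
```

From the use case's point of view, `InstallmentService.new(gateway: HttpCalculatorGateway.new)` and `InstallmentService.new(gateway: InternalCalculatorGateway.new)` are interchangeable.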
I believe that the successful creation of the decoupled context and its ease of extraction can be attributed to the aforementioned factors, but I think there’s another special ingredient which hasn’t been mentioned yet:
A team that — above personal interests — plays to win as a whole, not only thinking about the technical solution for the Squad or the Tribe, focusing instead on solutions for the entire company.
Here are some of the challenges we reflected on during this journey:
- How will the Billing context access the Calculator with minimal coupling?
- Where is the limit when it comes to creating excessive complexity between these integrations? Should these decisions be postponed due to the possible technical debts or advance them at the risk of possibly designing these abstractions incorrectly?
- How would the integration tests be done in the layers of the Billing application? Reliability with duplication versus confidence with DRY?
- How will the design of the Calculator’s Use Cases work, given that it can become an API in the future?
- What on earth is Integration between Bounded Contexts, and why won’t anyone let me code in peace?
All jokes aside, it was the know-how of some very experienced developers, many debates, and our DDD Study Groups that enabled us to learn and apply everything the way we did.
Are you interested in these challenges and study groups? Let’s get to know each other.
Extraction strategy (Branch by Abstraction)
We started the extraction by making a copy of a context to a new service in Rails API, which would be accessed by our main application via HTTP calls. Thus, we would have the opportunity to learn how the integration would behave in this new scenario. Everything was looking good until, right from the start, new challenges arose: Reliability and Latency.
We had to address Reliability first, since the service was new and we had to ensure the calculations wouldn't fail during our daily processes. We therefore proceeded with a Branch by Abstraction strategy: we replaced the calculator call with an abstraction that let us use the new flow and fall back to the old flow if the former failed. Basically, all the classes that use the Calculator at the application layer of our codebase came to use one service and two gateways:
- External gateway, which performs the calculation via an HTTP request
- Internal gateway, which calls the controller of the local copy of the calculator
Our Fallback component, which switches between the legacy flow and the new flow, is used something like this:
```ruby
# service.rb
@fallback_wrapper.new(
  main: @http_calculator_gateway,
  fallback: @internal_calculator_gateway,
  error_event_name: 'installment_calculation_failed'
).perform(dto)
```
The Fallback component sends a perform message to the HTTP gateway, passing a DTO (Data Transfer Object) as a parameter. If the main flow fails, the internal gateway flow is used with the same DTO, and the error event 'installment_calculation_failed' is logged as a Transactional Event in New Relic.
This strategy ensures proper function using both flows and helps us guarantee our service’s reliability.
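A minimal sketch of what such a fallback component might look like (the real implementation surely differs; the names and the logging mechanism here are assumptions based on the snippet above):

```ruby
# Hypothetical sketch of a fallback component: try the main gateway and,
# if it raises, log the event and delegate to the fallback gateway.
class FallbackWrapper
  def initialize(main:, fallback:, error_event_name:)
    @main = main
    @fallback = fallback
    @error_event_name = error_event_name
  end

  def perform(dto)
    @main.perform(dto)
  rescue StandardError => e
    log_error(e) # in production this would record a Transactional Event
    @fallback.perform(dto)
  end

  private

  def log_error(error)
    warn("#{@error_event_name}: #{error.message}")
  end
end
```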
We then had to deal with Latency since we now had to perform an HTTP call for each installment. The problem was that some use cases calculated 180 installments at a time. Our solution was to abstract the gateways and services so they made batch calls. We lowered our latency from minutes to mere seconds.
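The batching idea can be sketched like this (made-up names; FakeTransport stands in for the real HTTP client): one round trip carries the whole installment plan instead of one request per installment.

```ruby
# Hypothetical sketch of the batching abstraction. FakeTransport stands in
# for the real HTTP client and counts round trips.
class FakeTransport
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def post(payload)
    @calls += 1 # one network round trip per batch, not per installment
    payload.map { |dto| dto.merge(total: dto[:principal] + dto[:interest]) }
  end
end

class BatchCalculatorGateway
  def initialize(transport:)
    @transport = transport
  end

  # Sends the whole installment plan (e.g. 180 installments) in one request.
  def perform(dtos)
    @transport.post(dtos)
  end
end
```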
Our Billing context continues using the internal Calculator as a Fallback in some cases, but our monitoring of these errors in APM has already shown us that these parts of the flow will soon become obsolete because of how rarely these errors are logged. In other words, we’ll soon be able to delete the legacy code and depend solely on our Pricing microservice!🎉
SOLID applied at the architectural level
The principles of SOLID were applied in several ways throughout this process, from creation to extraction. Here are some highlights:
SRP: Single Responsibility Principle
More than simply "do one thing", the segregation of the Calculator communicates what Uncle Bob argues is more important: a module should have "only one reason to change". Given that this module was isolated in a new context, the Calculator's experimental and volatile business interests (e.g., new policies every week) don't interfere with the responsibilities of Billing and Onboarding (origination of new credit). With no shared responsibilities between contexts, it was easy to work on different fronts of the same codebase in parallel.
OCP: Open Closed Principle
We protected the high level (business rules) from the low level (communication between contexts) through hexagonal architecture. Our use cases — such as the daily processing of the portfolio and the calculation of the inflation index — pointed towards the domain, not the other way around. Thus, the extraction of the calculator was simple and the more stable classes of Billing didn't need to be altered. Moreover, the installment-calculating components could be configured with policies without altering the calculation bases. This way, we can extend our functionality without modifying it.
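For illustration only (hypothetical names, with amounts kept in integer cents), policies can be injected into the calculator, so a new pricing rule extends behavior without touching the calculation base:

```ruby
# Hypothetical sketch: the calculator is closed for modification but open
# for extension through injected policies (amounts kept in integer cents).
class InstallmentCalculator
  def initialize(policies: [])
    @policies = policies # each policy responds to #apply(amount_cents)
  end

  def calculate(base_cents)
    @policies.reduce(base_cents) { |amount, policy| policy.apply(amount) }
  end
end

# Adding a new rule means adding a new policy class, not editing the calculator.
class LateFeePolicy
  def initialize(fee_cents)
    @fee_cents = fee_cents
  end

  def apply(amount_cents)
    amount_cents + @fee_cents
  end
end
```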
LSP: Liskov Substitution Principle
With small adjustments, we can easily create Fallbacks that share the same contract (a common interface with identical inputs and outputs). In other words, we can substitute one implementation for another and expect the same behavior. Another interesting example: regardless of the type of calculation performed, all the Node and Leaf components expose standardized interfaces.
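A toy version of that contract (names assumed, not our production code): Nodes and Leaves respond to the same message, so any component can stand in for any other in a composition.

```ruby
# Toy version of the standardized Node/Leaf contract: both respond to #value,
# so any composition can substitute a Leaf for a Node and vice versa.
class Leaf
  def initialize(amount)
    @amount = amount
  end

  def value
    @amount
  end
end

class Node
  def initialize(children)
    @children = children # Leaves or other Nodes: same contract
  end

  def value
    @children.sum(&:value)
  end
end
```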
ISP: Interface Segregation Principle
As we know, the ISP declares that clients shouldn't be forced to depend on methods they don't use. At the code level, we could simplistically say "no methods should be unused". At the architectural level, the ISP reveals itself in a more subtle way. Its similarity to the SRP is noteworthy, but while the SRP is about reasons to change, the ISP concentrates on the logic of communication between clients. The following example should make this clearer.
Our first version bundled three use cases into one Service: calculating the components of an installment, creating an installment plan, and calculating the inflation index. To avoid cross-module dependencies when creating strategies for flow bottlenecks — reducing complexity and mitigating risk — we segregated the interfaces into more granular pieces (three services), allowing us to proceed with the extraction incrementally and more safely.
DIP: Dependency Inversion Principle
By inverting the direction of control, we guarantee that the business layers of our architecture stay in the dark about how or where calculations are done. We injected the HTTP calling strategy via the gateway and the internal calling via internal controllers, depending only on abstractions and thereby inverting control. This encapsulated batch processes so as not to add complexity to the orchestration (Application) layer. Besides this, the Repository pattern we applied — even in Ruby's style, without an injection container or strongly typed interfaces — is an excellent example of how we hand low-level control to these classes, keeping the domain unaware of persistence strategies.
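A minimal, hypothetical sketch of the Repository pattern in Ruby's duck-typing style: the use case depends only on "anything that responds to save", so persistence strategies stay outside the domain.

```ruby
# Hypothetical sketch: the use case depends on an injected abstraction that
# responds to #save, so persistence details never leak into the domain.
class InMemoryInstallmentRepository
  def initialize
    @records = []
  end

  def save(installment)
    @records << installment
    installment
  end

  def all
    @records
  end
end

class CreateInstallment
  def initialize(repository:)
    @repository = repository # abstraction, injected from the outside
  end

  def call(attributes)
    @repository.save(attributes)
  end
end
```

Swapping `InMemoryInstallmentRepository` for a database-backed one requires no change to `CreateInstallment`.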
Here I end this series of posts, in which I have described, in a non-exhaustive manner, the application of system-development principles that tackle the problems of complex domains in constant evolution, where decisions must be made with great caution.
I hope to have prompted constructive reflections and to have materialized this abstract world, which usually resides in books, with real examples we've run into here at Creditas.
Leave your cases in the comments so we can learn more together!
As a dev and ex-designer, I still do some scribbling between coding sessions. Of course, any iterative work — including writing code — produces rough drafts before the final version. I chose to illustrate this post with SOLID on a tape measure rather than SOLID as architecture to show that, despite all the architectural lectures, we rely on our fundamental daily tools to be excellent software masons.
Thanks for your time and until next time, folks!
Want to use technology to bring innovation to the loan market? We’re always looking for people to join our Crew!
Check out our openings here.