Our stack, from Monolithic/PHP to Microservices/Scala

Lau Chun Yin Vincent
Translate Engineer Error
7 min read · Jan 16, 2016

Yes. This post is inspired by the Medium Engineering Team. We learnt a lot from their stack and we would love to share ours, to learn more from the community. This is one of the many stories behind a group of geeks in Hong Kong building a localization start-up.

Update: we just shared our stack on StackShare as well.

Context

Our systems do a lot: processing localization files and integrating with different platforms; allocating jobs to translators around the globe and providing them with online tools; and, besides the obvious client portal, running a data pipeline for analysis.

Monolithic and those not-so-monolithic

Cancer. Terrible. Spaghetti. Don’t-touch.
There is always terror, and probably tears, in our engineers' eyes when they talk about the legacy code. We have been living with a four-year-old codebase that we find really painful to adapt to our needs today.

To clarify, being monolithic is actually not the major problem. PHP is not to blame either, although personally I do find it rather like a hammer.

Our stack is actually not so "monolithic" in that sense. Besides the main PHP-on-Zend/MySQL application, we have Android/iOS mobile apps, IDE plugins written in Java/Python, WordPress, and even standalone Node.js modules. Apart from platform-specific needs, some parts of the stack were chosen because they work better for the job. We also have the typical supporting pieces around the main app: Memcache and Firebase, plus Gearman for job distribution, which we later moved to Celery.

A history of product changes, crazy sprints and, yes, engineering mistakes contributed to that codebase. We admit we went wrong with tech-debt management and probably even rushed irrationally; we took our lessons with blood, toil, sweat, and tears.
The good news is we HAD the lessons, and we are doing much better now.

Microservices, Scala, Akka and Messaging

We did read Joel Spolsky's advice that you should never rewrite your system. Twice, in my case. We went ahead anyway, for various reasons. We can't say we got every single step right, but we can definitely say things are improving every day.

The new architecture is what is fancily called microservices today. In fact, when we kickstarted the discussions on what's next, we used the plain old term SOA (Service-Oriented Architecture), partly thanks to me, who had by then just escaped a bureaucratic bank where I worked with some top-notch Architecture Astronauts. The two can mean quite different things in an enterprise setting, but both capture what we aim to achieve.

Anyway, we are engineers and we have developed a kind of immunity to buzzwords.
We do find merits in the microservice architecture. Things are more decoupled, in terms of deployment, development and scaling, and the approach is more aligned with our working style and team structure. Besides, we are following Domain-Driven Design, and it has really transformed our development style.
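
To give a flavour of what that looks like in practice, here is a tiny sketch in the spirit of Domain-Driven Design; the domain model and names are hypothetical, not from our actual codebase. The aggregate guards its own invariants and exposes behaviour instead of bare setters, so business rules live in one place.

```scala
// A tiny DDD-flavoured sketch with a hypothetical domain model: the
// aggregate guards its own invariants and exposes behaviour instead of
// bare setters.
final case class TranslationJob(id: Long, wordCount: Int, assignee: Option[String]) {
  def assignTo(translator: String): Either[String, TranslationJob] =
    if (assignee.isDefined) Left(s"Job $id is already assigned")
    else Right(copy(assignee = Some(translator)))
}
```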

A good visual interpretation of Conway's law: Organizational Charts by Manu Cornet

We use RabbitMQ for most of the inter-service communication, and Apache Camel to aid integrations. It probably sounds a bit enterprise-ish, but so far we are glad that many of our integration patterns can be modelled quite well as messaging. Meanwhile, we are also looking into Protobuf for some of the messages, not only for speed but for better schema control.
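
As a rough illustration of that messaging style (not our actual code), here is how a service might publish a domain event to RabbitMQ from Scala using the plain RabbitMQ Java client. The exchange name, routing key and payload are made up for the example.

```scala
import com.rabbitmq.client.ConnectionFactory

object JobEventPublisher extends App {
  // Connect with the plain RabbitMQ Java client; host and credentials
  // would come from configuration in a real service.
  val factory = new ConnectionFactory()
  factory.setHost("localhost")
  val connection = factory.newConnection()
  val channel = connection.createChannel()

  // A durable topic exchange lets downstream services subscribe by
  // pattern, e.g. "job.*" for everything job-related.
  channel.exchangeDeclare("domain-events", "topic", true)

  val payload = """{"jobId": 42, "event": "job.assigned"}"""
  channel.basicPublish("domain-events", "job.assigned", null, payload.getBytes("UTF-8"))

  channel.close()
  connection.close()
}
```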

One major implication of the microservices architecture is that you structure your services around data cohesiveness. Right now each of our microservices points to its own database in AWS RDS Postgres. Eventual consistency is our usual approach to data aggregation, and we try to design away the need for distributed transactions.
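
A minimal sketch of what that eventual consistency can look like, with hypothetical names throughout: instead of joining across service databases or opening a distributed transaction, a service maintains its own read model and updates it as domain events arrive.

```scala
// Hypothetical names throughout: a service keeps its own read model and
// updates it when a domain event arrives, instead of joining across
// service databases or opening a distributed transaction.
final case class JobAssigned(jobId: Long, translatorId: Long)

trait TranslatorStatsRepo {
  def incrementAssignedJobs(translatorId: Long): Unit
}

class JobEventHandler(stats: TranslatorStatsRepo) {
  // Invoked by the messaging layer for each event. Idempotency and
  // retries are omitted for brevity, but they matter in practice.
  def handle(event: JobAssigned): Unit =
    stats.incrementAssignedJobs(event.translatorId)
}
```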

We decided to go for Scala in the new stack. It's an elegant language, type-safe (❤ shown on some members' faces) and functional. The JVM also gives us a great deal of flexibility in tooling. Many of us didn't have experience with Scala, and it does have a steep learning curve. Luckily, as it emerged as our default choice of language, we talked about it day and night, shared best practices and even formed study groups, so now we are okay. We still enjoy the microservices benefit that there is freedom to choose another stack for a service when it simply does the job better and the team on it knows it well; some of the services are written in Java/Python/Node.
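
A tiny illustration (not from our codebase) of why the team warmed up to the language: sealed traits plus pattern matching let the compiler check that every state is handled.

```scala
// Sealed traits plus pattern matching: the compiler warns us if a new
// job state is added but not handled here. Names are made up.
sealed trait JobState
case object Pending extends JobState
final case class Assigned(translator: String) extends JobState
final case class Delivered(wordCount: Int) extends JobState

def describe(state: JobState): String = state match {
  case Pending          => "waiting for a translator"
  case Assigned(who)    => s"being translated by $who"
  case Delivered(words) => s"delivered ($words words)"
}
```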

Mentioning Scala alone wouldn't be complete. We use the Akka framework heavily for many modules, some with heavy I/O tasks. It's Reactive.
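
For a flavour of the actor model, here is a minimal classic-Akka sketch with hypothetical names; in a real service, blocking I/O would be isolated on a dedicated dispatcher.

```scala
import akka.actor.{Actor, ActorSystem, Props}

// One actor per concern; messages are immutable case classes.
final case class ProcessFile(path: String)

class FileProcessor extends Actor {
  def receive: Receive = {
    case ProcessFile(path) =>
      // Parse the file, emit domain events, etc.
      println(s"processing $path")
  }
}

object Main extends App {
  val system = ActorSystem("localization")
  val processor = system.actorOf(Props[FileProcessor], "file-processor")
  processor ! ProcessFile("strings/en.json")
}
```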

There is still much that hasn't been mentioned: Drools, the rules engine, for more advanced notifications; Elasticsearch for some text-related features.

For sure, trade-offs exist for every architecture. There are bigger headaches around migration and a lot more to be told about microservices, so that will be another topic. (How many times will I say this?)

UI

Except for some legacy Backbone.js, we are mainly on AngularJS. We went through quite a lot, bumping from version 1.1 to 1.4, building many customized modules and working around some crazy bugs and performance issues buried in the dirty-checking cycles. Overall it's a nice framework with lots of useful features, especially for a relatively complex application with different team members working on modules in parallel.

The biggest thing for us is of course improving the user experience, to which speed and responsiveness contribute a lot besides the design. We believe building the UI as a Single Page App works best for us. Our UI folks are continuously improving it with the Chrome DevTools Timeline tab open every day, and HTTP/2 is very interesting in that sense. From a technical standpoint we also focus on making things more modular: yes, the idea behind web components.

For sure we care about the client-side localization solution, for which we use both our custom modules and angular-translate (parts of which were absorbed into the 1.4 i18n module). They are great, but for us there are still various limitations. We are working hard on a more elegant solution, which currently looks like a framework-agnostic component built on top of libraries like FormatJS. Another blog post.

Besides that, we use Gulp to build and Jasmine and Karma for testing; and we really like webpack. We are adopting ES2015 and keeping an eye on Angular 2.0 and React.

The infrastructure

We have been using AWS since the epoch.
Our wonderful ops engineers built out the infrastructure with Auto Scaling and a VPC network using Ansible. Back then Terraform wasn't available yet, so quite a lot of scripts were written around CloudFormation. Now we are looking into it.

Docker was introduced last year and everyone just loves it.
We are moving towards Docker in production: some of the microservices are already using it live, and our target is to land on a full-blown CoreOS + Kubernetes (so hard to spell!) infrastructure very soon.

Jenkins for CI; Datadog and Logentries for monitoring.

API

For the API, we have been powering it with PHP and Varnish. Since we moved to Scala, we have some basic HTTP endpoints using Spray.
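
Those basic endpoints follow the standard Spray routing pattern. A minimal, hypothetical example (route and port are made up):

```scala
import akka.actor.ActorSystem
import spray.routing.SimpleRoutingApp

// SimpleRoutingApp wires the routing DSL to an HTTP server.
object ApiServer extends App with SimpleRoutingApp {
  implicit val system = ActorSystem("api")

  startServer(interface = "0.0.0.0", port = 8080) {
    path("health") {
      get {
        complete("OK")
      }
    }
  }
}
```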

Part of our application has complex UI logic and real-time update requirements. Pure REST didn't work well for us due to the typical under/over-fetching problems, and pushing updates to the UI was an afterthought, so the whole thing (UI, API endpoints and cache) is not very well integrated. Microservices also pose new challenges in aggregating data from different places. We are looking into Falcor from Netflix and GraphQL/Relay from Facebook for sending the server better queries.

We also find the BFF pattern at SoundCloud inspiring. The problem itself is quite complex, as we want not only FE/BE (+ mobile!) to work in harmony but also all the efficiency, latency and consistency requirements to be met in a distributed setting. Hopefully by the time you are reading this, it's done and another blog post has been written to share our way of doing APIs.

Data pipeline

We are piping millions of rows into a Cassandra cluster. It is self-provisioned, and we create snapshots not only for backup but also for time-based comparison.

We are using Spark for analytics. The API was honestly not so friendly when we first started (it's better with Datasets in 1.6), but we find it really powerful and elegant when we go for machine learning and real-time analytics. We follow the recommended topology where Spark rides on Cassandra directly for low-latency reads. Some of the results are piped to PostgreSQL as a reporting DB and visualized with Metabase and BIME. We also customized Airflow from Airbnb for job management. (Thankfully it's Python!)
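
A minimal sketch of that Spark-on-Cassandra topology using the DataStax spark-cassandra-connector; the keyspace, table and column names here are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object EventCounts extends App {
  val conf = new SparkConf()
    .setAppName("event-counts")
    .set("spark.cassandra.connection.host", "127.0.0.1")
  val sc = new SparkContext(conf)

  // Read raw events straight from Cassandra for low-latency access,
  // then aggregate per event type.
  val counts = sc.cassandraTable("analytics", "events")
    .map(row => (row.getString("event_type"), 1L))
    .reduceByKey(_ + _)

  counts.collect().foreach { case (event, n) => println(s"$event: $n") }
}
```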

For data collection, we have numerous services, including third-party ones like HubSpot, Google Analytics and Heap for user behaviour and marketing, and Sentry and New Relic for the application. We also built some customized JavaScript modules to capture complex UI behaviour patterns (like click-X-without-doing-Y-in-5-seconds). As many data engineers will tell you, data collection and cleansing always account for 70% of the work, so it is never a trivial engineering challenge.

To know more?

This is quite a lot, especially given our relatively small engineering team (four pizzas, and we've been ordering more lately). For many of the choices, like the microservices architecture, there was no real "standard playbook" and situations were often new, but our curious engineers commit (literally) to taking up those challenges every day and often come up with brilliant solutions, so it hasn't stopped us from building better systems.

I didn’t cover everything in our start-up. Probably we are hiding some secrets. In truth, many choices we are still looking for the better solution. There is only one way to know more and you know the obvious answer.

Lastly, this’s one of the very first posts so follow us if you are interested to read more :)
