The engineering journey @ Open Universities Australia

Dejan Vasic
Published in OUA Engineering
7 min read · Feb 16, 2023


OUA is a marketplace for higher education in Australia with a unique capability in enrolling students into many leading Australian universities and managing their studies in one platform.

During my interview with long-standing members of the organisation back in 2020, they didn’t hold back in describing the problems they faced on account of a monolithic application riddled with complex bugs. Their honesty was definitely appreciated. After being offered a senior engineer position, it was difficult to feel anything less than optimistic and ready to tackle whatever lay ahead.

Until…

The mountain with burning fires

If the road ahead was to tackle technical issues alone, then this post would have been a little less enjoyable.

OUA had established Sitecore as its main web platform. After an investment of over 12 months that included a major upgrade, nobody wanted to hear about another “re-platforming project”, and this was understandable. Despite being made with the best available information at the time, the decision to use Sitecore resulted in an overpriced solution that was inappropriate for such a complex domain.

There were enough hot spots to render a freeze on a product roadmap which ended up looking like a bunch of small and fast follow features. Product were eager to finally start experimenting on a platform they thought was ready. The engineers were mercenaries without a well-understood common goal but were happy enough to stay passive and go with the flow even though bug fixes were the main priority. At this point, many would have walked away from the opportunity to feel true success in turning everything around.

The culture and process issues were further fuelled by the technology issues facing us:

🧠 Complicated codebase to understand and navigate

🔢 Hundreds of feature flags and dead code hanging around like a bad smell

😕 Unpredictable build and deployment pipelines with flaky tests

⏲️ Pipelines would take up to a day to release changes to production

🧑‍🤝‍🧑 Multiple teams sharing custodianship of the same system that at times ended with a blame culture or shifting ownership on who should fix what

😫 Sitecore and the legacy .NET Framework (which has been phased out now) required extensive training just to learn how to render a simple view

🍲 There were multiple attempts to simplify the UI in the same .NET solution, but none were ever really finished, so it all ended up being a really awful soup

⚠️ Logs contained so many errors that it was difficult to identify the severity of each, and I questioned whether they were being observed at all

Taken individually, some of the points above would be regarded as regular technical debt in any organisation. But facing them all at once is quite overwhelming. What really bugged me, though, was that this was meant to be a “new” system.

Shifting the engineering culture

A chief engineer arrived. It was a statement of intent by the organisation that a better tech culture with a solid engineering vision was needed. We were in good hands.

Having someone with an engineering background at that level was a blessing. They implemented good practices with immediate effect that included:

  • Collaborative decision-making processes
  • Empowering teams with ownership of their product direction and architecture
  • Engineering principles like “you build it you own it”
  • A well-understood incident management process
  • Hiring senior engineers and introducing roles like principal and staff engineer, which paved the way for technical leadership without necessarily having to manage employees

This was the start of a new era, and I quickly realised it was a good time to be around and to be involved not only in coding some exciting new systems but also in crafting a better engineering culture. Having just departed an awesome tech company in Seek, I felt those experiences could come in handy.

The success stories

We knew that picking exactly the right part of the domain to strangle was going to be key to success. This would help us earn a lot of trust from other trades. So we targeted the Course search.

OUA is really a marketplace and at the heart of that is a great search experience. At the time, it was painfully slow, with response times of up to 5 seconds and a very clunky UI to go with it. Painful to maintain and painful to use. It was a great candidate to rebuild given the decent traffic for experimentation and the simplicity of it being a single page.

Search

Embarking on a new project alone and leaving the other fires burning was not going to fly. Luckily, we had a team that was willing to start really owning this system. After some time, not only had root causes been addressed, but practices, test coverage and monitoring had all improved. The system was actually becoming maintainable. It was quite evident now that we had talented people who really cared about their craft and were pragmatic in solving any problem thrown at them.

We got to work on the course search in the meantime and we had a solution. Using the strangler pattern (running a new system side by side with the old one), we could simply divert traffic to a different application entirely within our infrastructure.

It worked like a charm. We didn’t do anything groundbreaking, really. We decided to be sensible about tech choices, so we went with industry standards like Next.js, React, TypeScript and a Node GraphQL API hosted on AWS Lambda. Sessions and cookies kept working because the new application was on the same domain.
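The diversion of traffic can be sketched as a small routing decision. This is only an illustration of the strangler pattern under assumptions: the upstream names and the `/courses/search` prefix are hypothetical, and the post does not say which routing layer was actually used.

```typescript
// Hypothetical upstream names; the real routing layer and hosts are not described in the post.
type Upstream = "legacy-sitecore" | "new-search-app";

// Path prefixes that have been "strangled" out of the monolith (illustrative value).
const migratedPrefixes: readonly string[] = ["/courses/search"];

// True when the path is the prefix itself or sits underneath it.
function matchesPrefix(pathname: string, prefix: string): boolean {
  return (
    pathname === prefix ||
    pathname.slice(0, prefix.length + 1) === prefix + "/"
  );
}

// Decide which application should serve a request path.
function chooseUpstream(requestPath: string): Upstream {
  // Ignore any query string or fragment when matching.
  const end = requestPath.search(/[?#]/);
  const pathname = end === -1 ? requestPath : requestPath.slice(0, end);
  return migratedPrefixes.some((prefix) => matchesPrefix(pathname, prefix))
    ? "new-search-app"
    : "legacy-sitecore";
}
```

Because both upstreams sit behind the same domain in this sketch, the browser sends the same cookies whichever application answers. Migrating another page would mean adding one more prefix to the list.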

We set up an experiment to compare the systems and the winner was obvious.
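The post does not describe how the experiment split its traffic. One common approach, shown here purely as an assumption, is to hash a stable visitor identifier into a bucket from 0 to 99 so that each visitor consistently sees the same system. The names `bucketFor`, `assignVariant` and the visitor IDs are hypothetical.

```typescript
type Variant = "legacy" | "rebuilt";

// Map a visitor ID to a stable bucket in the range 0..99 using a simple
// multiplicative string hash kept within unsigned 32-bit range.
function bucketFor(visitorId: string): number {
  let hash = 0;
  for (let i = 0; i < visitorId.length; i++) {
    hash = (hash * 31 + visitorId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// Send roughly `rebuiltPercent` percent of visitors to the rebuilt search.
function assignVariant(visitorId: string, rebuiltPercent: number): Variant {
  return bucketFor(visitorId) < rebuiltPercent ? "rebuilt" : "legacy";
}
```

The same visitor always lands in the same bucket, so metrics such as response time or conversion can be compared between the two groups without visitors flipping between systems mid-session.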

Eventually, a more technical post will follow on this. But the following are definitely worth mentioning:

  • Search response times under 100ms on average
  • Pipeline and production deploy with full end-to-end testing in ~20 mins
  • Core Web Vitals (metrics used for measuring website performance) were smashed and all in the green
  • Algolia was chosen as the backend search index which provided a wealth of features to experiment with like personalisation
  • Leveraging Next.js server-side rendering, the team realised there were huge improvements we could make to SEO, and we followed up soon after with much success
  • Last but not least a fully accessible UI

This had set the gold standard. By focusing on not just building a good product but enabling a strong engineering culture, we were set up for success in so many ways. Decision-making was kept within the team about tools and third-party services, instead of relying on someone outside (probably with “architect” somewhere in their title). We were empowered and felt our confidence rise when we proved our value. Algolia and Next.js proved great choices as they enabled capabilities that were not even considered earlier. It felt that OUA was coming on in leaps and bounds in this area.

React component library and homepage

Like many other organisations that need to create web front-ends in React, we struggled to decide on tooling. Do we build our own component library or go with something open-source? Do we need our own design system? How was this not already there? We knew this was going to be a lot of work, but we found a way to collaborate on this internal cross-team project which ended up being a building block and foundation for others.

Once the pattern was set, other teams had the tools and understanding of how to tackle similar projects. The homepage soon followed and reaped similar benefits.

It’s worth noting that Sitecore still played a pivotal role in the process. It was still the main source of content data where content authors did their magic. Leveraging Sitecore’s new GraphQL API available in Sitecore 9, we were able to query the content tree at either build time or request time to render what we needed. Another win-win. Sitecore is finally starting to fill the role of a CMS (content management system), which is one of our long-term goals.

Given that most of the apps opted for build-time rendering, the homepage and search remained standing even when larger parts of the site, still served by Sitecore, had gone down. We were no longer impacted by a large monolith going down and bringing everything with it.
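The resilience of build-time rendering can be illustrated with a deliberately simplified, synchronous model. It is not Next.js code and not the team's implementation: `ContentSource`, `buildStaticPage` and `requestTimePage` are hypothetical names, and real CMS calls would be asynchronous.

```typescript
// Hypothetical content source: returns page content, or throws if the CMS is down.
type ContentSource = () => string;

// Build time: content is fetched once and baked into static HTML.
function buildStaticPage(fetchContent: ContentSource): () => string {
  const html = "<main>" + fetchContent() + "</main>";
  return () => html; // serving never touches the CMS again
}

// Request time: content is fetched again on every request.
function requestTimePage(fetchContent: ContentSource): () => string {
  return () => "<main>" + fetchContent() + "</main>";
}

// Simulated CMS that can be switched off to mimic an outage.
let cmsIsUp = true;
const cms: ContentSource = () => {
  if (!cmsIsUp) {
    throw new Error("CMS unavailable");
  }
  return "Welcome";
};

const serveStatic = buildStaticPage(cms); // built while the CMS is healthy
const serveDynamic = requestTimePage(cms);

cmsIsUp = false; // the outage happens after the build

// serveStatic() still returns the page that was baked at build time,
// while serveDynamic() now throws because it depends on the CMS per request.
```

The trade-off in this model is freshness: the static page keeps serving whatever content existed at build time until the next build runs.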

Identity

The biggest component remaining was the Enrolment process. This is by far the most complex to re-platform in terms of both domain and technical implementation, but first, identity was required for a logged-in session. As usual, Sitecore was the platform for everything. We did what we thought was logical: we wrote a decision document, evaluated the options, and eventually selected and integrated with an Identity Provider that offers OAuth. The picture now looked more like this:

How hard can it be, right? What can go wrong?

The obstacles were far more complex than Search. Dealing with security, privacy and user migration was a little intimidating, but it left us in a position where the domain was no longer a limitation. I’m really happy to say that we are now in production. We are now free to build apps outside of Sitecore with access to the currently logged-in user.
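The post does not name the identity provider or the OAuth flow that was used. As a sketch only, assuming the standard authorization code flow, an app outside Sitecore would start a login by redirecting to the provider and then check the `state` value on the way back. The endpoint, client ID, redirect URI and scopes below are all hypothetical.

```typescript
// Hypothetical configuration; none of these values come from the post.
interface OAuthConfig {
  authorizeEndpoint: string;
  clientId: string;
  redirectUri: string;
  scopes: string[];
}

// Build the URL the browser is redirected to in order to start a login.
// The parameter names are the standard OAuth 2.0 authorization request parameters.
function buildAuthorizeUrl(config: OAuthConfig, state: string): string {
  const params: Array<[string, string]> = [
    ["response_type", "code"],
    ["client_id", config.clientId],
    ["redirect_uri", config.redirectUri],
    ["scope", config.scopes.join(" ")],
    ["state", state],
  ];
  const query = params
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return config.authorizeEndpoint + "?" + query;
}

// On the callback, the returned state must match the one we issued,
// otherwise the response is rejected to guard against forged requests.
function isValidCallback(expectedState: string, returnedState: string | undefined): boolean {
  return returnedState !== undefined && returnedState === expectedState;
}
```

In a real integration the `state` value would be random per login attempt and stored in the session, and the returned code would then be exchanged for tokens on the server.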

Today, as it stands, enrolment is the last remaining system and if we stopped here I do wonder at times whether we’ve left it in a worse place than before. We’ll need a pragmatic and elegant solution and it will be exciting to see this happen.

The challenges ahead

We have been making good progress but the challenges continue as personnel and market conditions change. Today, the battles are between production support, technical quality, a growing backlog, extracting enrolment from Sitecore, and learning our system all at the same time.

This is what makes my job so interesting and keeps me engaged. Technical problems alone, like tuning an API to increase performance by 5%, are nice to solve, but as a recently promoted staff engineer, I’m probably more excited to contribute to the more significant battles facing us today. Overall we are left with a leaner team and new leadership but still heading in the right direction.

I’m genuinely excited to be part of the engineering leadership at OUA who are creating our company’s engineering strategy and vision. By having this embraced in my product team, we are finding the alignment we were always looking for. This year, I look forward to telling more stories with some deep technical dives and sharing them with the community.
