Transitioning from SAP BW to Snowflake

Perspectives on Getting Work Done in the Cloud

Kelly Kohlleffel
Hashmap, an NTT DATA Company
20 min read · Jan 12, 2021


by Sergey Gershkovich and Kelly Kohlleffel

The coast of Mallorca, Balearic Islands

I recently had the pleasure of talking with BI Solution Architect Sergey Gershkovich on our podcast, Hashmap on Tap. He lives on the island of Mallorca, one of the Balearic Islands off the coast of Spain, in a town called Llucmajor. Mallorca has been on my radar for many years because I’m a big-time Rafael Nadal fan. On the show we talk about Rafa (he had just won his 13th French Open title), what it’s like living in Mallorca, incredible Mediterranean cuisine, and a number of other topics (you can listen to the podcast here).

Our main focus was on moving from SAP BW to the Snowflake Data Cloud, and that’s the part of the conversation we’ve condensed into this format. He shares some really interesting insights and thoughts on Hotelbeds’ journey from SAP BW to the Snowflake Data Cloud and how it accelerated their ability to deliver data products in a way that simply wasn’t possible previously.

Kelly: Sergey, you’ve been at Hotelbeds for a while. When you first became aware of the company, what sparked your interest, and how much have things changed since some of those early days when you joined the company?

Sergey: What really caught my attention was that they not only had the full SAP BW business intelligence stack, but they were also on SAP HANA, which at that time, five years ago, was quite revolutionary. In fact, very few people had actually seen an implementation of SAP HANA. It was very new to the market.

Kelly: For those listening who maybe are not familiar with the SAP landscape, SAP BW is SAP Business Warehouse. SAP started with BW on the Oracle RDBMS and then moved to offer BW on HANA. It’s more of a traditional data warehouse architecture and platform, and a number of our customers are looking to move from SAP BW to Snowflake for a variety of reasons. I came across your article on Medium and it really caught my attention. I thought you did a tremendous job of breaking down the move and comparing and contrasting the two technologies.

You talked about Hotelbeds moving to the cloud for their data warehouse platform in less than a year. I’d love to talk about that transition. Take us back a little bit in time. What even caused you to start looking at something beyond SAP BW?

Sergey: To be honest, it was purely coincidental. Hotelbeds had undergone a couple of mergers. We bought two of our competitors, and one of those companies was running their BI on Snowflake. We inherited it and initially thought it would be just another legacy system to deal with. The plan was for our principal reporting to still be done from SAP BW.

We immediately began to scale out a project where we ingested all the legacy data in order to put together three-way reporting across all the companies. We needed to ingest the legacy data for historical reasons and for ongoing operational reasons: we needed to be able to report on three different systems as if they were one transactional booking system.

That’s what we started doing. We quickly realized that Snowflake was fast and easy. Putting views on top and the ease of cleaning up the data really got us thinking, “Wait a minute, does it not make sense to rethink what we’re doing here overall?”

The time involved, the ease of modification, the ease of iteration, the approach, and the way Snowflake lends itself to an agile methodology caused us to stop and rethink everything. Then, when we compared the timelines just to ingest the data into BW, even once we had the model set up, there was absolutely no comparison. It was night and day between doing it in Snowflake and doing it in BW. It was at that point that we changed course to Snowflake.

Kelly: How did you approach a bulk migration from BW and the ongoing incremental changes to source systems?

Sergey: We looked at this, and again, everything was driven by the reporting requirements of the acquisitions. We knew that it wasn’t just our central model anymore and that we now had legacy data that we were going to need to somehow integrate. Some of it matches up, some of it doesn’t. We thought, “Let us go back to best practices. If we’re going to do it from scratch, let’s do it from scratch correctly.”

We didn’t want to take shortcuts in the sense of lifting and shifting calculations from BW. Instead, we took an ELT approach whereby we could load raw staging data into Snowflake from all three systems. Then we would start modeling — because again, it’s very similar data, but you’re still going through a process of data discovery and functional discovery, and requirements change slightly.
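To make that ELT pattern concrete, here is a minimal sketch of what landing raw data in Snowflake and modeling it with views can look like. The stage, schema, table, and field names are hypothetical, not Hotelbeds’ actual objects; the point is that data lands untransformed and is shaped afterward in SQL.

    -- One raw landing table per source system; data is loaded untransformed (the "EL" of ELT).
    CREATE TABLE IF NOT EXISTS staging.legacy_a_bookings_raw (
        raw_record VARIANT    -- untouched source payload, modeled later in SQL
    );

    COPY INTO staging.legacy_a_bookings_raw
    FROM @staging.legacy_a_stage              -- external stage pointing at the extract files
    FILE_FORMAT = (TYPE = JSON);

    -- Modeling then happens as views on top of the raw layer, which is what makes iteration cheap.
    CREATE OR REPLACE VIEW staging.legacy_a_bookings AS
    SELECT
        raw_record:booking_id::STRING    AS booking_id,
        raw_record:hotel_id::STRING      AS hotel_id,
        raw_record:amount::NUMBER(12, 2) AS booking_amount
    FROM staging.legacy_a_bookings_raw;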

We also realized that there was additional detail that needed to be included in the model in order to harmonize it three ways between all of the systems. We went back to first principles and reinvented the model looking forward and not looking backward so that we weren’t constrained to something just because it’s there.

Kelly: Snowflake gives you so much flexibility to do data modeling in a variety of ways. How does Hotelbeds approach data modeling in Snowflake?

Sergey: Because we’re BI focused, we tend to operate quickly. With Snowflake, you can spin up an extra-large warehouse and just reload everything in a matter of hours, literally. You can have historical data for the entire timeline of all three companies within an hour or two, whereas before, it would take us a week to load just one company’s data in SAP BW.

So, we saw that as an advantage we had. With Snowflake, storage is cheap and compute is relatively cheap, depending on what you’re trying to do, but it’s also scalable, so you’re not limited to whatever you paid for upfront. You can really think about what it is you’re trying to do. You can play with sample sizes, make sure that the model that you’re working from makes sense, and then go ahead with the full load.
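As an illustration of that reload pattern, here is a minimal sketch in Snowflake SQL. The warehouse name and sizing are hypothetical; auto-suspend keeps a temporarily oversized warehouse from burning credits while idle.

    -- Spin up a larger warehouse just for the full historical reload.
    CREATE WAREHOUSE IF NOT EXISTS reload_wh
        WAREHOUSE_SIZE = 'XLARGE'
        AUTO_SUSPEND   = 60       -- seconds of idle time before it stops consuming credits
        AUTO_RESUME    = TRUE;

    USE WAREHOUSE reload_wh;
    -- ...run the full reload of all three companies' history here...

    -- Shrink it (or drop it) once the bulk load is done.
    ALTER WAREHOUSE reload_wh SET WAREHOUSE_SIZE = 'XSMALL';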

In the end, our model was very simple. It was about getting the staging layer right, and the rest took care of itself. We also did a lot of work on harmonizing. This wasn’t technical work, but detailed functional work: ensuring we had the right columns, one standard set of KPIs, a standard set of definitions of what a contract is, what a hotel ID is, what a hotel name is, and making sure they aren’t duplicated.

Kelly: You bring up some fantastic points, Sergey. When I think about the typical challenges in a legacy architecture like SAP BW or any traditional data warehouse, number one, I’ve got to be pretty close to perfect on my upfront hardware and infrastructure design. I don’t want to recommend a hardware platform and a software configuration that six months later forces me to go back to the well. You’ve got these long, drawn-out three-year capital planning and budgeting discussions and all the effects around that. On the other side, which you alluded to as well, I also have to spend time on the design. I need a perfect data model. I need to think about every single use case, every single way that this data may be consumed. As you said, on both of those dimensions, Snowflake allows you to flip that thinking on its head.

Sergey: Yes, absolutely right. I think it’s also worth mentioning that we are talking about two fundamentally different products. Before we even thought about a technical decision, we had to, as a team, make an existential decision, because as you can imagine, we were an SAP house: SAP BW, reporting done in SAP, and even our ETL was part of the SAP ecosystem.

As an SAP developer, you’re dealing with a niche product and a very black-box approach, even though it still relies on fundamental database principles. Instead of a column, you call it an InfoObject. Instead of a table, you call it a DSO. Our team collectively, at that point, had already gotten our hands dirty with Snowflake, and we had to think about, as a team, whether we wanted to fundamentally change where we were headed in our careers. Everybody went all in, and we were all learning on the fly. It’s so much more exciting to do whatever you imagine in pure SQL in Snowflake instead of saying, “Okay, I have a GUI and it’s great, and I can drag and drop and it’s more visual, but I can only do what the GUI allows me to do.” Once we saw what was possible with the first-principles approach, everybody, and I do mean everybody, was on board.

Kelly: All of your team were very much BW specialists. Give me a sense of how long it took to really start feeling comfortable making this transition to Snowflake, working within the Snowflake environment, the tooling, and the differences.

Sergey: The fundamentals were already there. Obviously, if you start speaking to anyone in the BW world about joins and tables and unions and whatever, they’re going to automatically be able to carry the conversation. It’s like muscles that atrophy from not having been used: even though you remember the operations, you don’t necessarily remember the syntax.

“Snowflake from the very beginning inspires confidence because you can see how everything just fits in its place.”

It doesn’t try to do too much. Anything that’s not core to the central platform, they let third parties handle, and many of those connect seamlessly via Snowflake Partner Connect.

Even though we hadn’t been on Snowflake very long, you could just tell that it wasn’t going to be the type of thing to surprise you with errors and bugs and things that you didn’t want to be doing instead of your actual work.

We’ve been on Snowflake for over two years and I think we’ve had to raise two tickets. One of those was our fault; the other was a bug, but a one-time bug. We reran the script and it ran fine. Whatever it was, it was momentary.

With SAP, not only were the tickets much more frequent, but because of that frequency, you don’t develop the confidence that things are going to work well. Every time they rolled out a new update, we’d say, “Okay, let’s wait a while until they update the update.”

Kelly: Sergey, I know that I talk about Snowflake “just working” a lot, but I liked the way you talked about the no-surprises aspect and being able to have confidence in what you’re doing. Snowflake has taken the infrastructure effort out of the equation for all of us and allowed us to focus more on the business outcomes that we’re generating, and that is what is so refreshing. That level of confidence, where the infrastructure is completely taken care of and you don’t have to deal with it, is just unheard of, at least for a data platform.

What are some of the other data sources that you are capturing today in Snowflake?

Sergey: Our primary backend database is Oracle and we had legacy systems that we had to import, which were SQL Server. We also have AWS RDS sources and Salesforce. Additionally, we use various market data providers that we get proprietary feeds from. There’s a lot there, but again, it all fits into the infrastructure very, very easily.

Kelly: You talked about that confidence that Snowflake inspires. Is there anything that you can equate that to, as it relates to that lack of infrastructure effort and not having to really even make that call, just knowing that it’s going to work all the time?

Sergey: It reminds me, if anybody’s familiar with them, of the old Maytag repairman commercials, where the whole gimmick was that the Maytag repairman doesn’t have any work to do. That became our Basis team; Basis is the SAP equivalent of the database administrators. There is no Basis team for Snowflake.

Once we started relying on Snowflake, there was simply far less of that administration to do. Again, these were cost savings that we didn’t necessarily recognize upfront.

You almost feel like a beta tester sometimes when SAP rolls out an update. If there’s anything in the update that’s really vital to you, that’s worth taking a risk on, you upgrade and then you serve as the “beta tester”: you raise a few tickets, they fix bugs, and you’re running until the next month, when there’s a new update. With Snowflake, in contrast, all you really need to do is decide whether you want to scale up or scale out. There are not many decisions you have to make. Snowflake does the “database maintenance” completely behind the scenes. That’s it.
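For readers newer to Snowflake, that scale-up versus scale-out decision is roughly the difference between the two statements below. The warehouse name is hypothetical, and multi-cluster scale-out requires Enterprise edition or above.

    -- Scale up: a bigger warehouse makes an individual heavy query or load finish faster.
    ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'LARGE';

    -- Scale out: a multi-cluster warehouse adds clusters under concurrency instead of queuing users.
    ALTER WAREHOUSE reporting_wh SET
        MIN_CLUSTER_COUNT = 1,
        MAX_CLUSTER_COUNT = 4,
        SCALING_POLICY    = 'STANDARD';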

Kelly: You talk about, Sergey, the amount of effort that goes into an SAP environment. I can remember walking the halls at large customers and seeing a sea of operational database administrators, and application database administrators, all for SAP and SAP BW, and I can remember those two teams constantly bickering and fighting. “It’s your fault.” “No, it’s your fault.” Nobody wanted to take responsibility. The Ops DBA claims that things run perfectly and the Apps DBA claims to have written perfect code.

How do you think about data warehouse optimization with Snowflake? Are you doing any and have you had to do much over the last two years?

Sergey: No, the closest we’ve come to that is on particularly large tables. We have had to do some clustering. Snowflake doesn’t index; there are no indexes or enforced primary keys. When we have a very large table, if we do cluster, it’s to provide a hint to say, “I’m going to be querying the data based on this, so why don’t you just store it in buckets that relate to that dimension?”
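In Snowflake SQL, that hint is a clustering key. A minimal sketch, with hypothetical table and column names:

    -- Tell Snowflake how a very large table is typically filtered so it co-locates rows accordingly.
    ALTER TABLE analytics.booking_facts
        CLUSTER BY (booking_date, source_system);

    -- Check how well the table is currently clustered on those columns.
    SELECT SYSTEM$CLUSTERING_INFORMATION('analytics.booking_facts', '(booking_date, source_system)');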

That’s really it as far as tuning, if you can call that tuning. For anything else, Snowflake provides a console where you can access the query plan of anything you run, any view, any select. Everything behind the scenes, Snowflake takes care of. They have an optimizer that’s completely transparent and that you don’t have to worry about.

That was really the hallmark of a good BW developer: did you know how to get into the options and tick all the right boxes to optimize a query for one particular select but not another? It was always great if you knew those options, but it also felt like you were investing time in something that wasn’t your primary job. On Snowflake, if you get the query right, Snowflake will take care of the execution plan. It’s that simple.

Kelly: In general, over the years, we’ve been very dependent on IT processes and timelines, which many times are very long when it comes to data warehousing and data platforms. How have you been able to compress those timelines with Snowflake? Also, is Snowflake helping you shorten your data product development cycle?

Sergey: Absolutely. Snowflake shortens the development cycle quite dramatically. It goes back to what I said earlier about being able to get at a problem from first principles and not being limited in any way by pre-configured solution constraints.

If you’re working with SAP, it has all the APIs and the extractors. It has the standard models, but working outside of those standard models, for instance with employee data in Salesforce, is challenging and can create a lot of manual work.

SAP requires more of a step-by-step approach. If you thought of a clever way of doing something (imagine you wrote a windowing function), you couldn’t do it in SAP, at least not unless you abandoned the GUI track and went for pure SQL. You could do that, but then you’d lose all the benefits of having the SAP graphical interface.

With Snowflake, your creativity basically becomes the driver, and if you can think of an elegant solution, you can implement it in pure SQL. I think that’s been the biggest thing — we are able to get something in front of a user and get immediate feedback because it’s so quick to implement.
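As one concrete example of the kind of pure-SQL solution being described, here is a windowing query that would be awkward to express in the BW GUI. The table and column names are hypothetical.

    -- Rank each hotel's bookings by value within a month and keep only the top ten.
    SELECT
        hotel_id,
        booking_id,
        booking_amount,
        RANK() OVER (
            PARTITION BY hotel_id, DATE_TRUNC('month', booking_date)
            ORDER BY booking_amount DESC
        ) AS rank_in_month
    FROM analytics.booking_facts
    QUALIFY rank_in_month <= 10;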

Snowflake allows us to avoid investing in yet another SAP project and having to move it across all the environments. Instead, we can create a mock-up of the production system by cloning productive data into our test system and do everything right there. It’s instant.
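That mock-up relies on Snowflake’s zero-copy cloning. A minimal sketch, with hypothetical database and table names:

    -- A writable copy of production, created from metadata only, typically in seconds.
    CREATE OR REPLACE DATABASE analytics_test CLONE analytics_prod;

    -- The same works at the schema or table level for a quick mock-up of one area.
    CREATE OR REPLACE TABLE analytics_test.staging.legacy_a_bookings_raw
        CLONE analytics_prod.staging.legacy_a_bookings_raw;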

Kelly: I would imagine that as you and the team have delivered quicker and better that the bar has been raised. Are users expecting a lot more out of you on a daily basis?

Sergey: Absolutely. We were heroes for the first 10 minutes. Users are traditionally a conservative group, especially once they get used to their reports and a certain way of doing things. We did have the dual challenge of the technical migration work, but also we had to convince the whole company that what we were doing was the right thing.

We thought the way to do that should be organic, whereby we strive to deliver more with Snowflake than what they were getting with the old BW system. We added new functionality and increased velocity through continuous development and continuous delivery.

If users needed three-way reporting (which everyone did), it was only on Snowflake. Initially, some of the reports weren’t as feature-rich as the BW reports, but making incremental changes allowed us to keep enhancing and keep providing more and more value very quickly. Today, all of our users are 100% on Snowflake reporting.

We have Tableau on top of that, and the feedback has been unanimously positive. If you’re talking about standard reports, it’s very comparable: you have a table here and a table there, one table is indexed, the other table is in Tableau. But what Snowflake allows you to do is give access. It democratizes the data, essentially. It removes us as a bottleneck for getting access to information.

You can’t point a user at BW and say, “Here you go,” because there’s a very steep learning curve whereas with Snowflake, you can absolutely provide a self-service model for reporting.

The reports are standardized and have predefined governance around them, but also if you have people who are even a little bit technical, and they want to start doing their own analysis, that’s available as well.

You don’t have to clone it. You don’t have to copy it. You don’t have to worry about whose data is out of sync, or why one team gets to control one database and not another. There’s no siloing, no shadow BI. Everyone can take a do-it-yourself approach or a guided approach, or we can leave it completely up to the individual teams.

In that sense, I think that once they realized that they’d have “like for like” in terms of their standard reports (now using Tableau), but also be able to access the source data, it made a lot of teams very happy because they were able to do more than what their traditional BI environment could provide previously.
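In practice, that self-service model comes down to role-based grants on the curated data rather than handing out extracts. A minimal sketch, with hypothetical role, warehouse, database, and user names:

    -- Analysts get read access to curated data through a role instead of asking BI for copies.
    CREATE ROLE IF NOT EXISTS analyst_selfservice;

    GRANT USAGE  ON WAREHOUSE reporting_wh                   TO ROLE analyst_selfservice;
    GRANT USAGE  ON DATABASE analytics_prod                  TO ROLE analyst_selfservice;
    GRANT USAGE  ON ALL SCHEMAS IN DATABASE analytics_prod   TO ROLE analyst_selfservice;
    GRANT SELECT ON ALL TABLES IN DATABASE analytics_prod    TO ROLE analyst_selfservice;
    GRANT SELECT ON FUTURE TABLES IN DATABASE analytics_prod TO ROLE analyst_selfservice;

    GRANT ROLE analyst_selfservice TO USER some_analyst;     -- hypothetical user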

Kelly: You hit on many good points, Sergey. Maybe I would sum that up by saying Snowflake brings you much closer to that true self-serve model. That’s been very elusive over the years, and obviously we’re still not all the way there; there’s still some work to do. But giving business users that type of access to contextual, curated, usable, trusted data sets through Tableau is incredible.

Alright, I have to ask you this one too. We get so many questions about cost. Customers moving from a traditional Capex model with SAP to Snowflake’s consumption-based model don’t know what to expect. Have you been happy with the move from traditional Capex to consumption-based? Has it been roughly what you expected when you planned it out?

Sergey: Yes it has. Snowflake makes it pretty straightforward to calculate your usage. If you go on their website, they have a very simple table of what a credit is equivalent to and what a credit is worth. You can very simply estimate your user load and query load, over the course of a week or a month, and just multiply that out. It also means that you’re 100% free to experiment. You don’t have to plunk down a lump sum for a new project with a “potential” return on investment, which your project owner has to argue for and defend.
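As a worked illustration of that estimate: Snowflake publishes credits per hour by warehouse size (a Medium warehouse, for example, consumes 4 credits per hour), so a Medium warehouse running about 6 hours per business day comes to roughly 4 × 6 × 21 ≈ 500 credits a month, multiplied by whatever your contracted price per credit is. Actual consumption can then be compared against the estimate using Snowflake’s standard metering view:

    -- Monthly credit consumption per warehouse over the last three months.
    SELECT
        warehouse_name,
        DATE_TRUNC('month', start_time) AS usage_month,
        SUM(credits_used)               AS credits
    FROM snowflake.account_usage.warehouse_metering_history
    WHERE start_time >= DATEADD('month', -3, CURRENT_TIMESTAMP())
    GROUP BY warehouse_name, usage_month
    ORDER BY usage_month, warehouse_name;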

Kelly: Yes, I can control my costs, but I can also experiment if I want. I can spin up a particular size on demand. That’s huge, I think, especially when we’re talking about analytics and data science workloads as well.

Sergey: Yes, absolutely. Also, you avoid the eventual trap that every on-premises system falls into, which is the end of your Capex lifecycle. You spec it out for three years, for five years. Even if you did a perfect job and it lasted exactly for the time slice you estimated, eventually a project comes along that tips the scales, and the expansion for the next three years all falls on that one project.

You’re going to have to justify scaling out, just because one project happens to need a little bit more space or a little bit more memory.

Kelly: Totally agree. Going back, you said you started with Snowflake a couple of years ago. There have been a lot of changes in the Snowflake world and a lot of changes in the ecosystem: tools, technologies, capabilities. What do you wish you had known back when you started a couple of years ago, and what would you have changed knowing what you know today?

Sergey: I would say two things. One is what I alluded to earlier: Snowflake does not have indexes, which means there are no enforced primary keys, no enforced foreign keys, and no integrity checks based on those nonexistent keys. There is no join elimination based on declared relationships, say when you have a central fact table with many joins and you only want to hit the central fact table, or maybe one attribute from one of those joins.

Snowflake is not going to figure that out and strip away the unneeded joins based on how the query uses the data. With BW, you’re used to a lot of hand-holding. BW does a great job of saying, “Wait a minute, this object is used in this model, and even this data within this object is used in another model. Be careful before deleting it. Be careful before loading it.”

Whereas here, we went all-in on the modeling and didn’t realize that we now also have to control the data quality on the way in and along the way. That’s the first point. The second point is that, again, the good and the bad of SAP BW is that all of the tools are there in one form or another. You can’t necessarily change them, at least not easily, but they’re there.

There’s an ETL tool, there’s metadata governance, and so on. With Snowflake, you have to either rely on third parties or do it yourself with open-source software. There’s no built-in enforcement of governance, best practices, or naming conventions. In that sense, I think we were just so eager to build and start delivering that we forgot that DataOps is important.

We came in with a lot of DIY initially, and now we are realizing that it’s not too late to start down the DataOps path, but it’s something that we’re eventually going to have to dedicate time to and set up. Maybe something like dbt or anything that helps us step back and not do things as manually as we’re used to doing. We’re small enough that it’s not really been a problem, but we know that if we want to grow, or expand, it’s going to be something that we’ll need to invest time in.
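To illustrate the first point, a minimal sketch with hypothetical table names: Snowflake will record a primary key declaration but will not enforce it, so the duplicate and referential checks BW used to do for you have to be run explicitly, or handed to a tool such as dbt’s tests.

    -- Snowflake accepts the declaration but does not enforce it: duplicate keys still load without error.
    ALTER TABLE analytics.dim_hotel ADD PRIMARY KEY (hotel_id);

    -- So integrity checks become explicit queries (or dbt tests).
    SELECT hotel_id, COUNT(*) AS duplicates
    FROM analytics.dim_hotel
    GROUP BY hotel_id
    HAVING COUNT(*) > 1;

    -- Bookings that reference a hotel missing from the dimension.
    SELECT COUNT(*) AS orphaned_bookings
    FROM analytics.booking_facts f
    LEFT JOIN analytics.dim_hotel d ON d.hotel_id = f.hotel_id
    WHERE d.hotel_id IS NULL;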

Kelly: Those are two great pieces of advice: number one, focus on data quality, and number two, don’t forget automation. We’re huge fans of dbt at Hashmap; they recently locked in their Series B round.

One of our architects says it best, “You can either go with dbt and Snowflake, or you can just build the equivalent of dbt on your own.”

You’re going to build it one way or the other. You might as well use an open-source tool like dbt that 3,000 other people are using.

Transitioning a bit, what would you say is the most challenging and difficult thing that you do in your current role today at Hotelbeds?

Sergey: I’m no longer locked into a predefined role the way an SAP developer, an app developer, or a Basis developer would be. The fact that you’re on Snowflake means that you have the core platform, but you also have team members doing projects in Python and others starting to explore the ML space.

There’s a whole world of possibilities that you suddenly realize you have to, at the very least, stay abreast of and stay current. It’s thrilling but also a bit overwhelming at times because, again, everything has its learning curve.

You have the possibility of exploring a new technology, or even an evolving field like AI/ML, without necessarily having to pay for licenses, separate machines for MLOps, or infrastructure. You can just go in and do it with Snowflake.

Kelly: What’s the best advice you could give to someone considering moving from BW to Snowflake? That one piece of advice?

Sergey: Just do it. Don’t look back, do it.

Kelly: I love it.

Sergey: There’s really nothing to worry about. It’s liberating having the freedom to redefine your approach based on what makes sense going forward and not just what the infrastructure allows you to do in a system like SAP BW. You realize that you’ll surprise yourself with the solutions you come up with.

Kelly: Hey, Sergey, living on Mallorca there, I mean there’s got to be a variety of ways that you get out from behind the desk, get out from behind the laptop, and throw the mobile phone in the drawer, hopefully.

Any interesting things that you do outside of work?

Sergey: Yes, this is why I decided to go with the protein shake, because it’s a reminder that there is life outside of work, and it’s important to keep the body as well as the mind active. I personally make an effort to get away from work once a day and train for what I call the centenarian Olympics, so that when I get there, I’ll be ready.

Besides that, I’m a parent. I have a young son who has autism, so I spend a lot of time with him, and that really takes up the bulk of my time: training, parenting, learning.

Kelly: Sergey, I’ve really, really enjoyed having the chance to speak with you today. I know that everybody, whether they are on SAP BW today and considering Snowflake or just looking at Snowflake in general will benefit tremendously from the advice, the tips, and the guidance that you’ve given everybody with your journey over the last couple of years. I’m looking forward to keeping up with you, reading your blog posts, and keeping up with your progress at Hotelbeds.

Sergey: Really glad for the opportunity. Anybody who wants to reach out and has further questions, you can find me on LinkedIn. Check out the article if you still have any doubts. Feel free to ping me.

Kelly: All right, thanks again, Sergey!

Need Help with Your Cloud Initiatives?

If you are considering the cloud for migrating or modernizing data and analytics products and applications or if you would like help and guidance and a few best practices in delivering higher value outcomes in your existing cloud program, then please contact us.

Hashmap, an NTT DATA Company, offers a range of enablement workshops and assessment services, cloud modernization and migration services, and consulting service packages as part of our Cloud (and Snowflake) service offerings.

You can listen to this discussion between Kelly and Sergey on Hashmap’s Hashmap on Tap podcast and subscribe on Spotify, Apple, Google, and other popular apps here.


Sergey Gershkovich is a BI Architect at Hotelbeds with over 10 years of international experience with global clients. He specializes in Business Warehousing, Data Analytics, and Visual data design principles. His areas of expertise cover Data Warehouse modeling through SAP and Snowflake, Business Analytics (Data discovery, KPI analysis, dashboard visual design and best practices), and Data Quality/Governance including Agile methodology and technical rigor in enterprise environments. You can connect with Sergey on LinkedIn.

Kelly Kohlleffel is responsible for sales, marketing, and alliances at Hashmap, an NTT DATA Company, and delivers outcome-based data and cloud consulting services across 20 industries. He also co-hosts Hashmap on Tap, a podcast where special guests explore technologies from diverse perspectives while enjoying a drink of choice. He enjoys helping customers “keep it simple” while “reducing option fatigue” and delivering high-value solutions with a stellar group of technology partners. You can connect with Kelly on LinkedIn and follow him on Twitter.
