Optimizing Meituan’s Machine Learning Platform: An Interview with Jun Huang

Benedikt Schifferer
Sep 29, 2021


By Sheng Luo and Benedikt Schifferer

Introduction: Sharing Insights and Best Practices

Recently, deep learning-based recommender systems have become the norm for large-scale industry applications. NVIDIA cooperates with the broader recommender community to accelerate and scale neural networks for recommender systems by developing NVIDIA Merlin, an open source framework that makes building deep learning-based recommender systems easier and more accessible. Meituan is China’s leading e-commerce platform for services, with 630 million users. Jun Huang, Senior Technical Expert, optimizes the machine learning platform at Meituan. We asked him to share his experience in developing and scaling recommender system pipelines with the latest NVIDIA A100 GPUs.

Interview with Jun Huang, Senior Technical Expert, Meituan

Question: For North American and European audiences that may not be aware, what is Meituan? How many users and merchants are using Meituan? How many transactions happen on the Meituan technology platform?

Jun Huang: Our mission is: “We Help People Eat Better, Live Better.” As China’s leading e-commerce platform for services, Meituan builds its business around the “Food+ Platform” strategy, with “eating” at its core. Meituan operates several well-known mobile apps in China, including Meituan, Dianping, Meituan Waimai and others. Its business comprises over 200 service categories, including catering, on-demand delivery, car-hailing, bike-sharing, hotel and travel booking, movie ticketing, and other entertainment and lifestyle services, covering over 2,800 cities and counties across China. Meituan has 630 million active users and 7.7 million active businesses every year. On average, each user makes 32.8 transactions per year.

Question: What is your role at Meituan?

Jun Huang: I am a senior technical expert at Meituan, mainly responsible for the training framework team of the Meituan Machine Learning Platform. Our training framework covers a large number of deep learning scenarios at Meituan, including recommendation systems, NLP, CV, ASR, autonomous driving, etc.

Question: What does your team at Meituan work on?

Jun Huang: Our team developed a stable, high-performance distributed deep learning training framework. Our system can be deployed on large-scale CPU/GPU clusters, and supports failover and auto scaling. In the recommender system scenario, we support distributed training with ~100 billion sparse parameters and ~100 billion samples, and we also support online learning. In the NLP scenario, we support distributed training with ~10 billion parameters on hundreds of GPUs. Recently, we designed the next generation of our recommender training system based on the NVIDIA A100, which greatly improves training efficiency and allows more complex models.

Question: How does your work and your team’s work on recommenders relate to Meituan’s overall business?

Jun Huang: Our training framework covers model training for the search, recommendation and advertising scenarios, which together make up Meituan’s traffic business. We have helped Meituan accelerate the growth and monetization of its traffic.

Question: Is your team a relatively new team? Why did Meituan decide to invest in recommenders?

Jun Huang: Our team was established many years ago and is Meituan’s infrastructure team. Search, recommendation and advertising are the most important traffic businesses at Meituan, and they drive further user growth and monetization.

Recommendation System Workflow Questions

Question: What kind of recommender systems does your team focus on?

Jun Huang: For the recommendation system, our team mainly focuses on optimizing model training. We designed an efficient data flow system that makes it convenient to read offline and online features flexibly and efficiently. We optimize the system for scaling out, so that it scales nearly linearly. We also optimize the system for scaling up, so that it fully utilizes the hardware resources. In addition, we jointly optimize training performance and accuracy from both the system and the algorithm perspectives. We continue to follow the state of the art in new models and new hardware, and bring valuable technologies to Meituan’s business.

Question: As Meituan has 630 million users and many interactions per user, how does your team conduct training? How frequently do you train?

Jun Huang: Usually, we first try an algorithm strategy on a small number of samples. If it proves effective, we rerun the same strategy on a large number of samples, hoping to get better results. We run various experiments every day, but for online models, the training frequency varies from once a day to once a week.

Question: How does your team evaluate and fine-tune recommendation systems?

Jun Huang: In our recommender system, we generate a series of models by experimenting with various model structures, algorithm strategies and sample features, and we judge through offline and online evaluation whether these models improve business metrics. Usually, training on more samples and with more complex models improves business metrics, but it also consumes more resources and reduces the number of experiments we can run. We balance the two, but if we can greatly improve performance, we usually prefer to train on more samples and with more complex models.

Question: How do you optimize your recommender systems?

Jun Huang: At first, we optimized the training framework on a CPU architecture, but as our models became more and more complex, it became difficult to optimize the training framework any further. Now, we are working on integrating NVIDIA HugeCTR into our training system based on A100 GPUs. A single server with 8x A100 GPUs can replace hundreds of workers in the CPU-based training system, and the cost is also greatly reduced. This is a preliminary optimization result, and there is still much room for optimization in the future.

Question: How do you choose the appropriate technique, package, method, or frameworks to support your work?

Jun Huang: The technology needs to be advanced, open and backed by a healthy ecosystem, so that we can build on it to better meet our internal needs.

Question: How important is open source technology and interoperability to your team’s work?

Jun Huang: Very important. Our team currently builds our system mainly on open source technology. At the same time, we are very happy to contribute our work back to the open source community.

Question: Recently Meituan reported an average of 32.8 transactions per user for the trailing 12 months as of Q2 2021. How does your team handle scaling your models?

Jun Huang: We used more samples and more complex models to model our business, which greatly improved our business results.

Question: What is a recent success for the team?

Jun Huang: We developed the training framework for our next generation of recommender systems based on A100 and HugeCTR, and it has achieved preliminary results.

Question: Have you recently integrated specific methods, techniques, plugins, libraries, or packages into your recommendation systems workflow?

Jun Huang: We pipelined the whole data flow, from the remote distributed file system to local CPU memory and then to GPU memory, so that computation and I/O can overlap. The embedding layer and the dense layer are also pipelined, so the embedding layer is no longer the bottleneck of the whole system.
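To make the idea of overlapping I/O with computation concrete, here is a minimal Python sketch of a three-stage pipeline. It is not Meituan's or HugeCTR's implementation. The stage functions (fake_remote_read, fake_host_to_device, fake_train_step) are hypothetical stand-ins that only sleep, and the queue depth of 2 is an arbitrary assumption. Each stage runs in its own thread and hands batches to the next through a bounded queue, so batch N+1 can be loading while batch N is being trained on.

```python
import queue
import threading
import time

_DONE = object()  # sentinel marking the end of the batch stream


def stage(fn, inbox, outbox):
    """Start a thread that applies fn to each item of inbox and forwards it."""
    def worker():
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # propagate shutdown downstream
                return
            outbox.put(fn(item))
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def run_pipeline(batch_ids, load_fn, transfer_fn, train_fn, depth=2):
    """Overlap loading, host-to-device copy, and training on a batch stream."""
    to_load = queue.Queue()
    loaded = queue.Queue(maxsize=depth)     # bounded: caps prefetched batches
    on_device = queue.Queue(maxsize=depth)

    stage(load_fn, to_load, loaded)         # remote storage -> CPU memory
    stage(transfer_fn, loaded, on_device)   # CPU memory -> GPU memory

    for batch_id in batch_ids:
        to_load.put(batch_id)
    to_load.put(_DONE)

    results = []
    while True:                             # training runs in the main thread
        batch = on_device.get()
        if batch is _DONE:
            return results
        results.append(train_fn(batch))


# Hypothetical stand-ins for the real stages; each one just sleeps.
def fake_remote_read(batch_id):
    time.sleep(0.01)
    return {"id": batch_id, "where": "cpu"}


def fake_host_to_device(batch):
    time.sleep(0.01)
    return {**batch, "where": "gpu"}


def fake_train_step(batch):
    time.sleep(0.01)
    return batch["id"]
```

Calling run_pipeline(range(5), fake_remote_read, fake_host_to_device, fake_train_step) returns the batch ids in their original order, because each stage is a single thread reading from a FIFO queue. The bounded queues keep a fast reader from filling memory with prefetched batches. Error handling is omitted for brevity: in this sketch an exception inside a stage would stall the pipeline.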

Question: If a team lead were just starting a team and evaluating how to build, deploy, and optimize recommendation systems for their company, what advice would you give to help them accelerate or streamline their recommender system workflows?

Jun Huang: Fully understand the company’s current infrastructure and business status, and design systems and processes based on them. When choosing the technology stack and framework, consider the maturity, community ecosystem, extensibility and ease of integration of each system.

Additional Community Resources to Consider

NVIDIA develops an open source framework to make recommender systems easily accessible to everyone. In addition, we share best practices, insights and learnings from our collaborations with innovative leaders like Meituan. Check out our resources about recommender systems:



Benedikt Schifferer

Benedikt Schifferer is a Deep Learning Engineer at NVIDIA working on recommender systems. Previously, he earned an MSc in Data Science from Columbia University.
