Imagen 3 generated images for restaurant/delivery apps: Eeatingh case study
Product images can make the difference between a successful sale and traffic that materializes into nothing. But what happens when they are missing? With the help of AI technology, the Eeatingh app now comes up with an innovative solution: automatically generating stylized product images, transforming simple, already-available data into eye-catching visuals that enhance the appeal of any menu. Let’s see how in the following case study.
Credits: REEA.net is a Romanian software company. The author, Márton Kodok, leads its Google Cloud platform engineering team. The Eeatingh platform is built by REEA.
The challenge of missing product images
The image is most often the first thing that grabs a potential customer’s attention. This is all the more true in restaurant apps, where users decide largely on the visual appearance of dishes. Quality visuals play a key role in creating an attractive and compelling user experience, directly influencing sales. Without them, products appear incomplete and less appetizing, which can significantly reduce the chances of them being ordered.
Unfortunately, many restaurants struggle to provide images for all menu items. The main reasons are a lack of resources to produce professional photo shoots or the inability to quickly update images as the menu changes. The lack of these images not only affects the user experience, but can also create an impression of carelessness or unprofessionalism on the part of the restaurant.
Meet the customer — the Eeatingh platform
Eeatingh, built by REEA, is a local Romanian online platform that has been delivering food straight to customers’ doorsteps since 2014. In its 10 years of existence, Eeatingh has successfully competed in the local market with big players such as Food Panda, Glovo and Tazz, maintaining its position through an agile and flexible business model.
Despite the presence of these giants, Eeatingh has thrived by offering a digitized home delivery service at no extra cost to the end customer, allowing users to enjoy delicious meals without leaving the comfort of home.
Our solution: automatic AI image generation
To solve the problem of missing product images, we developed a solution that uses artificial intelligence to automatically generate stylized images from the product data in the Eeatingh app. Through an API built in Python, our app now uses a generative text-to-image AI model that turns simple product information (such as name and description) into attractive images in a drawn or vector style.
The icing on the cake: nobody has to write detailed prompts by hand, because the API automatically picks up the product data and sends it to the AI model. Restaurants that don’t have images for their dishes get a fast and efficient solution integrated directly into the Eeatingh platform, with no prompt-writing training required. It’s all done with the click of a button: the admin section has a new button that generates images automatically, a practical alternative to uploading photos manually.
The technology behind this solution includes Google Cloud Vertex AI and an integration with the Imagen 3 image generation model. Our pipeline adds an intermediate step with a Large Language Model (LLM), which expands short product names into detailed descriptions that include relevant information such as ingredients, preparation methods or serving suggestions. This data expansion significantly improves the quality of the generated images.
As an aside, Imagen 3 was launched in August 2024, and REEA was invited as a partner to test the technology, providing feedback and receiving suggestions and best practices for its use from Google engineers.
Through this technology, restaurants not only save time and resources, but are also able to provide users with an enhanced visual experience, thus helping to increase the attractiveness and conversion rate of their products.
How the image generation API works
The process starts with submitting product data (product name and description) to the API. This data is then processed by a large language model (LLM), which extends and enriches the initial information. For example, a simple product name such as “margherita pizza” can be transformed into a detailed description that includes the ingredients (mozzarella cheese, tomato sauce, etc.), the cooking method (baked in the oven) and the serving style (with fresh basil). This step is essential for generating realistic and relevant images.
Once the description has been generated and optimized, the API sends this data to the Imagen 3 model, which creates the product image. Imagen 3 is known for its ability to generate high quality images with a high level of detail and realism and is optimized to deliver fast and accurate results.
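The two steps above can be sketched roughly as follows. This is an illustration rather than the production code: `Product`, `build_expansion_instruction`, `build_image_prompt` and `generate_product_images` are hypothetical names, the prompt wording is an assumption, and the `expand` and `generate` callables stand in for the LLM call and the Imagen 3 call.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical signatures: stand-ins for the LLM call and the Imagen 3 call.
TextExpander = Callable[[str], str]                 # instruction -> enriched description
ImageGenerator = Callable[[str, int], List[bytes]]  # (prompt, count) -> encoded images


@dataclass
class Product:
    name: str
    description: str = ""


def build_expansion_instruction(product: Product) -> str:
    """Instruction asking the LLM to enrich the raw product data."""
    details = f" Menu description: {product.description}." if product.description else ""
    return (
        f"Describe the dish '{product.name}' for an illustrator.{details} "
        "Cover the main ingredients, the cooking method and the serving style."
    )


def build_image_prompt(enriched: str, style: str = "flat vector illustration") -> str:
    """Wrap the enriched description in one fixed style for a consistent menu."""
    return f"{style} of a dish on a plain background, no text. {enriched}"


def generate_product_images(
    product: Product,
    expand: TextExpander,
    generate: ImageGenerator,
    count: int = 4,
) -> List[bytes]:
    """Step 1: expand the product data. Step 2: generate images from the result."""
    enriched = expand(build_expansion_instruction(product))
    return generate(build_image_prompt(enriched), count)
```

Passing the two model calls in as plain callables keeps the flow testable with stubs; in a real deployment they would wrap the Vertex AI client calls.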
Our API can be used by multiple applications or websites and is designed to be flexible and easy to integrate. The associated costs are low due to the use of cloud services and optimizations in place, and for a high volume of images, the price per image can drop considerably, making this solution affordable for businesses of all sizes.
The entire generation process runs in the background and needs no further input from the user in the administration section of the Eeatingh application. When a product has no uploaded image, a new one can be generated with a single click, which makes it fast and easy to complete a restaurant’s visual menu.
Challenges encountered and solutions found
While automatic AI image generation brings significant benefits, the process of developing and integrating the technology was not without its challenges. Here are the biggest ones:
1. The challenge of short or traditional product names
One of the main challenges involved products with short or very generic names, such as “pizza” or “pasta”. These names did not provide enough context to generate a clear and detailed image, leading to mediocre or irrelevant results. Another problem arose with traditional dishes with regional names, such as “tochitură” or “mămăligă”, which did not translate well or were not correctly recognized by the AI models.
Solution: We implemented a large language model (LLM) that extends and enriches short names by automatically generating detailed descriptions. This name expansion adds specific ingredients, preparation methods, and serving suggestions, giving the image model enough information to generate more accurate and relevant images. For traditional products, we also added prompt enhancement mechanisms, which can include cultural descriptions or examples of similar dishes from other cultures.
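As a rough illustration of this kind of prompt enhancement, the sketch below builds the instruction sent to the LLM. Everything in it is an assumption made for the example: the `REGIONAL_HINTS` glossary and its entries, the `MIN_WORDS` threshold, and the function names are hypothetical and do not describe the platform’s actual data or rules.

```python
import unicodedata
from typing import Dict, Optional

# Hypothetical glossary: illustrative entries, not the platform's real data.
REGIONAL_HINTS: Dict[str, str] = {
    "mămăligă": "Romanian cornmeal porridge, similar to Italian polenta",
    "tochitură": "Romanian pork stew, typically served with mămăligă",
}

MIN_WORDS = 3  # assumed threshold below which the input counts as too short


def normalize(text: str) -> str:
    """Case-fold and NFC-normalize so diacritics compare reliably."""
    return unicodedata.normalize("NFC", text).casefold()


def needs_expansion(name: str, description: str = "") -> bool:
    """True when name plus description carry too few words to guide the model."""
    return len(f"{name} {description}".split()) < MIN_WORDS


def cultural_hint(name: str) -> Optional[str]:
    """Return a short cultural description if the name mentions a known dish."""
    lowered = normalize(name)
    for dish, hint in REGIONAL_HINTS.items():
        if normalize(dish) in lowered:
            return hint
    return None


def build_llm_instruction(name: str, description: str = "") -> str:
    """Assemble the expansion instruction, adding context only where needed."""
    parts = [f"Dish name: {name}."]
    if description:
        parts.append(f"Menu description: {description}.")
    hint = cultural_hint(name)
    if hint:
        parts.append(f"Context: {hint}.")
    if needs_expansion(name, description):
        parts.append("The name is generic; pick one typical variant and describe it concretely.")
    parts.append("List ingredients, preparation method and serving suggestion.")
    return " ".join(parts)
```

For a name like “mămăligă” the instruction carries the polenta comparison, so the model has a familiar visual reference even if it does not recognize the regional name.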
2. Text on generated images
Occasionally, generated images contained superimposed text, which is undesirable in menu images for restaurants. This is a known issue with generative models trained on data that also includes textual elements.
Solution: We added a quality control filter that detects and removes generated images containing text overlays. Users also have a feedback mechanism for flagging inappropriate images, which helps us continuously improve the generation and filtering process.
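A filter of this kind could look roughly like the sketch below. It assumes a text detector is available as a callable (for example a wrapper around an OCR service); `TextDetector`, `contains_text`, `filter_text_free` and the `min_chars` threshold are hypothetical names and values chosen for the illustration.

```python
from typing import Callable, Iterable, List

# Hypothetical OCR wrapper: takes encoded image bytes, returns detected text fragments.
TextDetector = Callable[[bytes], List[str]]


def contains_text(image: bytes, detect_text: TextDetector, min_chars: int = 2) -> bool:
    """Treat an image as 'has text' if any fragment has at least min_chars characters."""
    return any(len(fragment.strip()) >= min_chars for fragment in detect_text(image))


def filter_text_free(images: Iterable[bytes], detect_text: TextDetector) -> List[bytes]:
    """Keep only the candidate images in which no text was detected."""
    return [image for image in images if not contains_text(image, detect_text)]
```

Because several candidates are generated per request, dropping the ones with text usually still leaves usable images; the `min_chars` threshold is there to ignore single-character OCR noise.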
3. Managing images for products with very similar names
In some cases, products with very similar names (such as “pizza margherita” and “pizza quattro formaggi”) resulted in the generation of almost identical images, even though the products in question have significant visual differences.
Solution: We implemented an algorithm that takes the key differences between products into account when expanding textual descriptions, so that the AI model generates distinct and relevant images for each product. By including details such as specific types of cheese or different cooking methods, we were able to diversify the generated images even for similar products.
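One simple way to surface such key differences is to compare ingredient sets across sibling products and emphasize what is unique to each. The sketch below is an assumption about how this could be done, not a description of the deployed algorithm; the function names and the ingredient lists in the usage note are illustrative.

```python
from typing import Dict, List, Set


def distinguishing_ingredients(menu: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each product, list the ingredients that no other product in the menu has."""
    result: Dict[str, List[str]] = {}
    for name, ingredients in menu.items():
        # Union of the ingredients of every other product in the same menu.
        others: Set[str] = set().union(
            *(other_ingredients for other, other_ingredients in menu.items() if other != name)
        )
        result[name] = sorted(ingredients - others)
    return result


def emphasis_clause(unique: List[str]) -> str:
    """Prompt fragment asking the image model to make the unique items visible."""
    return f"Clearly show: {', '.join(unique)}." if unique else ""
```

Given `{"pizza margherita": {"tomato sauce", "mozzarella", "basil"}, "pizza quattro formaggi": {"mozzarella", "gorgonzola", "parmesan", "fontina"}}`, the shared mozzarella is dropped and each pizza is described by what sets it apart, which pushes the two prompts, and therefore the two images, further from each other.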
4. Cost and real-time performance
Generating AI images can be expensive, especially for large volumes of products. The speed of image generation is also critical, as restaurants need to be able to quickly add new dishes to menus without significant delays.
Solution: We opted for Google Cloud integration and the Imagen 3 model, which offers a good balance between quality and speed. The Imagen 3 Fast variant generates a set of 4 images in less than 5 seconds, which is ideal for restaurant needs. In addition, using cloud services lets us keep costs under control and offer a competitive price per generated image, with costs decreasing as volume increases.
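One common cost-control technique, shown here purely as an assumption and not as something the platform is confirmed to use, is to cache results by prompt so that repeated requests for the same product do not trigger a second paid generation. `ImageCache` and its in-memory store are hypothetical; a real service would more likely persist images in object storage.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical signature standing in for the image model call.
ImageGenerator = Callable[[str, int], List[bytes]]


class ImageCache:
    """In-memory cache keyed by a hash of the prompt and the requested count."""

    def __init__(self, generate: ImageGenerator) -> None:
        self._generate = generate
        self._store: Dict[str, List[bytes]] = {}
        self.misses = 0  # number of times the model actually had to be called

    @staticmethod
    def key(prompt: str, count: int) -> str:
        return hashlib.sha256(f"{count}:{prompt}".encode("utf-8")).hexdigest()

    def get(self, prompt: str, count: int = 4) -> List[bytes]:
        cache_key = self.key(prompt, count)
        if cache_key not in self._store:
            self.misses += 1
            self._store[cache_key] = self._generate(prompt, count)
        return self._store[cache_key]
```

With this in place, clicking the generate button twice for an unchanged product costs one model call instead of two, while any change to the prompt produces a new key and a fresh set of images.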
By solving these challenges, we were able to optimize the image generation process and provide an efficient, scalable and affordable solution for restaurants using the Eeatingh app.
Use cases and long-term potential
The automated AI image generation technology we’ve implemented in the Eeatingh app offers obvious immediate benefits, but its true potential is far greater. Here are a few examples:
- E-commerce and retail
One area with huge potential is e-commerce. Online stores, especially those with a large inventory of products, often struggle to present quality images of every item in stock. Automated AI image generation can be used to quickly create attractive images for products that don’t have photos available. Whether it’s fashion items, technology, or even handcrafted products, this technology can turn textual descriptions into visuals that attract customers.
- Hospitality and tourism
The hospitality and travel industries can also capitalize on automated AI image generation. Hotels can create images for rooms, facilities or special service packages based on their descriptions. Travel agencies can also generate images for vacation packages or destinations that lack real or up-to-date photos, providing a visual representation to attract tourists.
- Online education and training
Online education and training platforms can use this technology to create compelling visuals for courses, tutorials or educational materials that do not have images by default. For example, a course on cooking can utilize AI-generated images for each recipe or ingredient, giving learners a more engaging learning experience.
Long-term potential
In the long term, AI image generation technology can become a standard tool for any business that relies on visual product presentation. Continuing developments in AI models, such as Imagen 3 on Google Cloud, will enable the generation of increasingly realistic and detailed images at lower costs and faster speeds. Integration with platforms such as social media or marketplaces will also enable the automatic generation of images for posts, advertisements and product pages, reducing the need for external visual assets.
At REEA we have also experimented with other tools, including in commercial projects where we used AI models such as DALL-E, Midjourney and Stable Diffusion 3 for logo design, marketing content and social media posts. These projects already demonstrate the potential of these technologies in multiple domains.
Thus, from restaurants and online stores to hotels and educational platforms, automated AI image generation will continue to expand its applicability, becoming an integral part of many industries’ marketing and sales strategies.
Wrap Up
In the meantime, if you want to check it out, here are some links:
Feel free to reach out to me on Twitter @martonkodok or read my previous posts on medium/@martonkodok
We at REEA offer specialized knowledge and software development for your Google Cloud needs. Get in contact with Márton to kickstart discussions.