The World According to Ozlo

Building a knowledge graph for a digital assistant using machine intelligence.

The days when printed media and libraries were the main sources of information are long gone. Keyword search and rigid catalogs are similarly yielding to human-like information discovery and exploration. Digital assistants that converse, understand local context, and comprehend concepts are at the forefront of this development.

Ozlo’s digital assistant platform is powered by machine intelligence, which stands on four technology pillars: (1) Data, (2) Parallelism, (3) Open source, and (4) Algorithms. Data come in different types and sizes, and they are central to machine intelligence because they are what it uses to discover and predict patterns. More relevant data usually improve the predictions and generally allow more expressive models to be employed. Parallel computing is essential for fast data processing and model training. By sharing technology and disseminating information, open-source development plays a crucial role in enabling and accelerating progress in machine intelligence; Apache Hadoop and Spark are popular examples of open-source projects. Machine intelligence is increasingly based on machine learning algorithms, and better algorithms have led to new applications with impressive performance.

Four technology pillars facilitate the digital assistant.

Knowledge Graph

The four technology pillars are integral to building Ozlo’s brain, the knowledge graph. It encompasses everything our system knows about the world and is machine learned from multiple public and private data sources. The knowledge graph has entities (places, people, general objects) as nodes and relations as links between the nodes. A node is a collection of properties, which are either facts or probabilistic labels inferred from the evidence in each data source. In the case of movie entities, the properties are title, release date, genres, and ratings, and the relations are links to other entities such as artists, awards, and streaming services.
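To make the node-and-link structure concrete, here is a minimal Python sketch. The class names (`Entity`, `KnowledgeGraph`), the `starring` relation label, and the sample values are illustrative assumptions and not Ozlo’s actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node: an id, a type, and properties (facts or label probabilities)."""
    entity_id: str
    entity_type: str
    properties: dict = field(default_factory=dict)


class KnowledgeGraph:
    """Hypothetical in-memory graph: entities as nodes, typed relations as links."""

    def __init__(self):
        self.nodes = {}
        self.links = defaultdict(list)  # source id -> [(relation, target id)]

    def add_entity(self, entity):
        self.nodes[entity.entity_id] = entity

    def add_relation(self, source_id, relation, target_id):
        self.links[source_id].append((relation, target_id))

    def neighbors(self, source_id, relation):
        """Return the entities linked from source_id by the given relation."""
        return [self.nodes[t] for r, t in self.links[source_id] if r == relation]


graph = KnowledgeGraph()
# "title" is a plain fact; "genre:family" is a probabilistic label (made-up value).
graph.add_entity(Entity("m1", "movie", {"title": "Example Movie", "genre:family": 0.9}))
graph.add_entity(Entity("p1", "artist", {"name": "Example Actor"}))
graph.add_relation("m1", "starring", "p1")
cast = [e.properties["name"] for e in graph.neighbors("m1", "starring")]
```

A production graph would live in a dedicated store rather than in Python dictionaries; the point here is only the shape of the data.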

“Find good Vietnamese restaurants near library in Milpitas.”

The knowledge graph allows the systems we power to answer questions like “Find good Vietnamese restaurants near library in Milpitas”. The question-answering process happens in multiple steps. First, the natural language processing algorithms parse the sentence and extract the intent and other relevant information. For example, we interpret with high likelihood that, in this sentence, Vietnamese is a cuisine, restaurant is an eatery type, and library is a landmark in the nearby city of Milpitas. The identified facets and names are then used to retrieve the relevant content from the knowledge graph. Finally, a list of results is returned to the user with a human-like response.
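The parse-then-retrieve flow can be sketched in Python as follows. Everything here is a toy assumption: the `LEXICON` lookup stands in for the natural language processing algorithms (which would weigh competing interpretations by likelihood), `PLACES` stands in for the knowledge graph, and the place names are made up.

```python
# Hypothetical lexicon mapping surface words to (facet, value) pairs.
LEXICON = {
    "vietnamese": ("cuisine", "vietnamese"),
    "restaurants": ("eatery_type", "restaurant"),
    "library": ("landmark", "library"),
    "milpitas": ("city", "milpitas"),
}

# Toy stand-in for knowledge graph content.
PLACES = [
    {"name": "Pho Example", "cuisine": "vietnamese",
     "eatery_type": "restaurant", "city": "milpitas"},
    {"name": "Pasta Example", "cuisine": "italian",
     "eatery_type": "restaurant", "city": "milpitas"},
    {"name": "Banh Mi Example", "cuisine": "vietnamese",
     "eatery_type": "restaurant", "city": "san jose"},
]


def parse(query):
    """Extract facets from the query by simple lexicon lookup."""
    facets = {}
    for token in query.lower().replace(".", "").split():
        if token in LEXICON:
            facet, value = LEXICON[token]
            facets[facet] = value
    return facets


def retrieve(facets, places):
    """Keep places that match every facet they carry a value for.

    Facets with no matching field (here: the landmark) are skipped; a real
    system would resolve the landmark to a location and filter by distance.
    """
    return [
        p["name"] for p in places
        if all(p[k] == v for k, v in facets.items() if k in p)
    ]


facets = parse("Find good Vietnamese restaurants near library in Milpitas")
results = retrieve(facets, PLACES)
```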

Data Platform

Ozlo’s knowledge graph is a product of a data platform that utilizes cloud computing and storage. The data platform is responsible for ingesting, transforming, and loading data from external and internal sources to the knowledge graph. The transform stage is a computationally intensive and challenging phase. Its primary tasks are to remove duplicate entities per source, resolve instances of entities from multiple sources, fuse properties of resolved entities, and compute semantic representations of entities.

Schematic Architecture of Transform Stage

Deduper and Resolver

The dedupe and resolve tasks use quad trees and locality-sensitive hashing for fast indexing of entities, and gradient-boosted decision trees for the deduping and resolving decisions themselves. The performance is estimated by cross-validation.
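The locality-sensitive hashing part of the indexing step can be sketched in Python as follows. This is a minimal illustration that assumes MinHash over character trigrams of entity names as the hash family; the article does not say which family or which features are used. The function names and the band and row counts are arbitrary choices, and the decision-tree classifier that would score each candidate pair is not shown.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Character k-grams of a lowercased name."""
    s = text.lower()
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def minhash(shingle_set, num_hashes):
    """MinHash signature: for each seeded hash, the minimum value over the set."""
    return [
        min(int(hashlib.md5(f"{seed}:{sh}".encode()).hexdigest(), 16)
            for sh in shingle_set)
        for seed in range(num_hashes)
    ]


def candidate_pairs(names, bands=16, rows=2):
    """Bucket signatures band by band; names sharing any bucket become candidates."""
    buckets = defaultdict(set)
    for name in names:
        sig = minhash(shingles(name), bands * rows)
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(name)
    pairs = set()
    for group in buckets.values():
        pairs.update(combinations(sorted(group), 2))
    return pairs


names = ["Thaiphoon Restaurant", "THAIPHOON RESTAURANT", "Rialto Cafe"]
pairs = candidate_pairs(names)
```

Only the candidate pairs need to be scored by the classifier, which avoids comparing every entity against every other one.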


The fuse task takes resolved entities as input and conflates their properties. Some properties are simple facts and some are subjective opinions. This task is needed because the data sources often disagree about the entities’ properties, yet the conflicting evidence still has to be conflated into a single probabilistic property. Because property values change continuously, it is difficult to produce a high-quality training set from which to learn the conflation function. Thus, we have resorted to an unsupervised approach whose main goal is to conflate evidence in the presence of noise. It is based on high-level rules that guide factorization machines to de-noise and estimate labels for properties.
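To illustrate what conflating disagreeing sources without labels can look like, here is a deliberately simpler stand-in written in Python. It is not the rule-guided factorization machine approach described above: it is a basic iterative scheme that alternates between estimating each property’s probability and each source’s reliability. The function name `fuse`, the source names, and the claims are all made up for the example.

```python
def fuse(claims, iterations=10):
    """Conflate binary votes from several sources without any training labels.

    claims: list of (source, entity, prop, vote) with vote in {0, 1}.
    Returns ({(entity, prop): probability}, {source: reliability}).
    """
    sources = {s for s, _, _, _ in claims}
    reliability = {s: 1.0 for s in sources}
    probs = {}
    for _ in range(iterations):
        # Step 1: property probability = reliability-weighted mean of the votes.
        totals, weights = {}, {}
        for s, e, p, v in claims:
            key = (e, p)
            totals[key] = totals.get(key, 0.0) + reliability[s] * v
            weights[key] = weights.get(key, 0.0) + reliability[s]
        probs = {k: totals[k] / weights[k] for k in totals}
        # Step 2: source reliability = 1 - mean disagreement with the fused values.
        errors, counts = {}, {}
        for s, e, p, v in claims:
            errors[s] = errors.get(s, 0.0) + abs(v - probs[(e, p)])
            counts[s] = counts.get(s, 0) + 1
        reliability = {s: max(1e-6, 1.0 - errors[s] / counts[s]) for s in sources}
    return probs, reliability


claims = [
    ("A", "r1", "serves_pho", 1), ("B", "r1", "serves_pho", 1), ("C", "r1", "serves_pho", 0),
    ("A", "r2", "serves_pho", 0), ("B", "r2", "serves_pho", 0), ("C", "r2", "serves_pho", 0),
    ("A", "r3", "serves_pho", 1), ("B", "r3", "serves_pho", 1), ("C", "r3", "serves_pho", 1),
]
probs, reliability = fuse(claims)
```

In this toy run, source C disagrees with the other two on one entity, so its reliability drops and the fused probability for that entity moves above the plain two-thirds majority vote.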

“Find restaurants nearby that serve pho.”

Ozlo relies on conflated properties, which can be place types, cuisines, dishes, etc., to rank answers to users’ questions. For example, to answer the question “Find restaurants nearby that serve pho”, we need to find and rank entities that are restaurants and serve pho. Ranking uses properties’ probabilities to determine the correct ordering of entities that the user sees.
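A minimal Python sketch of ranking by properties’ probabilities follows. The scoring rule (the product of the required properties’ probabilities, which treats them as independent) is an assumption made for illustration, as are the property keys and the numbers; the article does not specify the actual ranking function.

```python
def rank(entities, required):
    """Order entities by the product of the probabilities of the required properties.

    A missing property counts as probability 0.0.
    """
    def score(entity):
        s = 1.0
        for prop in required:
            s *= entity["properties"].get(prop, 0.0)
        return s

    return sorted(entities, key=score, reverse=True)


entities = [
    {"name": "A", "properties": {"type:restaurant": 0.95, "dish:pho": 0.40}},
    {"name": "B", "properties": {"type:restaurant": 0.90, "dish:pho": 0.85}},
    {"name": "C", "properties": {"type:restaurant": 0.99}},
]
ranked = [e["name"] for e in rank(entities, ["type:restaurant", "dish:pho"])]
```

Entity B ranks first because it is confidently both a restaurant and a place that serves pho, while C drops to the bottom because there is no evidence that it serves pho.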

Semantic Information

Properties’ probabilities express the fuser’s confidence that entities have these properties. However, it is frequently difficult to characterize entities in simple terms because of the large number of properties. A restaurant may serve pho, but that does not imply that it is mainly a Vietnamese restaurant. Semantic information about entities enables them to be compared semantically and allows a new class of questions to be asked. Furthermore, in addition to entities, we can also compare concepts and objects if they have semantic information.

“Find restaurants like Thaiphoon in Palo Alto in Seattle.”

Semantic information can be computed by a separate task after the fuser, or it may suffice to reuse the latent vectors that the fuser’s factorization machine has already computed. As a result, Ozlo would be able to answer questions such as “Find restaurants like Thaiphoon in Palo Alto in Seattle” or “Find restaurants like Bottomless Mimosas in Denver”. This simplifies the formulation of questions when we instinctively know what we are looking for but may not know how to describe it succinctly in words. Incidentally, the top three entities for the first question would be Thai Heaven, Phayathai Cuisine, and Thai Siam, and for the second one Great Northern, Rialto Cafe, and Altitude Restaurant.

Semantic similarities can be computed between many types of objects. Here are a few examples:

  • Mediterranean cuisine is similar to cuisines Greek, Turkish.
  • Vietnamese cuisine is similar to dishes Bun Cha, Pho, Bun Bo Hue, Bun, Bun Rieu, Banh Xeo, but dissimilar to Italian Chicken.
  • Irish Pub eatery is similar to dishes Irish Whiskey, Honey Wine.
  • Family genre is similar to genres Children, Preschool, Education, Pets, but dissimilar to adult and violence related genres.
  • Science Fiction genre is similar to genres Alien and Space, but dissimilar to religion and biography related genres.
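One common way to compute such similarities from latent vectors is cosine similarity, sketched below in Python. The choice of cosine similarity is an assumption, since the article does not name the measure, and the three-dimensional vectors are made-up values rather than real fuser output.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical latent vectors for three cuisine concepts.
vectors = {
    "cuisine:mediterranean": [0.9, 0.1, 0.2],
    "cuisine:greek": [0.8, 0.2, 0.1],
    "cuisine:vietnamese": [0.1, 0.9, 0.3],
}


def most_similar(name, vectors):
    """Return the key whose vector has the highest cosine similarity to `name`."""
    others = [(cosine(vectors[name], v), k) for k, v in vectors.items() if k != name]
    return max(others)[1]


nearest = most_similar("cuisine:mediterranean", vectors)
```

Because the measure only needs vectors, the same function works for entities, cuisines, dishes, and genres alike, as long as they are embedded in the same space.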


In production, the knowledge graph is updated daily on average. Models are re-trained and deployed when new training data arrive. As the data platform improves and matures, the lifetime of the models shortens. The deduper and resolver models can be automatically re-trained in minutes, and the fuser models are kept only for the duration of the transform stage’s execution.


Machine intelligence is not a magic bullet that will easily solve every digital assistant problem. Nonetheless, while we periodically have to support methods for humans to override a problematic behavior, we are always looking for solutions that push the limits of the state of the art. Fertile areas for new advancements are the intersections of the technology pillars. An example is the minimization of a negative log-likelihood function by stochastic gradient descent. By iterating the update equations with maximal concurrency and without locks, we achieve orders-of-magnitude speedup with additional regularization. Another example is the blurring of the boundary between supervised and unsupervised machine learning, which allows new algorithmic approaches to be developed.
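The lock-free update pattern can be sketched in Python as follows, in the spirit of published lock-free schemes such as Hogwild!. The model (a one-dimensional linear fit with squared error, which is a Gaussian negative log-likelihood up to constants), the function name `lock_free_sgd`, and all parameter values are assumptions chosen for illustration. Python threads will not show the speedup mentioned above because of the interpreter’s global lock; the sketch only demonstrates that unsynchronized updates to shared parameters still converge.

```python
import random
import threading


def lock_free_sgd(data, num_threads=4, steps=2000, lr=0.1):
    """Fit y = w[0] * x + w[1] by SGD with threads sharing w without any lock."""
    w = [0.0, 0.0]  # shared parameters, deliberately left unprotected

    def worker(seed):
        rng = random.Random(seed)
        for _ in range(steps):
            x, y = rng.choice(data)
            err = w[0] * x + w[1] - y  # may read values another thread is updating
            w[0] -= lr * err * x       # unsynchronized read-modify-write
            w[1] -= lr * err

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w


# Noise-free samples of y = 2x + 1 on a grid over [0, 1].
data = [(i / 20, 2.0 * (i / 20) + 1.0) for i in range(21)]
w = lock_free_sgd(data)
```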

In the search for better products and solutions, it makes sense to embrace progress and challenge the conventional wisdom.