Build a Trademark Search Engine with NoCode

Scott Martens
Jina AI
Sep 8, 2022 · 10 min read
Trademarks in the American food market. Source: Brett Jordan

Jina NOW is a Python-based complete solution for multimodal indexing and information retrieval. It supports several different mode pairs, but this article outlines the steps to implement a practical text-to-image and image-to-image search solution, with no code at all.

We will walk you through the steps to create your own text-to-image search engine with nothing more than a directory of images and a Jina NOW installation. If you want to skip these steps for now and just see the result, scroll down to the section titled Can’t we just skip ahead?

Requirements

  1. A UNIX-compatible computer. macOS is fine, as is Windows Subsystem for Linux, but not pure Windows. See installation advice on the Jina AI website for more information.
  2. An internet connection.
  3. An installation of Python 3.7 or higher. (Installation instructions from python.org.) Your installation of Python must include pip. This is usually installed when you install Python, but if not, follow the instructions on the Python website.
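The Python-related requirements above can be checked with a short script. This is a minimal sketch that uses only the standard library; the function name check_requirements is our own invention for illustration and is not part of Jina NOW.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the requirements above


def check_requirements(version_info=sys.version_info, platform=sys.platform):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.7 or higher is required")
    if platform == "win32":
        # Pure Windows is not supported; Windows Subsystem for Linux is.
        problems.append("pure Windows is not supported, use WSL instead")
    if importlib.util.find_spec("pip") is None:
        problems.append("pip is not available in this Python installation")
    return problems


# Example: print(check_requirements() or "Looks fine")
```

An empty list only means these three checks passed. It says nothing about your internet connection or about the optional Docker setup discussed later.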

Assembling Trademark Data

Trademark registration in the US is conducted through the US Patent and Trademark Office (USPTO). Trademark images and descriptions form a part of the public record and are available via a bulk data download interface and a public search engine.

These records are in XML and JPG formats, and it takes some work to parse them. This is not a tutorial on processing USPTO public records data, so we have done the work for you. We have extracted 16612 images of US design trademarks (as opposed to trademarks that are just slogans or names) for which there was some registration action in August 2022. You can download them from Google Drive. Unzip the downloaded file to a convenient location in your local file system. The files will be in a directory called tm_designs.
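If you prefer to unzip and sanity-check the download from Python, the sketch below is one way to do it. The function name extract_and_count is hypothetical, and you need to supply the path of the archive yourself, since its file name is not given here. The only thing taken from the description above is that the images end up in a directory called tm_designs.

```python
import zipfile
from pathlib import Path


def extract_and_count(archive_path, target_dir="."):
    """Unzip the archive into target_dir and count the JPG files in tm_designs."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    designs = Path(target_dir) / "tm_designs"
    # Count files with a .jpg extension, ignoring case and anything else present.
    return sum(1 for p in designs.rglob("*") if p.suffix.lower() == ".jpg")


# Example (replace the argument with the real path of your download):
# print(extract_and_count("path/to/downloaded.zip"))
```

If the download is complete, the count should agree with the 16612 images mentioned above.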

You can inspect the images yourself. Most are black-and-white or greyscale, but a few are in colour. The collection includes a few famous logos, for example:

Nike’s “swoosh” logo

And:

The Starbucks “Siren”

And this particularly vivid full colour logo registered to a private person:

According to the application for trademark, this image is to be used to identify a company engaged in “Technology consultation in the field of cybersecurity”

Each image is in JPG format with a white background. The name of the file is the registration number of the trademark. For example, the file 73139391.jpg corresponds to US Trademark Registration number 73139391, and looks like this:

TM#73139391: A logo registered to the Hitachi Astemo corporation.
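Because each file name is a registration number, a lookup table from registration number to image file takes only a few lines. This is a sketch under the naming convention described above; the function name registration_numbers is ours, and files whose names are not purely numeric are skipped as a precaution.

```python
from pathlib import Path


def registration_numbers(image_dir):
    """Map each trademark registration number (the file stem) to its image path."""
    return {
        path.stem: path
        for path in sorted(Path(image_dir).glob("*.jpg"))
        if path.stem.isdigit()  # skip anything that is not named by number
    }


# Example:
# images = registration_numbers("tm_designs")
# print(images.get("73139391"))
```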

Install Jina NOW

At the command line, run:

pip install jina-now

Sophisticated Python users may want to perform this installation in a virtual environment, to eliminate the risk of incompatible dependencies, but it should not be strictly necessary.
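For readers who do want a virtual environment, the standard library’s venv module can create one. The sketch below assumes the UNIX directory layout (bin/python) implied by the requirements. The function name and the environment directory are placeholders of our own choosing. The function only returns the install command and does not run it.

```python
import venv
from pathlib import Path


def install_command_for_new_env(env_dir, with_pip=True):
    """Create a virtual environment and return the pip command to run inside it."""
    venv.create(env_dir, with_pip=with_pip)
    python = Path(env_dir) / "bin" / "python"  # UNIX layout, per the requirements
    # Same package as the `pip install jina-now` line above, but run with the
    # environment's own interpreter so nothing is installed globally.
    return [str(python), "-m", "pip", "install", "jina-now"]


# Example:
# import subprocess
# subprocess.run(install_command_for_new_env("jina-now-env"), check=True)
```

After that, activate the environment in your shell before running the commands in the following sections.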

Get a Jina AI Account

Signing up for a Jina account is optional. However, signing up is free and it gives you free access to our cloud hosting for easy deployment, scaling and monitoring of your search application.

The steps described in this article will likely take less time to run on Jina’s cloud than on your own computer.

To get a Jina account, go to Jina Hub and click the login button on that webpage.

Running Jina NOW

Once Jina NOW is installed, open a command line terminal and run:

jina now start

If Jina NOW is correctly installed, you should see this in your command line terminal:

Entry screen for Jina NOW

Text to Image Search

First, we’re going to build a search engine that takes text inputs, say “eagle” or “shoe”, and finds trademark images that match.

  • Select the first option in Jina NOW: text to image search

You should see a screen like this:

  • Select the third option: excellent

We could choose a different one, but this would give us less accurate results. You should then see a screen like this:

  • Select the last option: custom

You should then see something like this:

  • Select the third option, Local path, then enter the path to the tm_designs directory from the zip file of trademark images that you downloaded from Google Drive.

You will now get a screen like this:

We recommend you select Jina Cloud. The trademark data will be uploaded to Jina’s servers, which will index them and allow you to search them via an HTTPS REST interface, and from a webpage that you can use as a search “playground.” This requires you to have a Jina account, as described in a previous section.

The alternative, Local, will do the indexing on your computer and build the search engine into a Docker container, then install it and run it in Docker. If you choose this option, you must have Docker installed and running.

Indexing and running locally will almost certainly take much more time and consume significant local resources.
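If you are considering the Local option, you can check beforehand whether Docker is installed and running. This is a minimal sketch: it looks for the docker command on your PATH and then runs docker info, which fails when the Docker daemon cannot be reached. The function name docker_is_ready is ours. Jina NOW may perform its own checks that this sketch does not reproduce.

```python
import shutil
import subprocess


def docker_is_ready(timeout=30):
    """Return True if the docker CLI is installed and the daemon responds."""
    if shutil.which("docker") is None:
        return False  # docker command not found on PATH
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0  # non-zero when the daemon is unreachable


# Example: print("Docker ready:", docker_is_ready())
```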

Let’s assume you chose Jina Cloud. You will get to a screen like this:

If you choose yes, access to the search engine will be limited to you and other users you name specifically. You should choose this option if you are working with your own data, but since this is public record data without security value, you can freely choose no.

If all steps have been executed correctly, you will get a screen like this:

Jina NOW will collect your data, upload it to the Jina Cloud and index it. This will take some time. Uploading may take several minutes, and indexing as long as a few hours.

When Jina NOW has finished uploading the data and starts indexing it, you will see a screen like this:

You should immediately take note of the ID string — c2f077f8a7 in the example above — because you will need it to access the index later.

It will likely take over an hour (and possibly several, depending on load) to finish indexing. If the Jina NOW program running on your local computer terminates, or the internet connection is lost, don’t worry. Your data is still indexing.

Accessing the Index

After indexing is done, you can query the trademark data using a REST API connected to the Jina Cloud. This REST API uses JSON for information exchange and you can build an application around it.

You can also access it via a “playground” to test how well it responds to queries. Using the ID string from above, the URL for the playground is:

https://nowrun.jina.ai/?host=grpcs://nowapi-<ID_STRING>.wolf.jina.ai&input_modality=text&output_modality=image&data=custom&utm_source=blog-trademark

Just substitute your ID string for <ID_STRING> in the above URL, and go ahead. Your index will remain installed on Jina Cloud for several days before it is automatically deleted.
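The substitution can also be done in code. The sketch below rebuilds the playground URL from an ID string, using only the query parameters that appear in the URLs in this article. The function name playground_url is our own. Note that urlencode percent-encodes the host value (grpcs%3A%2F%2F...), which is the same form used in the prebuilt image-to-image link later in this article.

```python
from urllib.parse import urlencode

PLAYGROUND = "https://nowrun.jina.ai/"


def playground_url(id_string, input_modality="text"):
    """Build the playground URL for a Jina Cloud index with the given ID string."""
    query = urlencode({
        "host": f"grpcs://nowapi-{id_string}.wolf.jina.ai",
        "input_modality": input_modality,  # "text" or "image"
        "output_modality": "image",
        "data": "custom",
        "utm_source": "blog-trademark",
    })
    return f"{PLAYGROUND}?{query}"


# Example: print(playground_url("c2f077f8a7"))
```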

Jina Cloud services are also available for commercial use. Please send an email to contact@jina.ai for specifics.

Can’t we just skip ahead?

Yes. Although the steps to build a text-to-image index are quick and simple, building an index for this many images takes some time. So, we’ve prebuilt an index for this trademark data that you can query right now:

https://nowrun.jina.ai/?host=grpcs://nowapi-6aef2e6720.wolf.jina.ai&input_modality=text&output_modality=image&data=custom&utm_source=blog-trademark

If you follow that link, you should get a page like this:

Enter text into the input field and press the Search button. For example, query for “dog and record player”:

You can see that the first result is a version of RCA’s famous His Master’s Voice logo. This search used no textual metadata. Jina’s AI recognises that the words “dog” and “record player” are good matches for the objects depicted in this image.

This playground application shows the nine best matches to your query, in order from best to worst match. Sometimes a query will return bad results because nothing in the index is a good match. Sometimes the first few matches will be good, as with “dog and record player”, and the others much poorer.
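To see why poor matches can still show up, it helps to look at how a “best k” ranking behaves in general. The sketch below illustrates a common approach in neural search: the query and every indexed item are represented as vectors, scored by cosine similarity, and sorted. It is not a description of Jina NOW’s internals, which this article does not cover, and all names in it are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec, index, k=9):
    """Return up to k (name, score) pairs, ordered from best to worst match."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Example with made-up two-dimensional vectors:
# index = {"a": [1, 0], "b": [0, 1], "c": [1, 1]}
# print(top_matches([1, 0], index, k=2))
```

A ranking like this always returns k items, however low their scores are. That is consistent with the behaviour described above, where a few good matches are followed by much weaker ones.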

Trademarks have formal text descriptions included in their filings. For example, the following is a text description of US trademark number 97112227:

The mark consists of a stylized hooded person typing on a laptop featuring an upside down squid with tentacles surrounding the person. The hood and cloak of the person is colored black featuring white and gray shading. The person’s face is colored blue and white featuring black shading. The eye is colored yellow. The hand is colored beige. The laptop is colored black, white, and gray. The red colored squid is outlined in black. The squid features blue, white, and orange shading. The squid eyes are colored yellow and black. The squid teeth are colored white and it’s mouth is colored black. Around some of the squid tentacles are black curved lines. The color white inside the tentacles represents background and/or transparent areas and is not part of the mark.

You can see what the mark looks like on the US Patent & Trademark Office website:

Let’s paste the entire text of the trademark description into Jina NOW:

You can see that it found precisely the matching trademark, based purely on intelligent image processing and parsing natural language text.

Image-to-Image Search on Trademark Data

As another practical application of Jina NOW’s search technology, let’s imagine you have a design that you would like to trademark, and want to find similar trademarks. You want to provide an image as input and find other images with similar content.

This is not a trivial problem. Imagine you provide an image of a grey dog, expecting to get back images of other dogs, and instead you get pictures of other grey things. You want your search engine to have human ideas of what “similar” means, by understanding the things depicted in the image.

You can do this with Jina NOW.

Open a command line terminal and run:

jina now start

Just as before, you should see this in your command line terminal:

Entry screen for Jina NOW

Move the cursor to the third choice, image to image search, and press enter.

Select “image to image search”

Then follow all the same steps as before for text-to-image search. When Jina Cloud has finished indexing, you can access your index via the URL:

https://nowrun.jina.ai/?host=grpcs://nowapi-<ID_STRING>.wolf.jina.ai&input_modality=image&output_modality=image&data=custom&utm_source=blog-trademark

Just substitute your ID string for <ID_STRING> in the above URL.

For this use case, we have also prepared a prebuilt index of the trademark data that you can use right away.

https://nowrun.jina.ai/?host=grpcs%3A%2F%2Fnowapi-a2a1fa0e46.wolf.jina.ai&input_modality=image&output_modality=image&data=custom&utm_source=blog-trademark

You can drag an image into this page, or browse your files to find one, and then find best matches in the trademark database.

For example, querying using a circa 1890 photographic version of the His Master’s Voice logo (affectionately known as “Nipper” from the name of the dog):

Jina NOW is able to recognise the objects in images, even when drawn or from very old photographs, and provide more intuitively human matches by basing its search results on its object recognition. This is very helpful in identifying stylistically close matches. For example, a parody of the famous Starbucks logo (found on Pinterest):

Even though there is a large difference in visible artefacts between the parody logo and the version of the original that happens to be stored in August 2022 Trademark Office records, Jina NOW is still able to find the correct match.

Try for yourself and join the Jina Community

Now that you see the power of Jina’s neural search framework, and the ease of constructing indexes and serving search results, you can experiment with your own data and decide if Jina NOW can add value to your business.

We would be happy to hear from you and talk about your use case. You can join our rapidly growing user community on Slack.

Learn more

Want to dig more into the Jina ecosystem? Here are some resources:
