Laravel Scout (full-text search) P2: drivers & limitations

A deep dive into the Laravel Scout package, from beginner to advanced. Part 2 shows you the limitations you might encounter and the differences between the drivers.

Jori STEIN
9 min read · Feb 9, 2023

This is a three-part article series; make sure to read the parts in order to fully understand the topic:

Part 1 — Laravel Scout (full-text search) P1: Installation, configuration & searching (TntSearch)

Part 2 (this article) — Laravel Scout (full-text search) P2: drivers & limitations

Part 3 — Laravel Scout (full-text search) P3: combining search, filter and ordering

Summary

  • Why do we even need Laravel Scout
  • How a search is performed
  • Drivers in depth
  • Driver differences

Why do we even need Laravel Scout

First of all, let's see how you would search your data without the Scout package. As you have probably guessed, and probably done at some point, you can search directly in your MySQL/MariaDB database with the SQL LIKE operator:

$search = "scout";

$articles = Article::where('title', 'LIKE', '%'.$search.'%')
    ->get();

Which will generate the following SQL statement:

SELECT * FROM articles WHERE `title` LIKE "%scout%";

It will find all articles whose title contains the exact string "scout". This works, and it is fine to do so. However, it's important to understand that this approach has some limitations:

  1. You are giving too much responsibility to your controller: you shouldn't define the list of columns to search on directly in your controller. First because it's harder to maintain, but more importantly because this should be abstracted into a separate service and configured somewhere else.
  2. Results are too strict: if the user makes a typo, they will not find what they are looking for. The reality of today's world is that users make mistakes and it's up to the program to adapt, not the other way around. Think of how many times Google displays "Did you mean: xxxx" under your input and you click on it, because you know it is correct.
  3. You might have performance or I/O issues if you have lots of data. To evaluate a LIKE condition with a leading wildcard (%scout%), MySQL cannot use a regular index and has to read every single row of your table (this is called "scanning"). So if you have 2 million rows and the query returns only 2 results, the other 1,999,998 rows have still been scanned. Depending on the size of your table and of your text column, you will see performance drops.
  4. Results do not take into account the "relevancy" of each result. Let me explain: without an ORDER BY clause, the rows come back in whatever order the database reads them (typically insertion order), so if you search for "Ben", the "Benjamin"s might appear before the first "Ben", which is probably not what the user expects.
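To make the "too strict" limitation concrete, here is a minimal sketch in plain PHP, with no database involved. The titles and the helper functions `likeMatch()` and `fuzzyMatch()` are hypothetical names made up for this illustration, and the edit-distance check is only a rough stand-in for the typo tolerance a real search engine offers:

```php
<?php

// Hypothetical titles standing in for rows of an `articles` table.
$titles = ['Laravel Scout in depth', 'Queues explained', 'Scouting for bugs'];

/** Strict substring match, roughly what LIKE '%term%' does (case-insensitive). */
function likeMatch(array $titles, string $term): array
{
    return array_values(array_filter(
        $titles,
        fn (string $title): bool => stripos($title, $term) !== false
    ));
}

/** Typo-tolerant match: keep titles with a word within $maxDistance edits of the term. */
function fuzzyMatch(array $titles, string $term, int $maxDistance = 1): array
{
    return array_values(array_filter(
        $titles,
        function (string $title) use ($term, $maxDistance): bool {
            foreach (preg_split('/\s+/', strtolower($title)) as $word) {
                if (levenshtein($word, strtolower($term)) <= $maxDistance) {
                    return true;
                }
            }

            return false;
        }
    ));
}

// The user typed "scoot" instead of "scout".
$strict = likeMatch($titles, 'scoot');  // no title contains "scoot"
$fuzzy = fuzzyMatch($titles, 'scoot');  // "scout" is one edit away from "scoot"
```

With the typo, the strict version finds nothing, while the tolerant version still finds the article about Scout.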

In conclusion, without Scout you are constrained to searching through your database, which is probably fine for testing or for your local environment. But a plain LIKE query on MySQL or MariaDB isn't performant enough at scale, and users will not be satisfied with the search results.

This is where Laravel Scout comes in: it delegates the task of searching to a dedicated service, and it lets you choose between several compatible drivers.

How a search is performed

When searching your data, you are adding a new step: the search itself (obviously). This has to be done before the data is retrieved from your database. In a nutshell, here's what happens:

  1. A search is initiated when ::search('john') is called, and the query is sent to the selected driver.
  2. The driver performs the search on its indexed data.
  3. The driver builds a list of candidates and returns it as a list of primary keys.
  4. The data is then retrieved from your database by constraining the SQL query to the list of candidates with a WHERE IN clause.
  5. Finally, the data is returned as an Eloquent collection to the code that called ::search('john').
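The steps above can be sketched in plain PHP. This is only an illustration of the flow, not Scout's implementation: the `$usersTable` and `$index` arrays and the functions `driverSearch()` and `fetchByIds()` are hypothetical stand-ins for your database, the driver's index and the WHERE IN query.

```php
<?php

// Hypothetical stand-in for the `users` table, keyed by primary key.
$usersTable = [
    1 => ['id' => 1, 'name' => 'Alice Martin'],
    2 => ['id' => 2, 'name' => 'John Doe'],
    3 => ['id' => 3, 'name' => 'Johnny Cash'],
];

// Hypothetical stand-in for the driver's index: primary key => searchable text.
$index = [1 => 'alice martin', 2 => 'john doe', 3 => 'johnny cash'];

/** Steps 2 and 3: the "driver" searches its index and returns primary keys only. */
function driverSearch(array $index, string $query): array
{
    $ids = [];
    foreach ($index as $id => $text) {
        if (strpos($text, strtolower($query)) !== false) {
            $ids[] = $id;
        }
    }

    return $ids;
}

/** Step 4: the equivalent of WHERE id IN (...), keeping the driver's order. */
function fetchByIds(array $table, array $ids): array
{
    $rows = [];
    foreach ($ids as $id) {
        if (isset($table[$id])) {
            $rows[] = $table[$id];
        }
    }

    return $rows;
}

// Step 1: the search is initiated with the user's query.
$ids = driverSearch($index, 'john');

// Step 5: the full rows are returned to the caller.
$results = fetchByIds($usersTable, $ids);
```

Note that the driver never returns full rows: it only hands back the keys, and the database remains the source of truth for the data itself.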

Drivers in depth

Let's now go over each compatible driver for Laravel Scout, understand how they work, what they offer and their pros and cons.

Reminder: A driver in Laravel is a way of telling your application what service it should use during runtime without changing your code base. You've already configured some drivers, like your cache, your queue connection, and others.

Laravel Scout supports multiple drivers out of the box. In your .env file, you may set SCOUT_DRIVER to the driver of your choice. Here's the list:

  • null — This driver does nothing: it will not index your data and will always return an empty search result. Think of it as "disabled".

Pros:

- In an extreme case where your current search engine is crashing your application, disabling it temporarily with this driver could be a solution.

Cons:

- As this does literally nothing, there are no cons.

  • collection — This driver will not index your data. Instead, it brute-forces the search by looping over every model and trying to find a match. This means it needs to load every model from your database, because it needs to call the toSearchableArray() function of each model. It can be used for development purposes; simply make sure you don't seed too much data.

Pros:

- This driver is perfect for running tests and for your development environment. It is very predictable: it relies on simple string matching rather than a ranking algorithm, so the same query always returns the same results.

Cons:

- It is not performant at all as it will slow down your application by a large factor.

- Results are not relevant: it doesn't take typos into account, and results are returned in the order they are saved in the database.

  • database — This driver delegates the search to your database by using SQL statements (so WHERE col LIKE x or MATCH (col) AGAINST (x)). Data doesn't need to be indexed by Scout, as this is managed by your database.

Pros:

- This driver is perfect for small to medium projects: your database is already optimised to handle large amounts of data.

- It is compatible with MySQL and PostgreSQL full-text indexes which, in case you don't know them, greatly improve the performance of the query but change how the search works (I personally don't like full-text indexes and don't recommend using them, but this is out of the scope of this article, so it's up to you to experiment).

Cons:

- Results are not relevant: it doesn't take typos into account, and results are returned in the order they are saved in the database.

- ⚠️ Keys returned by toSearchableArray() must exist as columns in your model. This means you cannot search from a relation or an aggregated value. There is a way to circumvent this issue though (see Part 3).

  • algolia — Algolia is a great, powerful and easy-to-use service with lots of available configuration. It is a paid service, but it does have a free plan, which might be enough for a small project or for local testing. They host everything on their servers, so you don't need to install and run any external services.

Pros:

- There are no servers or services to maintain, which makes it easier to build your local and production environments.

- Algolia offers lots of parameters to tweak, like synonyms, ranking, A/B testing and so much more.

- Search can be done from your front-end directly to Algolia, skipping Laravel altogether. This will require some changes to your implementation though.

Cons:

- You might end up paying if you have lots of data or run lots of queries.

- Your search will be slower if you go through Laravel Scout, as two requests are made: one to your server, and a second one from your server to Algolia's servers (which might really be negligible).

  • meilisearch — Meilisearch is an open source search engine with lots of potential. You may install it on your own server or host it through their paid service. As there is no interface, all parameters can be defined in a Laravel config file and then synced with a simple command.

Pros:

- It can handle a large amount of data, so you are unlikely to run into performance issues.

- All configured parameters follow your project, as you can save them in your config/scout.php file. No need to export and import them between environments.

- Results are very good and relevant, especially for an open source project.

Cons:

- It requires you to boot a Meilisearch instance, which adds an extra layer to your deployment process. It is also another service you'll install on your computer and forget about for the next few years.

  • tntsearch — TNTSearch is a project from the teamtnt team. It is a lightweight PHP solution and a good enough search engine. Everything is indexed locally within your application, so there is nothing to install. This is the only driver from this list that is not natively supported by Laravel.

Pros:

- Everything is done locally, so if you can run your Laravel application, you can use this driver with no extra configurations or installations. It takes into account typos and has a few features, like boolean search.

Cons:

- As index files are stored locally on your instance, you cannot use a load balancer, because the files won't be shared between multiple instances. But in truth, if you need a load balancer, your application is probably big enough that you should consider using a more powerful driver.

- Results can sometimes be unexpected. There are some options you can tweak, but in my personal experience I have never found a perfect fit. The impact is negligible, though.
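To illustrate why the collection driver described above gets slow as your data grows, here is a minimal sketch of the brute-force idea in plain PHP. The `Post` class and the `collectionSearch()` function are hypothetical and much simpler than what Scout actually does; the point is only that every model has to be loaded and inspected for each search.

```php
<?php

// Hypothetical minimal model; a real one would be an Eloquent model.
final class Post
{
    public function __construct(public int $id, public string $title)
    {
    }

    public function toSearchableArray(): array
    {
        return ['id' => $this->id, 'title' => $this->title];
    }
}

/** Brute force: visit every model and keep those with a value containing the query. */
function collectionSearch(array $models, string $query): array
{
    $matches = [];
    foreach ($models as $model) {
        foreach ($model->toSearchableArray() as $value) {
            if (stripos((string) $value, $query) !== false) {
                $matches[] = $model;
                break; // one matching value is enough for this model
            }
        }
    }

    return $matches;
}

$posts = [
    new Post(1, 'Laravel Scout'),
    new Post(2, 'Eloquent tips'),
    new Post(3, 'Scout drivers'),
];

$found = collectionSearch($posts, 'scout');
$foundIds = array_map(fn (Post $post): int => $post->id, $found);
```

The cost grows linearly with the number of models, which is fine for a seeded test database but not for production data.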

Driver differences

Finally, I want to go over the differences between the drivers and the constraints you might encounter. Laravel is built in such a way that you can change drivers without updating your codebase. For example, you can change your cache driver to database, array, file or redis at any moment, and it will work without any code changes.

The same is not true for Laravel Scout drivers: you cannot switch to any driver at any given time, as there are some key differences that make them incompatible. This means it is important to decide on one driver to focus on and to develop your code around it.

Let me show you with some examples:

  1. Imagine you need to index a column from a relation because that's how your database is built:

<?php

namespace App\Models;

use App\Traits\Searchable;
use Illuminate\Database\Eloquent\Model;

class Post extends Model
{
    use Searchable;

    public function toSearchableArray()
    {
        return [
            'id' => $this->id,
            'title' => $this->title,
            // Comes from the related Meta model, not from the posts table
            'short_description' => $this->metas?->short_description,
        ];
    }

    public function metas()
    {
        return $this->hasOne(Meta::class);
    }
}

This will work for every driver except database. You will get an SQL error because the database won't find the column "short_description" in the posts table. This is because the database driver doesn't index the values; instead, it only takes the list of keys and sends them to your database in an SQL statement like the following:

SELECT * FROM posts
WHERE `title` LIKE "%searchQuery%"
OR `short_description` LIKE "%searchQuery%"

The database will report that the "short_description" column does not exist because, as we just saw, this column lives on another table that is reached through a relation.
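The mechanism can be sketched in a few lines of plain PHP. This is a simplified illustration and not Scout's actual query builder: the `buildLikeQuery()` function and the `$postsColumns` list are hypothetical, and the real driver handles more cases. It shows how every key of the searchable array ends up as a column name in the query.

```php
<?php

/**
 * Simplified illustration: build a LIKE query from the keys of a searchable array.
 * This is not Scout's real implementation, only the idea behind it.
 */
function buildLikeQuery(string $table, array $searchable): array
{
    $columns = array_keys($searchable);
    $clauses = array_map(
        fn (string $column): string => "`{$column}` LIKE ?",
        $columns
    );

    return [
        'sql' => "SELECT * FROM `{$table}` WHERE " . implode(' OR ', $clauses),
        'columns' => $columns,
    ];
}

// Columns that really exist on the hypothetical `posts` table.
$postsColumns = ['id', 'title', 'created_at'];

$query = buildLikeQuery('posts', [
    'title' => 'My post',
    'short_description' => 'Comes from the metas relation',
]);

// Any key that is not a real column makes the database reject the query.
$unknown = array_values(array_diff($query['columns'], $postsColumns));
```

Here `$unknown` contains "short_description", which is exactly the column the database would complain about.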

2. Searching with the value null or '' (empty string) will actually return all items and not an empty array. This is very important, as this will often be the default value when loading your page or when the user clears the field. The issue comes with the tntsearch driver: for some reason, this driver returns an empty array if null or '' (empty string) is given.

User::factory()->count(15)->create();

$search = '';

// meilisearch
User::search($search)->get()->count(); // 15

// tntsearch
User::search($search)->get()->count(); // 0

// Instead, with the tntsearch driver you need
// to search conditionally, which honestly
// is not a good idea
if ($search) {
    User::search($search)->get()->count();
} else {
    User::query()->get()->count();
}

3. This last one is related to something we will see in Part 3, but I know you are smart enough to understand it already.

With algolia and meilisearch, it is possible to configure your indexed data so that you search only on certain columns and use the others for sorting (or ordering) your results. This means you are indexing data for purposes other than searching. Imagine the following example:

<?php

namespace App\Models;

use App\Traits\Searchable;
use Illuminate\Database\Eloquent\Model;

class Post extends Model
{
    use Searchable;

    public function toSearchableArray()
    {
        return [
            'id' => $this->id,
            'title' => $this->title,
            'created_at' => $this->created_at,
        ];
    }
}

You can configure algolia or meilisearch to search only on the column "title" even though you have also provided the column "created_at". Now if you use another driver (for your local environment, for example), you will suddenly be searching on all columns, so results will be very different:

Post::factory()->count(15)->create(['created_at' => '2023-01-01']);

$search = '2023';

// meilisearch (configured to search on "title" only)
Post::search($search)->get()->count(); // 0

// database (searches on every column, including "created_at")
Post::search($search)->get()->count(); // 15
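As a sketch of what such a configuration could look like with Meilisearch, recent versions of Scout let you declare index settings in config/scout.php. The excerpt below assumes the Post model from the example above and a Scout version that supports the index-settings key; the attribute lists are illustrative choices, not recommendations.

```php
<?php

// config/scout.php (excerpt)

return [
    // ...

    'meilisearch' => [
        'host' => env('MEILISEARCH_HOST', 'http://localhost:7700'),
        'key' => env('MEILISEARCH_KEY'),
        'index-settings' => [
            \App\Models\Post::class => [
                // Only these attributes are matched against the search query
                'searchableAttributes' => ['title'],
                // Indexed for ordering only, never searched
                'sortableAttributes' => ['created_at'],
            ],
        ],
    ],
];
```

The settings are then pushed to Meilisearch with the php artisan scout:sync-index-settings command. Other drivers ignore this block entirely, which is why the same search behaves differently once you switch driver.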

Conclusion

Laravel has found a way to make multiple drivers compatible with your code, but we've seen that there are some key differences that force you to decide early on which driver you will be using. You should also start to realize that such a feature comes with caveats that you have to work around in your code, so keep your search code as simple as possible.
