Array-Based Search in the Spring Framework

Tinytella(トリン)
Goalist Blog
Published in
6 min readOct 6, 2023

“Well, I guess it’s safe to say that you’ve encountered this requirement for a search API at least once — finding objects by an array of keywords.”

Even so, I still can’t seem to remember it and make the development process less painful each time. Therefore, I would like to share some common cases through this article.

Technologies I am using:

  • Java 17
  • Spring Boot 3.0.6

1. JPQL Queries

If the condition is not too complex, we can simply write it in JPQL like this:

@Query(value = """
from manuscripts m join post_labels lb
on m.id = lb.manuscript.id where lb.label.id in :featureIds
""")
List<Manuscript> findByFeatureLabelsIn(List<Long> featureIds);

Notes: Make sure that the List parameter do not have NULL case, otherwise you might get wrong result.

If you don’t have the option to avoid passing NULL values through the List parameters, you might consider using COALESCE.

FYI, the COALESCE() function returns the first non-null value in a list.

JQPL

However, it doesn’t appear that COALESCE works well with JPQL. 🤷

Here is an error message you might have seen in this case.

06-10-2023 20:44:19.963 ERROR [http-nio-8080-exec-3] --- o.h.e.jdbc.spi.SqlExceptionHelper.logExceptions [138] : Operand should contain 1 column(s)
org.springframework.dao.DataIntegrityViolationException: JDBC exception executing SQL [select r1_0.id,r1_0.title from manuscripts r1_0 join post_labels m1_0 on r1_0.id=m1_0.manuscript_id where (r1_0.is_deleted=0) and (coalesce((?,?)) is null or m1_0.label_id in(?,?))];

2. Native Queries

Don’t give up; fortunately, you can write it in a Native Query. 💁

For those who haven’t known yet, we can add ‘nativeQuery = true’ to switch JPQL into Native Query mode.

Native Query

Well, it looks good with Native Query, right? At least, we don’t see the red line in syntax by IntelliJ compiler. However, it’s different from what we expected; it still gives us an error.

06-10-2023 21:04:27.182 WARN  [http-nio-8080-exec-4] --- o.h.e.jdbc.spi.SqlExceptionHelper.logExceptions [133] : SQL Error: 1064, SQLState: 42000
06-10-2023 21:04:27.182 ERROR [http-nio-8080-exec-4] --- o.h.e.jdbc.spi.SqlExceptionHelper.logExceptions [138] : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'null' at line 4
org.springframework.dao.InvalidDataAccessResourceUsageException: JDBC exception executing SQL [ select m.*
from manuscripts m join post_labels lb
on m.id = lb.manuscript_id where
coalesce(?) is null or lb.label_id in ?
]; SQL [n/a]

It is showing that the statement still doesn’t accept NULL after the IN operator. Let’s pass the SQL statement that we obtained from the console into the SQL Editor to figure it out.

MySQL Workbench

Okay, so here is the memorable thing to save you time when working with COALESCE and the IN operator: please make sure you wrap your parameter inside parentheses like this (:featureIds).

    @Query(value = """
select m.*
from manuscripts m join post_labels lb
on m.id = lb.manuscript_id where
coalesce(:featureIds) is null or lb.label_id in (:featureIds)
""", nativeQuery = true)
List<Manuscript> findByFeatureLabelsIn(List<Long> featureIds);

The fun fact is, I used to use COALESCE in Java 11, and when I switched to Java 17, I thought maybe it no longer works with the current version (17). So, it took me a while to struggle in finding other ways before I came back to realize what the problem was with this powerful function (in my opinion).

Up to now, you might think that Native Query is enough. Have you ever had to write a search SQL with more than 10 fields that need to be compared? Well, it’s not a good idea to write it all in one query, I guess.

3. Advanced Spring Data JPA — Specifications

Besides the two methods mentioned above, you may already be familiar with the Criteria API.

As the author is saying, to be able to define reusable Predicates, they have introduced the Specification interface. Although it’s not stable yet, it’s still worth giving it a try. 😉

Of course, there are many interesting aspects that you can explore, but in this post, I will just focus on how to deal with List parameters.

Before using it, I will have to add it to the Repository layer. Here, I’ll give you an example of how complex SQL I have to write, and then I’ll transform it into Specifications.

@Repository
public interface ManuscriptRepository extends JpaRepository<Manuscript, Long>, JpaSpecificationExecutor<Manuscript> {

@Query(value = """
select m.* from manuscripts m
join occupations occ on m.occupation_id = occ.id
join company_locations loc on m.location_id = loc.id
join post_labels lb on m.id = lb.manuscript_id
join companies c on c.id = loc.company_id
where m.start_date <= current_time() and m.public_status = 'PUBLIC'
and (:keywords is NULL or m.title REGEXP (:keywords))
and (:keywords is NULL or m.job_description REGEXP (:keywords))
and (:keywords is NULL or loc.address REGEXP (:keywords))
and (:keywords is NULL or c.companyName REGEXP (:keywords))
and (:keywords is NULL or c.introduction REGEXP (:keywords))
and (:keywords is NULL or c.summary REGEXP (:keywords))
and (:companyId is NULL or c.id = :companyId)
and (COALESCE(:occupations) is NULL or m.small_occupation_id in (:occupations))
and (COALESCE(:prefectures) is NULL or loc.location in (:prefectures))
and lb.manuscript_id in (
select distinct label2.manuscript_id from post_labels label2
where (COALESCE(:featureIds) is NULL OR label2.label_id in (:featureIds))
group by label2.manuscript_id
having count(label2.label_id) >= :featureCount)
""", nativeQuery = true)
List<Manuscript> countManuscriptBySearchCondition(
String keywords,
Set<Long> smallOccupationIds,
Set<String> prefectures,
Set<Long> featureIds,
Long companyId,
int featureCount
);
}
  1. Specification Definition
public class ManuscriptSpecification {
private ManuscriptSpecification() {
}

//TODO: write many predicates depending on how many conditions we have
public static Specification<Manuscript> keywords(List<String> keywords) {
// return your predicates
}

public static Specification<Manuscript> publicStatus(ManuscriptStatus status) {
// return your predicates
}

...

public static Specification<Manuscript> occupationIds(Set<Long> smallOccupationIds) {
// return your predicates
}

2. Sub Queries

There is a part where I have to select data from another result table, so we have to create a subquery.

 public static Specification<Manuscript> featureIdIn(Set<Long> featureIds) {
return featureIds == null ? null : (root, query, cb) -> {
Subquery<Long> subQuery = query.subquery(Long.class);
Root<PostLabel> subRoot = subQuery.from(PostLabel.class);

subQuery.select(subRoot.get("manuscript").get("id"));
subQuery.groupBy(subRoot.get("manuscript").get("id"));
subQuery.having(cb.greaterThanOrEqualTo(cb.count(subRoot), cb.literal((long) featureIds.size())));
Predicate labelIdPredicate = subRoot.get("label").get("id").in(featureIds);
subQuery.where(labelIdPredicate);

return cb.in(root.get("id")).value(subQuery);
};
}

The PostLabels table represents a many-to-many relationship between the Manuscripts and Labels tables.

public class PostLabel{

@EmbeddedId
private PostLabelEmbeddedID id;

@ManyToOne
@MapsId("manuscriptId")
Manuscript manuscript;

@ManyToOne
@MapsId("labelId")
PostLabel label;
}
public class PostLabelEmbeddedID implements Serializable {

@ManyToOne
Manuscript manuscript;

@ManyToOne
PostLabel label;
}

3. Search LIKE with many keywords

I had a headache trying to find a solution for searching articles that contain one of these keywords in its title, job description, location address, or company description, etc. I had been looking for an answer on how to combine the IN operator with a wildcard character ‘%s%’ but there was no hope for that.

And I found this “REGEXP” operator instead.

REGEXP in MySQL serves as a potent tool for pattern matching in text, capable of matching nearly any pattern within a string.

Here is the format for its usage:

SELECT * FROM manuscripts where title REGEXP '123|新業務|エンジニア|TX, USA';
  • And here is how I implemented it in the Specification class:
public static Specification<Manuscript> keywords(List<String> keywords) {
return keywords == null ? null : (root, query, cb) -> {
Expression<String> searchText = cb.literal(wrapKeywords(keywords));
Predicate[] predicates = {
cb.isTrue(cb.function("regexp_like", Boolean.class, root.get("title"), searchText)),
cb.isTrue(cb.function("regexp_like", Boolean.class, root.get("jobDescription"), searchText)),
cb.isTrue(cb.function("regexp_like", Boolean.class, root.get("location").get("company").get("summary"), searchText)),
cb.isTrue(cb.function("regexp_like", Boolean.class, root.get("location").get("company").get("companyName"), searchText)),
cb.isTrue(cb.function("regexp_like", Boolean.class, root.get("location").get("company").get("introduction"), searchText)),
cb.isTrue(cb.function("regexp_like", Boolean.class, root.get("location").get("address"), searchText))
};
return cb.or(predicates);
};
}
public static String wrapKeywords(List<String> values) {
return values == null ? "" : String.join("|", values);
}

Summary

Thank you for taking the time to read! Below, I have gathered several important points for you.

  1. COALESCE() is your friend: When dealing with potential NULL values in your array parameters, consider using COALESCE(). It can save you from unexpected hiccups in your queries.
  2. Native Query: If you find yourself facing complex SQL scenarios, don’t hesitate to switch to Native Query mode. It might just be the solution you need to tackle those intricate searches.
  3. Criteria API and Specifications: If the SQL has something that can be reused, consider using these tools to make your code cleaner and more maintainable.
  4. Pattern Matching with REGEXP: If you’re dealing with pattern matching in text, and the LIKE operator isn’t providing the assistance you need, remember that REGEXP in MySQL can be your go-to tool.

Hopefully, these tips should help you save your time with the search function. Happy coding, and may your searches always yield the results you seek!

See you! 🙋

--

--