Full Text Search
How to use full text search in PostgreSQL.
Postgres has built-in functions to handle
Full Text Search queries. This is like a "search engine" within Postgres.
For this guide we'll use the following example data:
|1||The Poky Little Puppy||Janette Sebring Lowrey||Puppy is slower than other, bigger animals.|
|2||The Tale of Peter Rabbit||Beatrix Potter||Rabbit eats some vegetables.|
|3||Tootle||Gertrude Crampton||Little toy train has big dreams.|
|4||Green Eggs and Ham||Dr. Seuss||Sam has changing food preferences and eats unusually colored food.|
|5||Harry Potter and the Goblet of Fire||J.K. Rowling||Fourth year of school starts, big drama ensues.|
The functions we'll cover in this guide are:
Converts your data into searchable "tokens".
to_tsvector() stands for "to text search vector". For example:
_10select to_tsvector('green eggs and ham');_10-- Returns 'egg':2 'green':1 'ham':4
Collectively these tokens are called a "document" which Postgres can use for comparisons.
Converts a query string into "tokens" to match.
to_tsquery() stands for "to text search query".
This conversion step is important because we will want to "fuzzy match" on keywords. For example if a user searches for "eggs", and a column has the value "egg", we probably still want to return a match.
@@ symbol is the "match" symbol for Full Text Search. It returns any matches between a
to_tsvector result and a
Take the following example:
_10select *_10from books_10where title = 'Harry';
The equality symbol above (
=) is very "strict" on what it matches. In a full text search context, we might want to find all "Harry Potter" books and so we can rewrite the
_10select *_10from books_10where to_tsvector(title) @@ to_tsquery('Harry');
Basic Full Text Queries#
Search a single column#
To find all
books where the
description contain the word
Search multiple columns#
To find all
title contain the word
_10select_10*_10from_10books_10where_10to_tsvector(description || ' ' || title) -- concat columns, but be sure to include a space to separate them!_10@@ to_tsquery('little');
Match all search words#
To find all
description contains BOTH of the words
big, we can use the
_10select_10*_10from_10books_10where_10to_tsvector(description)_10@@ to_tsquery('little & big'); -- use & for AND in the search query
Match any search words#
To find all
description contain ANY of the words
big, use the
_10select_10*_10from_10books_10where_10to_tsvector(description)_10@@ to_tsquery('little | big'); -- use | for OR in the search query
Notice how searching for
big includes results with the word
Now that we have Full Text Search working, let's create an
index. This will allow Postgres to "build" the documents pre-emptively so that they
don't need to be created at the time we execute the query. This will make our queries much faster.
Let's create a new column
fts inside the
books table to store the searchable index of the
We can use a special feature of Postgres called
to ensure that the index is updated any time the values in the
description columns change.
_10alter table_10books_10add column_10fts tsvector generated always as (to_tsvector('english', description || ' ' || title)) stored;_10_10create index books_fts on books using gin (fts); -- generate the index_10_10select id, fts_10from books;
Search using the new column#
Now that we've created and populated our index, we can search it using the same techniques as before:
_10select_10*_10from_10books_10where_10fts @@ to_tsquery('little & big');
Visit PostgreSQL: Text Search Functions and Operators
to learn about additional query operators you can use to do more advanced
full text queries, such as:
The proximity symbol is useful for searching for terms that are a certain "distance" apart.
For example, to find the phrase
big dreams, where the a match for "big" is followed immediately by a match for "dreams":
_10select_10*_10from_10books_10where_10to_tsvector(description) @@ to_tsquery('big <-> dreams');
We can also use the
<-> to find words within a certain distance of eachother. For example to find
school within 2 words of each other:
_10select_10*_10from_10books_10where_10to_tsvector(description) @@ to_tsquery('year <2> school');
The negation symbol can be used to find phrases which don't contain a search term.
For example, to find records that have the word
big but not
_10select_10*_10from_10books_10where_10to_tsvector(description) @@ to_tsquery('big & !little');