PostgreSQL Database supports Full Text Search (FTS)
Full Text Search (or just text search, or just FTS) in databases (as in PostgreSQL) provides the capability to identify natural-language documents that satisfy a query, and optionally to sort them by relevance to the query. PostgreSQL have a good FTS support since version 9 with many improvements in the last versions: 11 and 12. This article will show you how to configure and use the FTS features on PostgreSQL database.
An introduction to Full-Text Search
(...) full-text search refers to techniques for searching a single computer-stored document or a collection in a full text database; (...) distinguished from searches based on metadata or on parts of the original texts represented in databases (such as titles, abstracts, selected sections, or bibliographical references)
Thus, the Full-Text Search offers accurate search results. A Full-Text Search query allows you to search for words inside text data.
Full-Text Search using PostgreSQL
PostgreSQL offers a full featured Full-Text Search
FTSengine since version 9.6. In later versions some improvements has been added to the FTS engine.
Full-Text Search Initialization in PostgreSQL
To start using the Full-Text Search within a PostgreSQL database it is required to initialize a
FTSdictionary for each language that will be used to search. In this example
englishfor which the
english_stem_nostop(english, no stop words) dictionary available in PostgreSQL will be used. The following queries need to be run over the target database that will be used for searches.
-- initialize the FTS english_stem_nostop within the target database CREATE TEXT SEARCH DICTIONARY public.english_stem_nostop(Template = snowball, Language = english); -- create the FTS search configuration within the target database CREATE TEXT SEARCH CONFIGURATION public.english_nostop ( COPY = pg_catalog.english ); -- add some basic parsers to the FTS search configuration within the target database ALTER TEXT SEARCH CONFIGURATION public.english_nostop ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, hword, hword_part, word WITH english_stem_nostop;
Full-Text Search Setup example, for table my_table
Once the above initialization has been completed it is the time to create a FTS dedicated field for searches. It can be done without, by only using an index, but in this example we will create a separate special FTS field in order to be able to visualize the FTS data.
-- 1. the original table (before adding FTS) will be modified as in example below CREATE TABLE my_table ( id character varying(10) NOT NULL, description text NOT NULL, CONSTRAINT my_table__chk__id CHECK ((char_length((id)::text) = 10)) ); -- 2. we will modify the table from above by creating an extra FTS index for the description field to be used on FTS searches CREATE INDEX my_table__idx__fts_field ON my_table USING gin (to_tsvector('public.english_nostop'::regconfig, description));
In the variant 1. we can only do approximative searches using
ILIKEover the text which cannot be sure it covers full words or even native english words only.
-- approximative search (case sensitive) SELECT * FROM my_table WHERE description LIKE '%example%'; -- approximative search (case insensitive) SELECT * FROM my_table WHERE description ILIKE '%example%';
In the variant 2. we can use the
FTS(Full-Text Search) way of searching results to deliver realistic matches and accurate search results ; because the above setup was using the No-Stop dictionary the stop words will be ignored.
-- basic: searching the word 'example' -- will deliver better results than LIKE or ILIKE SELECT * FROM my_table WHERE to_tsvector('english_nostop', "description") @@ plainto_tsquery('english_nostop', 'example'); -- advanced: searching the word 'example' and return first the results that have higher matching scores -- will deliver even better results than LIKE or ILIKE or FTS above SELECT * FROM my_table WHERE to_tsvector('english_nostop', "description") @@ plainto_tsquery('english_nostop', 'example') ORDER BY ts_rank_cd(to_tsvector('english_nostop', "description"), plainto_tsquery('english_nostop', 'example')) DESC;
FTSFull-Text Search delivers faster and accurate results comparing with
ILIKEbecause each valid english word available in the target field is already indexed.
to_tsvectormethod is used togehter with
plainto_tsqueryas in the above examples and sometimes with
ts_rank_cdto deliver the optimized results. Instead
plainto_tsquerywhich takes the text as is, plain, the
phraseto_tsquerycan be used which are not explained here (see the PostgreSQL documentation).
- The stop words such as and, the (and similar) will be ignored when searching phrases using
FTSis the recommended way to do searches whenever is possible
- There are also limitations for using
- not all languages have
FTSdictionaries available with PostgreSQL ...
- number of lexemes must be less than 264 ; no more than 256 positions per lexeme ; position values in tsvector must be greater than 0 and no more than 16383 and some others
- not all languages have