How to Split a String by Space in PostgreSQL Using a Custom Function

Programming & Development / April 18, 2025

How to Split a String by Space in PostgreSQL Using a Custom Function

postgres postgresql split string plpgsql function string_to_array unnest split by space text processing postgres custom function postgres string manipulation plpgsql tutorial

Copy Link Bookmark Print

Article:

When working with text data in PostgreSQL, it's often useful to split a string into individual words—for example, separating user input or processing tags and labels. PostgreSQL provides several built-in functions like string_to_array() and unnest() that make this easy, and you can wrap them into a custom PL/pgSQL function for reuse.

🎯 Goal

We want to create a PostgreSQL function that:

Accepts a single input string
Splits it into words based on spaces
Returns each word as a row in a result set

🛠️ Creating the Function

Here's the function definition using PL/pgSQL:

sql

CREATE OR REPLACE FUNCTION split_string_by_space(input_string TEXT)
RETURNS TABLE(word TEXT) AS $$
BEGIN
    RETURN QUERY
    SELECT unnest(string_to_array(input_string, ' '));
END;
$$ LANGUAGE plpgsql;

🧠 How It Works

string_to_array(input_string, ' '): splits the string into an array using the space character as a delimiter.
unnest(...): expands the array into a set of rows (one per word).
RETURNS TABLE(word TEXT): defines the return type to allow each word to be returned as a separate row.

📥 Example Usage

sql

SELECT * FROM split_string_by_space('This is a test string');

Result:

markdown

 word  
-------
 This
 is
 a
 test
 string
(5 rows)

You can also use it in subqueries or JOINs to analyze or filter text-based data dynamically.

🧩 Bonus Tip: Remove Extra Spaces

If your input might contain multiple spaces, you can normalize whitespace before splitting:

sql

SELECT unnest(string_to_array(regexp_replace('This   is   spaced out', '\s+', ' ', 'g'), ' '));

Or integrate it into your function like this:

sql

CREATE OR REPLACE FUNCTION split_string_clean(input_string TEXT)
RETURNS TABLE(word TEXT) AS $$
BEGIN
    RETURN QUERY
    SELECT unnest(string_to_array(regexp_replace(trim(input_string), '\s+', ' ', 'g'), ' '));
END;
$$ LANGUAGE plpgsql;

NUHMAN.com