Article:
When working with text data in PostgreSQL, it's often useful to split a string into individual words—for example, separating user input or processing tags and labels. PostgreSQL provides several built-in functions like string_to_array()
and unnest()
that make this easy, and you can wrap them into a custom PL/pgSQL function for reuse.
🎯 Goal
We want to create a PostgreSQL function that:
- Accepts a single input string
- Splits it into words based on spaces
- Returns each word as a row in a result set
🛠️ Creating the Function
Here's the function definition using PL/pgSQL:
sql
CREATE OR REPLACE FUNCTION split_string_by_space(input_string TEXT)
RETURNS TABLE(word TEXT) AS $$
BEGIN
RETURN QUERY
SELECT unnest(string_to_array(input_string, ' '));
END;
$$ LANGUAGE plpgsql;
🧠 How It Works
string_to_array(input_string, ' ')
: splits the string into an array using the space character as a delimiter.unnest(...)
: expands the array into a set of rows (one per word).RETURNS TABLE(word TEXT)
: defines the return type to allow each word to be returned as a separate row.
📥 Example Usage
sql
SELECT * FROM split_string_by_space('This is a test string');
Result:
markdown
word
-------
This
is
a
test
string
(5 rows)
You can also use it in subqueries or JOINs to analyze or filter text-based data dynamically.
🧩 Bonus Tip: Remove Extra Spaces
If your input might contain multiple spaces, you can normalize whitespace before splitting:
sql
SELECT unnest(string_to_array(regexp_replace('This is spaced out', '\s+', ' ', 'g'), ' '));
Or integrate it into your function like this:
sql
CREATE OR REPLACE FUNCTION split_string_clean(input_string TEXT)
RETURNS TABLE(word TEXT) AS $$
BEGIN
RETURN QUERY
SELECT unnest(string_to_array(regexp_replace(trim(input_string), '\s+', ' ', 'g'), ' '));
END;
$$ LANGUAGE plpgsql;
✅ Summary
- PostgreSQL makes it easy to split strings using built-in functions.
- Wrapping the logic in a PL/pgSQL function adds reusability and clarity.
- You can enhance this further with cleaning logic or alternate delimiters.