Arrays
Arrays of all of supported types are mapped as Array<T>
where T
is the Rust mapping for the
SQL datatype. For example, a SQL BIGINT[]
is mapped to Array<i64>
, and a TEXT[]
is mapped to
Array<&str>
.
Working with arrays can be slightly cumbersome as Postgres allows NULL as an individual array element. As Rust has
no concept of "null", PL/Rust uses Option<T>
to represent the SQL idea of "I don't have a value".
CREATE FUNCTION sum_array(a INT[]) RETURNS BIGINT STRICT LANGUAGE plrust AS $$
let sum = a.into_iter().map(|i| i.unwrap_or_default() as i64).sum();
Ok(Some(sum))
$$;
# SELECT sum_array(ARRAY[1,2,3]::int[]);
sum_array
-----------
6
Iteration and Slices
Pl/Rust Arrays support slices over the backing Array data if it's an array of a primitive type (i8/16/32/64, f32/64). This can provide drastic performance improvements and even help lead to the Rust compiler autovectorizing code.
Let's examine this using arrays of random FLOAT4
values:
CREATE OR REPLACE FUNCTION random_floats(many int) RETURNS float4[] STRICT PARALLEL SAFE LANGUAGE sql AS $$
SELECT array_agg(random()) FROM generate_series(1, many)
$$;
CREATE TABLE floats AS SELECT random_floats(1000) f FROM generate_series(1, 100000);
Next, we'll sum the array using a function similar to the above:
CREATE OR REPLACE FUNCTION sum_array(a float4[]) RETURNS float4 STRICT LANGUAGE plrust AS $$
let sum = a.into_iter().map(|i| i.unwrap_or_default()).sum();
Ok(Some(sum))
$$;
# explain analyze select sum_array(f) from floats;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Seq Scan on floats (cost=0.00..23161.32 rows=86632 width=4) (actual time=0.064..981.105 rows=100000 loops=1)
Planning Time: 0.037 ms
Execution Time: 983.753 ms
Since in this case we know that the input array won't contain null values, we can optimize slightly. This does a fast "O(1)" check for NULLs when creating the iterator, rather than checking each individual element during iteration:
CREATE OR REPLACE FUNCTION sum_array_no_nulls(a float4[]) RETURNS float4 STRICT LANGUAGE plrust AS $$
let sum = a.iter_deny_null().sum();
Ok(Some(sum))
$$;
explain analyze select sum_array_no_nulls(f) from floats;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on floats (cost=0.00..26637.00 rows=100000 width=4) (actual time=0.055..672.365 rows=100000 loops=1)
Planning Time: 0.035 ms
Execution Time: 676.243 ms
Next, lets take a look at converting the input array into a slice before summing the values. This is particularly fast as it's a true "zero copy" operation:
CREATE OR REPLACE FUNCTION sum_array_slice(a float4[]) RETURNS float4 STRICT LANGUAGE plrust AS $$
let slice = a.as_slice()?; // use the `?` operator as not all `Array<T>`s can be converted into a slice
let sum = slice.iter().sum();
Ok(Some(sum))
$$;
explain analyze select sum_array_slice(f) from floats;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on floats (cost=0.00..26637.00 rows=100000 width=4) (actual time=0.055..478.635 rows=100000 loops=1)
Planning Time: 0.036 ms
Execution Time: 482.344 ms
Finally, lets do some magic to coax the Rust compiler into autovectorizing our "sum_array" function. The code for this comes from, interestingly enough, Stack Overflow: https://stackoverflow.com/questions/23100534/how-to-sum-the-values-in-an-array-slice-or-vec-in-rust/67191480#67191480
CREATE OR REPLACE FUNCTION sum_array_simd(a float4[]) RETURNS float4 STRICT LANGUAGE plrust AS $$
use std::convert::TryInto;
const LANES: usize = 16;
pub fn simd_sum(values: &[f32]) -> f32 {
let chunks = values.chunks_exact(LANES);
let remainder = chunks.remainder();
let sum = chunks.fold([0.0f32; LANES], |mut acc, chunk| {
let chunk: [f32; LANES] = chunk.try_into().unwrap();
for i in 0..LANES {
acc[i] += chunk[i];
}
acc
});
let remainder: f32 = remainder.iter().copied().sum();
let mut reduced = 0.0f32;
for i in 0..LANES {
reduced += sum[i];
}
reduced + remainder
}
let slice = a.as_slice()?;
let sum = simd_sum(slice);
Ok(Some(sum))
$$;
explain analyze select sum_array_simd(f) from floats;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on floats (cost=0.00..26637.00 rows=100000 width=4) (actual time=0.054..413.702 rows=100000 loops=1)
Planning Time: 0.038 ms
Execution Time: 417.237 ms