Splitting owned array into owned halves

Time:01-05

I would like to divide a single owned array into two owned halves: two separate arrays, not slices of the original array. The respective sizes are compile-time constants. Is there a way to do that without copying or cloning the elements?

let array: [u8; 4] = [0, 1, 2, 3];

let chunk_0: [u8; 2] = ???;
let chunk_1: [u8; 2] = ???;

assert_eq!(
  [0, 1],
  chunk_0
);
assert_eq!(
  [2, 3],
  chunk_1
);

Since it would amount to merely moving ownership of the elements, I have a hunch there should be a zero-cost abstraction for this. I wonder if I could do something like this with some clever use of transmute and forget. But there are a lot of scary warnings in the docs for those functions.

My main motivation is to operate on large arrays in memory without as many mem copies. For example:

let raw = [0u8; 1024 * 1024];

let a = u128::from_be_bytes(???); // Take the first 16 bytes
let b = u64::from_le_bytes(???); // Take the next 8 bytes
let c = ...

The only way I know to accomplish patterns like the above is with lots of mem copying which is redundant.

CodePudding user response:

You can use std::mem::transmute (warning: unsafe!):

fn main() {
    let array: [u8; 4] = [0, 1, 2, 3];

    let [chunk_0, chunk_1]: [[u8; 2]; 2] =
        unsafe { std::mem::transmute::<[u8; 4], [[u8; 2]; 2]>(array) };

    assert_eq!([0, 1], chunk_0);
    assert_eq!([2, 3], chunk_1);
}
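
For small arrays, the same split can also be sketched in safe code by destructuring the array with a pattern. This is a different technique from the transmute above, not a drop-in equivalent: it moves the elements one at a time, and every element has to be written out in the pattern, so it only suits small fixed sizes. It does work for element types that are not Copy. The helper name split_halves is hypothetical.

```rust
/// Hypothetical helper: splits an owned 4-element array into two owned
/// halves by moving each element out through an irrefutable array pattern.
/// No unsafe code is needed, and `T` does not have to be `Copy` or `Clone`.
fn split_halves<T>(array: [T; 4]) -> ([T; 2], [T; 2]) {
    // Destructuring takes ownership of every element individually.
    let [a, b, c, d] = array;
    // Rebuild the two halves from the moved elements.
    ([a, b], [c, d])
}
```

Calling split_halves([0u8, 1, 2, 3]) gives the two halves as separate owned arrays, and the original array is consumed by the call.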


CodePudding user response:

use std::convert::TryInto;

let raw = [0u8; 1024 * 1024];
    
let a = u128::from_be_bytes(raw[..16].try_into().unwrap()); // Take the first 16 bytes
let b = u64::from_le_bytes(raw[16..24].try_into().unwrap()); // Take the next 8 bytes

In practice, I've found the compiler is pretty smart about optimizing this. With optimizations, it will do the above in a single copy (directly into the register that holds a or b, respectively). As an example, according to godbolt, this:

use std::convert::TryInto;

pub fn cvt(bytes: [u8; 24]) -> (u128, u64) {
    let a = u128::from_be_bytes(bytes[..16].try_into().unwrap()); // Take the first 16 bytes
    let b = u64::from_le_bytes(bytes[16..24].try_into().unwrap()); // Take the next 8 bytes
    (a, b)
}

with -C opt-level=3 compiles into:

example::cvt:
        mov     rax, qword ptr [rdi + 8]
        bswap   rax
        mov     rdx, qword ptr [rdi]
        bswap   rdx
        mov     rcx, qword ptr [rdi + 16]
        ret

The compiler has optimized away the extra copies, the call to try_into, the panic path, et cetera.
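
When many consecutive fields have to be read this way, the slice-and-try_into pattern can be wrapped in a small helper so the offsets do not need to be written by hand. The following is only a sketch: the function name take and its cursor-style signature are assumptions made for illustration, not part of the standard library. It copies N bytes out of the front of a slice and advances the slice past them.

```rust
use std::convert::TryInto;

/// Hypothetical helper: returns the first `N` bytes of `input` as an owned
/// array and advances `input` past them. Returns `None`, leaving `input`
/// untouched, if fewer than `N` bytes remain.
fn take<const N: usize>(input: &mut &[u8]) -> Option<[u8; N]> {
    if input.len() < N {
        return None;
    }
    let (head, tail) = input.split_at(N);
    // Move the cursor forward so the next call reads the following field.
    *input = tail;
    // `head` has exactly `N` bytes, so this conversion succeeds.
    head.try_into().ok()
}
```

With a cursor such as let mut cursor: &[u8] = &raw;, the two reads become u128::from_be_bytes(take(&mut cursor)?) and u64::from_le_bytes(take(&mut cursor)?) inside a function that returns Option, with N inferred from the target integer type.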

CodePudding user response:

The bytemuck library provides a safe wrapper for reinterpreting any data type that is “plain old data” (more precisely: all possible byte sequences of the right size are valid values), as long as the input and output are the same size (or the input is a slice whose byte length is divisible by the output type's size). This is equivalent to the transmute solution, but without needing to write any new unsafe code.

let array: [u8; 4] = [0, 1, 2, 3];

let [chunk_0, chunk_1]: [[u8; 2]; 2] = bytemuck::cast(array);

If you'd like to avoid using additional libraries, I recommend the try_into() approach that's already been posted.

CodePudding user response:

You can get a reference to an array using try_into, without a copy/clone.

let chunk_0: &[u8; 2] = (&array[0..2]).try_into().unwrap();
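
To get both halves as array references at once, the slice can be split first and each part converted separately. This is a minimal sketch assuming the 4-byte array from the question; the function name halves is hypothetical. Nothing is copied: both results borrow from the original array.

```rust
use std::convert::TryInto;

/// Hypothetical helper: borrows a 4-byte array as two 2-byte array
/// references. The returned references point into `array` itself.
fn halves(array: &[u8; 4]) -> (&[u8; 2], &[u8; 2]) {
    let (lo, hi) = array.split_at(2);
    // Both slices have length 2, so these conversions cannot fail.
    (lo.try_into().unwrap(), hi.try_into().unwrap())
}
```

The trade-off compared with the owned versions is that the original array must stay alive for as long as the two references are in use.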