I'm sure I've done this before and pretty sure I've also posted it here before, but I couldn't find it.
Suppose you have to 3d-arrays:
A an M by S by K array formed for example by A = cat(3,A1,A2,A3, ... ,AK)
B an S by N by K array formed for example by B = cat(3,B1,B2,B3, ... ,BK)
and you'd like to compute a new 3d array C such that
C an M by N by K array as if formed by C = cat(3, A1 * B1, A2 * B2, ..., AK * BK)
where X*Y is the usual 2d matrix multiply. In my case, K >> M,S,N.
In matlab, a first attempt might be to write a single for loop over the last dimension of size K:
C = zeros(m,n,k);
for k = 1:K
C(:,:,k) = A(:,:,k) * B(:,:,k);
end
Matlab's notorious for loops have gotten better in the last couple years, but this is still slow.
Better is to unroll the 2D loop and take advantage of vectorized, elementwise vector-vector multiplication:
C = zeros(m,n,k);
for i = 1:M
for j = 1:S
for k = 1:N
C(i,k,:) = C(i,k,:) + A(i,j,:).*B(j,k,:);
end
end
end
Slightly better still is to use bsxfun
to compute vectorize outer-products in a single for loop
C = zeros(m,n,k);
for j = 1:size(A,2)
C = C + bsxfun(@times,A(:,j,:),B(j,:,:));
end
This solution works especially well if S is small compared to the other dimensions.
For (M=S=N=3, K=10000000), these take 60 secs, 13 secs, and 9 secs respectively.