One of my colleagues could not run my matlab script because it called graphconncomp
. This function computes the connected components of a graph represented by an adjacency matrix. Strangely it is not a built-in function of the standard MATLAB installation. Rather it's included in the bioinformatics toolbox, which my colleague's university did not purchase for him. This was frustrating so I wrote my own drop in replacement. Find this in my gptoolbox:
function [S,C] = conncomp(G)
% CONNCOMP Drop in replacement for graphconncomp.m from the bioinformatics
% toobox. G is an n by n adjacency matrix, then this identifies the S
% connected components C. This is also an order of magnitude faster.
%
% [S,C] = conncomp(G)
%
% Inputs:
% G n by n adjacency matrix
% Outputs:
% S scalar number of connected components
% C
[p,q,r] = dmperm(G+speye(size(G)));
S = numel(r)-1;
C = cumsum(full(sparse(1,r(1:end-1),1,1,size(G,1))));
C(p) = C;
end
As a pleasant surprise, my implementation is also an order of magnitude faster than graphconncomp
. On a 4,000,000 by 4,000,000 sparse adjacency matrix, graphconncomp
takes 12 seconds and my conncomp
using dmperm
takes 2 seconds.