Community Detection¶

Find clusters and communities within graphs.

Louvain Algorithm¶

Fast modularity-based community detection.

import grafeo

db = grafeo.GrafeoDB()
algs = db.algorithms()

communities = algs.louvain()
for community_id, members in communities.items():
    print(f"Community {community_id}: {len(members)} members")

Label Propagation¶

Semi-supervised community detection.

algs = db.algorithms()
communities = algs.label_propagation()

Connected Components¶

Find disconnected subgraphs.

algs = db.algorithms()
components = algs.connected_components()

print(f"Found {len(components)} components")
for i, comp in enumerate(components):
    print(f"Component {i}: {len(comp)} nodes")

Strongly Connected Components¶

For directed graphs.

algs = db.algorithms()
sccs = algs.strongly_connected_components()

Weakly Connected Components¶

For directed graphs, ignoring edge direction.

algs = db.algorithms()
wccs = algs.weakly_connected_components()

Triangle Count¶

Count triangles for clustering analysis.

algs = db.algorithms()
triangles = algs.triangles()
print(f"Total triangles: {triangles}")

Stochastic Block Partition¶

Bayesian community detection using the degree-corrected stochastic block model. Minimizes the description length (MDL) of the graph under the generative model, making it more principled than modularity-based methods for graphs with heterogeneous degree distributions.

algs = db.algorithms()

# Auto-detect optimal number of blocks
result = algs.stochastic_block_partition()
print(f"Found {result['num_blocks']} blocks (DL: {result['description_length']:.2f})")

# Target a specific number of blocks
result = algs.stochastic_block_partition(num_blocks=4)

# Incremental update after adding edges (warm start from prior partition)
result2 = algs.stochastic_block_partition(prior_partition=result['partition'])

Part of the GraphChallenge streaming benchmark suite.

Partition Quality Metrics¶

Compare two community assignments (e.g., predicted vs ground truth):

from grafeo.algorithms import metrics

ri  = metrics.rand_index(partition_a, partition_b)
ari = metrics.adjusted_rand_index(partition_a, partition_b)
nmi = metrics.normalized_mutual_information(partition_a, partition_b)

prec = metrics.pairwise_precision(predicted, truth)
rec  = metrics.pairwise_recall(predicted, truth)

Metric	Range	1.0 means
Rand Index	[0, 1]	Identical partitions
Adjusted Rand Index	[-1, 1]	Identical (0 = random)
Normalized Mutual Information	[0, 1]	Identical partitions
Pairwise Precision	[0, 1]	Every predicted co-cluster is true
Pairwise Recall	[0, 1]	Every true co-cluster is predicted

Use Cases¶

Algorithm	Best For
Louvain	Large graphs, quality clusters
Label Propagation	Fast, scalable
Stochastic Block Partition	Heterogeneous graphs, Bayesian inference
Connected Components	Graph structure analysis
Triangle Count	Clustering coefficient