Skip to content

Community Detection

Find clusters and communities within graphs.

Louvain Algorithm

Fast modularity-based community detection.

import grafeo

db = grafeo.GrafeoDB()
algs = db.algorithms()

communities = algs.louvain()
for community_id, members in communities.items():
    print(f"Community {community_id}: {len(members)} members")

Label Propagation

Semi-supervised community detection.

algs = db.algorithms()
communities = algs.label_propagation()

Connected Components

Find disconnected subgraphs.

algs = db.algorithms()
components = algs.connected_components()

print(f"Found {len(components)} components")
for i, comp in enumerate(components):
    print(f"Component {i}: {len(comp)} nodes")

Strongly Connected Components

For directed graphs.

algs = db.algorithms()
sccs = algs.strongly_connected_components()

Weakly Connected Components

For directed graphs, ignoring edge direction.

algs = db.algorithms()
wccs = algs.weakly_connected_components()

Triangle Count

Count triangles for clustering analysis.

algs = db.algorithms()
triangles = algs.triangles()
print(f"Total triangles: {triangles}")

Stochastic Block Partition

Bayesian community detection using the degree-corrected stochastic block model. Minimizes the description length (MDL) of the graph under the generative model, making it more principled than modularity-based methods for graphs with heterogeneous degree distributions.

algs = db.algorithms()

# Auto-detect optimal number of blocks
result = algs.stochastic_block_partition()
print(f"Found {result['num_blocks']} blocks (DL: {result['description_length']:.2f})")

# Target a specific number of blocks
result = algs.stochastic_block_partition(num_blocks=4)

# Incremental update after adding edges (warm start from prior partition)
result2 = algs.stochastic_block_partition(prior_partition=result['partition'])

Part of the GraphChallenge streaming benchmark suite.

Partition Quality Metrics

Compare two community assignments (e.g., predicted vs ground truth):

from grafeo.algorithms import metrics

ri  = metrics.rand_index(partition_a, partition_b)
ari = metrics.adjusted_rand_index(partition_a, partition_b)
nmi = metrics.normalized_mutual_information(partition_a, partition_b)

prec = metrics.pairwise_precision(predicted, truth)
rec  = metrics.pairwise_recall(predicted, truth)
Metric Range 1.0 means
Rand Index [0, 1] Identical partitions
Adjusted Rand Index [-1, 1] Identical (0 = random)
Normalized Mutual Information [0, 1] Identical partitions
Pairwise Precision [0, 1] Every predicted co-cluster is true
Pairwise Recall [0, 1] Every true co-cluster is predicted

Use Cases

Algorithm Best For
Louvain Large graphs, quality clusters
Label Propagation Fast, scalable
Stochastic Block Partition Heterogeneous graphs, Bayesian inference
Connected Components Graph structure analysis
Triangle Count Clustering coefficient