\(\)

Good news! In an earlier post, I talked about my flexible implementation of the Louvain algorithm in Python. Now, it is also available in a C++ implementation, making it run much (really much) faster. Whereas the previous version already began to stagger for a thousand nodes, this implementation can easily be applied to graphs of millions of nodes (although you still might need to be patient when your graph is really that big). Even better, you can now simply install it using pip: `sudo pip install louvain`

. For Windows users, you can refer to the binary installer at the PyPi repository. The source code is also available from GitHub.

Using it is really straightforward. To start, make sure to import the packages:

import louvain import igraph as ig

We’ll create a random graph for testing purposes:

G = ig.Graph.Erdos_Renyi(100, 0.1);

For simply finding a partition use:

part = louvain.find_partition(G, method='Modularity');

Notice that `part`

now contains an additional variable, `part.quality`

which stores the quality of the partition as calculated by the used method. You can always get the quality of the partition using another method by calling

part.significance = louvain.quality(G, partition, method='Significance');

You can also find partitions for multiplex graphs. For each layer you then specify the objective function, and the overall objective function is simply the sum over all layers, weighted by some weight. If we denote by \(q_k\) the quality of layer \(k\) and the weight by \(w_k\), the overall quality is then \(q = \sum_k w_k q_k\). This can also be useful in case you have negative links . In principle, this could also be used to detect temporal communities in a dynamic setting .

For example, assuming you have a graph with positive weights `G_positive`

and a graph with negative weights `G_negative`

, and you want to use Modularity for finding a partition, you can use

membership, quality = louvain.find_partition_multiplex([ louvain.Layer(graph=G_positive, method='Modularity', layer_weight=1.0), louvain.Layer(graph=G_negative, method='Modularity', layer_weight=-1.0)])

Notice the negative layer weight is `-1.0`

for the negative graph, since we want those edges to fall between communities rather than within. One particular problem when using negative links, is that the optimal community is no longer guaranteed to be connected (it may be a multipartite partition). You may therefore need the option `consider_comms=ALL_COMMS`

to improve the quality of the partition. Notice that this runs much slower than only considering neighbouring communities (which is the default).

Various methods, such as the Potts model from , or CPM support a (linear) resolution parameter, which can be effectively bisected . You can do this by calling:

res_parts = louvain.bisect(G, method='CPM', resolution_range=[0,1]);

Notice this may take some time to run, as it effectively calls louvain.find_partition for various resolution parameters (depending on the

settings possibly hundreds of times).

Then `res_parts`

is a dictionary containing as keys the resolution, and as values a `NamedTuple`

with variables `partition`

and `bisect_value`

, which contains the partition and the value at which the resolution was bisected (the value of the `bisect_func`

of the `bisect`

function). You could for example plot the bisection value of all the found partitions by using:

import pandas as pd import matplotlib.pyplot as plt res_df = pd.DataFrame({ 'resolution': res_parts.keys(), 'bisect_value': [bisect.bisect_value for bisect in res_parts.values()]}); plt.step(res_df['resolution'], res_df['bisect_value']); plt.xscale('log');

The result should look something like this for a random graph.

*Science (New York, N.Y.)*328 (5980): 876–78. https://doi.org/10.1126/science.1184819. Cite

*Physical Review E*74 (1): 016110+. https://doi.org/10.1103/PhysRevE.74.016110. Cite

*Physical Review E*80 (3): 036115. https://doi.org/10.1103/PhysRevE.80.036115. Cite

*Scientific Reports*3: 2930. https://doi.org/10.1038/srep02930. Cite

*Physical Review E*84 (1): 016114. https://doi.org/10.1103/PhysRevE.84.016114. Cite

Thanks! I have two questions: 1) can you give a hint about you build/extend the set of methods? 2) for the Newman modularity metric, it appears to be a scaled version that is returned?

You’re welcome!

1) it should be easy to implement your own extensions. What method would you like to see implemented?

2) yes, indeed, the non-normalised version is returned, perhaps that should be made clearer in the documentation. That is, the value is not divided by the number of edges \(m\) (or \(2m\)).

Super. My intention was to experiment with some non-standard methods (for software modularization), so any quick pointers to how to build locally to experiment with such methods would be great.

It seems I didn’t respond as a reply to you below, so I’m not sure if you got my answer, but here’s a ping to let you know, just in case.

Easiest would be to fork the project on GitHub, and use that Git repository. That way, we could also easily reintegrate any contribution you make. You need to derive from the base class

`MutableVertexPartition`

and implement the methods`quality`

and`diff_move`

. They provide respectively the quality of the partition \(\mathcal{H}\) and the difference in the quality when moving a node (which is a community in the aggregated network) \(\Delta \mathcal{H}\). You can take a look at the implementation of`ModularityVertexPartition`

for an example.The calculation of \(\Delta \mathcal{H}\) is the only real challenge: make sure you calculate this correctly. When putting

`#define DEBUG`

in`GraphHelper.h`

(which is included everywhere) both the difference using`diff_move`

and the difference using`quality`

before and after actually moving a node is calculated (in`Optimiser.cpp:400-414`

). You can use that to verify whether at least both expressions are consistent. Whether they are consistently wrong or consistently correct is another matter of course, but they are at least consistent. 😉Finally, compiling and installing is simple. I recommend to use

`python setup.py install --record files.txt`

so that you can use`cat files.txt | xargs rm`

to remove the installed files easily again (possibly both need to be run as root). This both compiles and installs the python package in the appropriate location. You probably want to uninstall the PyPi package first in order to prevent problems. This is assuming you run a Unix variant. If you are using Windows, I suggest you switch to a decent Unix variant 😉Good luck! Let me know how it works out! And if you have any other questions, don’t hesitate to reply here, or just mail me (or open an issue on GitHub).