Scalable Community Detection on the Cloud (SCDC)

In recent years, the analysis of complex networks has become a hot topic in both research and industry. Social, biological, information and collaboration networks are typical targets for such analysis, to name just a few. Among the tools used to analyze these networks, community detection is one of the most relevant. Communities, also known as clusters, are usually defined as sets of vertices with a high density of connections among themselves and very few connections to the rest of the graph. Community detection provides valuable information about the structural properties of the network, the interactions among its agents, and the roles those agents play within it.
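The informal definition of a community above can be made concrete with a small sketch. The Python snippet below is only an illustration of that definition, not the SCD algorithm; the function name `community_stats` and the toy graph are assumptions introduced here. It counts the edges inside a candidate vertex set, the edges that leave it, and the internal density.

```python
def community_stats(edges, community):
    """Return (internal_edges, boundary_edges, internal_density) for a vertex set.

    `edges` is an iterable of undirected (u, v) pairs without duplicates.
    Hypothetical helper for illustration only; this is not part of SCD.
    """
    members = set(community)
    # Edges with both endpoints inside the candidate community.
    internal = sum(1 for u, v in edges if u in members and v in members)
    # Edges with exactly one endpoint inside the candidate community.
    boundary = sum(1 for u, v in edges if (u in members) != (v in members))
    # Number of vertex pairs that could be connected inside the community.
    possible = len(members) * (len(members) - 1) // 2
    density = internal / possible if possible else 0.0
    return internal, boundary, density


# Toy graph (assumed): two triangles joined by a single bridge edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

if __name__ == "__main__":
    print(community_stats(EDGES, {0, 1, 2}))  # a well-formed community
    print(community_stats(EDGES, {0, 1, 3}))  # a poor candidate
```

In this toy graph the set {0, 1, 2} has every possible internal edge and a single edge leaving it, which matches the definition, whereas {0, 1, 3} has more edges leaving it than inside it.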

graph.png

Community detection, offered as a cloud service or as built-in functionality of graph databases, is becoming important for companies that want to discover the bonds between the entities they represent as a graph. Fast algorithms matter for this purpose, provided that they guarantee a minimum quality that can be demonstrated not only empirically but also theoretically.

In this TTP, we have focused on productizing the Scalable Community Detection (SCD) [1] algorithm, which has been shown to be the fastest and most accurate community detection technique for social networks. With the aim of offering this technology to the public and to our customers, we have created two implementations of SCD, as shown in the following figure. The first is built into the core of Sparksee, Sparsity’s graph database (see www.sparsity-technologies.com). The second is an open web service (see http://www.communitydetectionhub.com/cdh/) that can be used interactively through the web page or via a REST API.

Both implementations have already been published: the first in the new Sparksee v5.2 and the second as the web service. Together they make SCD available both to technologists, as an embeddable technology, and to the wider public, as a web service in the cloud.

scd.png

The technologies implemented have been widely announced through the following means:

The business model for the two implementations of SCD is:

  • The implementation in Sparksee is provided as a new functionality at no extra cost to our customers.
  • The cloud implementation is provided as a freemium service. Graphs of 2M nodes or fewer are analysed for free; larger graphs are charged on a linear cost model with a ceiling cost.
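As a rough illustration of the freemium rule above, the following Python sketch computes a charge from the node count. Only the 2M-node free tier comes from the text. The rate `price_per_million`, the cap `ceiling`, and the choice to apply the linear rate to the total node count (rather than only to the nodes above the free tier) are hypothetical placeholders, not published prices.

```python
FREE_NODE_LIMIT = 2_000_000  # from the text: graphs of 2M nodes or fewer are free


def analysis_cost(num_nodes, price_per_million=1.0, ceiling=50.0):
    """Return the charge for analysing a graph with `num_nodes` nodes.

    `price_per_million` and `ceiling` are assumed values for illustration;
    the real rates are not stated in this document.
    """
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    if num_nodes <= FREE_NODE_LIMIT:
        return 0.0  # free tier
    # Linear cost model: the charge grows proportionally with graph size...
    linear = price_per_million * num_nodes / 1_000_000
    # ...but never exceeds the ceiling cost.
    return min(linear, ceiling)


if __name__ == "__main__":
    for n in (500_000, 2_000_000, 10_000_000, 1_000_000_000):
        print(n, analysis_cost(n))
```

With these assumed values, a graph at or below the free limit costs nothing, a mid-sized graph is charged proportionally to its size, and a very large graph is charged the ceiling.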
Downloads: Poster (PDF), Abstract (PDF)