Community Detection: The SLPA Algorithm

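Definition of a community: nodes within the same community are densely connected to one another, while connections between different communities are sparse.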

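Let G = G(V, E) be a graph. Community detection means identifying nc (>= 1) communities C = {C₁, C₂, ..., C_nc} in G such that the vertex sets of the communities form a cover of V, i.e. every vertex belongs to at least one community.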

If the vertex sets of every pair of communities have an empty intersection, C is called a set of disjoint communities; otherwise it is called a set of overlapping communities.
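The two definitions above can be checked mechanically. The following is a minimal sketch, assuming vertices are integer ids and each community is a set of ids; the class and method names (CoverCheck, isCover, isDisjoint) are made up for illustration and do not come from any library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CoverCheck {

    /** True if the union of all communities equals the vertex set V. */
    static boolean isCover(Set<Integer> vertices, List<Set<Integer>> communities) {
        Set<Integer> union = new HashSet<>();
        for (Set<Integer> community : communities) {
            union.addAll(community);
        }
        return union.equals(vertices);
    }

    /** True if no vertex belongs to more than one community. */
    static boolean isDisjoint(List<Set<Integer>> communities) {
        Set<Integer> seen = new HashSet<>();
        for (Set<Integer> community : communities) {
            for (Integer v : community) {
                if (!seen.add(v)) {   // add() returns false if v was already seen
                    return false;
                }
            }
        }
        return true;
    }

    static Set<Integer> setOf(Integer... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    public static void main(String[] args) {
        Set<Integer> v = setOf(1, 2, 3, 4, 5);
        // Vertex 3 belongs to both communities, so this cover is overlapping.
        List<Set<Integer>> overlapping = Arrays.asList(setOf(1, 2, 3), setOf(3, 4, 5));
        // No shared vertex, so this cover is disjoint.
        List<Set<Integer>> disjoint = Arrays.asList(setOf(1, 2, 3), setOf(4, 5));

        System.out.println(isCover(v, overlapping) + " " + isDisjoint(overlapping));
        System.out.println(isCover(v, disjoint) + " " + isDisjoint(disjoint));
    }
}
```

Both example covers include every vertex of V; only the first one shares a vertex (3) between two communities.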

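SLPA (Speaker-listener Label Propagation Algorithm) is a community detection algorithm that extends LPA (the Label Propagation Algorithm). Where LPA keeps a single label per node, SLPA gives every node a memory of the labels it has received, which allows it to detect overlapping communities.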

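The algorithm works as follows:

Input parameters: the number of iterations T, and a threshold r on the frequency a label must reach in a node's memory for the node to be assigned to that community.

Output: the community membership of every node.

(1) First, initialize the memory of every node with a unique label.

(2) Then repeat the following steps until the maximum number of iterations T is reached:

    a. Select a node as the listener;

    b. Each neighbor of the selected node acts as a speaker: it randomly picks a label from its own memory with probability proportional to that label's frequency there, and sends the chosen label (speakerVote) to the listener;

    c. The listener adds the most popular of the labels it received to its memory.

(3) Finally, post-processing uses the labels in each node's memory and the threshold r to output the communities.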
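Steps (1) to (3) can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation shown later in this post: the graph is an undirected adjacency map, every node acts as listener once per iteration in random order, ties between equally popular labels are broken at random, and post-processing simply keeps each label whose relative frequency in a node's memory is at least r. The names SlpaSketch, speak and detect are hypothetical.

```java
import java.util.*;

public class SlpaSketch {

    /** Speaker rule: pick a label with probability proportional to its count in memory. */
    static int speak(Map<Integer, Integer> memory, Random rng) {
        int total = 0;
        for (int count : memory.values()) total += count;
        int r = rng.nextInt(total);                 // uniform in [0, total)
        for (Map.Entry<Integer, Integer> e : memory.entrySet()) {
            r -= e.getValue();
            if (r < 0) return e.getKey();
        }
        throw new IllegalStateException("unreachable for a non-empty memory");
    }

    /** Returns label -> set of nodes whose memory holds that label with frequency >= r. */
    static Map<Integer, Set<Integer>> detect(Map<Integer, List<Integer>> adj,
                                             int T, double r, long seed) {
        Random rng = new Random(seed);

        // (1) Each node's memory starts with its own id as a unique label.
        Map<Integer, Map<Integer, Integer>> memory = new HashMap<>();
        for (Integer v : adj.keySet()) {
            Map<Integer, Integer> m = new HashMap<>();
            m.put(v, 1);
            memory.put(v, m);
        }

        // (2) T rounds; in each round every node is the listener once (assumed schedule).
        List<Integer> order = new ArrayList<>(adj.keySet());
        for (int t = 0; t < T; t++) {
            Collections.shuffle(order, rng);
            for (Integer listener : order) {
                Map<Integer, Integer> votes = new HashMap<>();
                for (Integer speaker : adj.get(listener)) {
                    votes.merge(speak(memory.get(speaker), rng), 1, Integer::sum);
                }
                if (votes.isEmpty()) continue;      // isolated node: nothing received
                int best = Collections.max(votes.values());
                List<Integer> top = new ArrayList<>();
                for (Map.Entry<Integer, Integer> e : votes.entrySet()) {
                    if (e.getValue() == best) top.add(e.getKey());
                }
                int chosen = top.get(rng.nextInt(top.size()));  // random tie-break
                memory.get(listener).merge(chosen, 1, Integer::sum);
            }
        }

        // (3) Post-processing: keep labels whose frequency in the memory is at least r.
        Map<Integer, Set<Integer>> communities = new TreeMap<>();
        for (Map.Entry<Integer, Map<Integer, Integer>> node : memory.entrySet()) {
            int total = 0;
            for (int count : node.getValue().values()) total += count;
            for (Map.Entry<Integer, Integer> e : node.getValue().entrySet()) {
                if ((double) e.getValue() / total >= r) {
                    communities.computeIfAbsent(e.getKey(), k -> new TreeSet<>())
                               .add(node.getKey());
                }
            }
        }
        return communities;
    }

    public static void main(String[] args) {
        // Two triangles joined by the single edge 3-4.
        Map<Integer, List<Integer>> adj = new HashMap<>();
        adj.put(1, Arrays.asList(2, 3));
        adj.put(2, Arrays.asList(1, 3));
        adj.put(3, Arrays.asList(1, 2, 4));
        adj.put(4, Arrays.asList(3, 5, 6));
        adj.put(5, Arrays.asList(4, 6));
        adj.put(6, Arrays.asList(4, 5));
        System.out.println(detect(adj, 50, 0.3, 42L));
    }
}
```

Because a node may keep several labels above the threshold, it can appear in more than one output community, which is how overlap arises. With a large r a node may keep no label at all; a fuller implementation would handle that case and would also remove communities nested inside others, both of which this sketch omits.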

public int speakerVote() {
        // Run through each element in the map to create a cumulative distribution.
        // communityDistribution maps a label (community id) to its count in this node's memory.
        Set<Integer> communityIds = communityDistribution.keySet();
        ArrayList<Integer> communities = new ArrayList<Integer>();
        ArrayList<Integer> cumulativeCounts = new ArrayList<Integer>();

        int sum = 0;
        for (Integer comm : communityIds) {
            sum += communityDistribution.get(comm);
            communities.add(comm);
            cumulativeCounts.add(sum);
        }

        // Generate a random integer in the range [0, sum)
        // (assumes getRandomInt(n) returns a uniform value in [0, n))
        int rand = RandomNumGenerator.getRandomInt(sum);

        // Find the index of the first value greater than rand in cumulativeCounts
        int i = 0;
        for (i = 0; i < cumulativeCounts.size(); i++) {
            if (cumulativeCounts.get(i) > rand)
                break;
        }

        // Return the corresponding community
        return communities.get(i);
    }


public void updateLabels(Integer userId) {
        // All edges pointing at this vertex; their source vertices act as speakers
        Set<DefaultWeightedEdge> incomingEdges = userNodegraph.getGraph().incomingEdgesOf(userId);
        // Number of votes each community received from the speakers
        Map<Integer, Integer> incomingVotes = new HashMap<Integer, Integer>();

        // For each vertex V with an incoming edge to the current node
        for (DefaultWeightedEdge edge : incomingEdges) {
            int speakerId = userNodegraph.getGraph().getEdgeSource(edge);
            UserNode speakerNode = userNodegraph.getNodeMap().get(speakerId);

            int votedCommunity = speakerNode.speakerVote();
            int votedCommunitycount = 1;
            if (incomingVotes.containsKey(votedCommunity)) {
                votedCommunitycount += incomingVotes.get(votedCommunity);
            }
            incomingVotes.put(votedCommunity, votedCommunitycount);
        }

        // A node without incoming edges receives no votes; leave its memory unchanged
        // instead of recording the placeholder label -1
        if (incomingVotes.isEmpty()) {
            return;
        }

        // Find the most popular vote (on a tie, the first entry encountered wins)
        Iterator<Entry<Integer, Integer>> it = incomingVotes.entrySet().iterator();
        int popularCommunity = -1;
        int popularCommunityCount = 0;
        while (it.hasNext()) {
            Entry<Integer, Integer> entry = it.next();
            if (entry.getValue() > popularCommunityCount) {
                popularCommunity = entry.getKey();
                popularCommunityCount = entry.getValue();
            }
        }
        // Update community distribution of the current node by 1
        UserNode currentNode = userNodegraph.getNodeMap().get(userId);
        currentNode.updateCommunityDistribution(popularCommunity, 1);
    }