neo4j cypher
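One of the most common uses of Neo4j is building real-time recommendation engines, and a common theme is that they draw on lots of different bits of data to come up with an interesting recommendation.
For example, in this video Amanda shows how dating websites build real-time recommendation engines by starting with social connections and then introducing passions, location and a few other things.
GraphAware has a neat framework that helps you build your own recommendation engine in Java, and I was curious what a Cypher version would look like.
This is the sample graph: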
CREATE
  (m:Person:Male {name:'Michal', age:30}),
  (d:Person:Female {name:'Daniela', age:20}),
  (v:Person:Male {name:'Vince', age:40}),
  (a:Person:Male {name:'Adam', age:30}),
  (l:Person:Female {name:'Luanne', age:25}),
  (c:Person:Male {name:'Christophe', age:60}),
  (lon:City {name:'London'}),
  (mum:City {name:'Mumbai'}),
  (m)-[:FRIEND_OF]->(d),
  (m)-[:FRIEND_OF]->(l),
  (m)-[:FRIEND_OF]->(a),
  (m)-[:FRIEND_OF]->(v),
  (d)-[:FRIEND_OF]->(v),
  (c)-[:FRIEND_OF]->(v),
  (d)-[:LIVES_IN]->(lon),
  (v)-[:LIVES_IN]->(lon),
  (m)-[:LIVES_IN]->(lon),
  (l)-[:LIVES_IN]->(mum);
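We want to recommend some potential friends to 'Adam', so the first layer of our query is to find his friends of friends, as there are bound to be some potential friends amongst them: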
MATCH (me:Person {name: "Adam"})
MATCH (me)-[:FRIEND_OF]-()-[:FRIEND_OF]-(potentialFriend)
RETURN me, potentialFriend, COUNT(*) AS friendsInCommon
==> +--------------------------------------------------------------------------------------+
==> | me | potentialFriend | friendsInCommon |
==> +--------------------------------------------------------------------------------------+
==> | Node[1007]{name:"Adam",age:30} | Node[1006]{name:"Vince",age:40} | 1 |
==> | Node[1007]{name:"Adam",age:30} | Node[1005]{name:"Daniela",age:20} | 1 |
==> | Node[1007]{name:"Adam",age:30} | Node[1008]{name:"Luanne",age:25} | 1 |
==> +--------------------------------------------------------------------------------------+
==> 3 rows
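This query gives us back a list of potential friends and how many friends we have in common with each of them.
Now that we've got some potential friends, let's start building a ranking for each of them. One indicator that could weigh in favour of a potential friend is living in the same location as us, so let's add that to our query: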
MATCH (me:Person {name: "Adam"})
MATCH (me)-[:FRIEND_OF]-()-[:FRIEND_OF]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
RETURN me,
       potentialFriend,
       SIZE((potentialFriend)-[:LIVES_IN]->()<-[:LIVES_IN]-(me)) AS sameLocation
==> +-----------------------------------------------------------------------------------+
==> | me | potentialFriend | sameLocation |
==> +-----------------------------------------------------------------------------------+
==> | Node[1007]{name:"Adam",age:30} | Node[1006]{name:"Vince",age:40} | 0 |
==> | Node[1007]{name:"Adam",age:30} | Node[1005]{name:"Daniela",age:20} | 0 |
==> | Node[1007]{name:"Adam",age:30} | Node[1008]{name:"Luanne",age:25} | 0 |
==> +-----------------------------------------------------------------------------------+
==> 3 rows
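Every row shows 0 here because the sample graph doesn't give Adam a LIVES_IN relationship. Next we'll check whether Adam's potential friends have the same gender as him by comparing the labels on each node. We've got 'Male' and 'Female' labels which indicate gender.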
MATCH (me:Person {name: "Adam"})
MATCH (me)-[:FRIEND_OF]-()-[:FRIEND_OF]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
RETURN me,
       potentialFriend,
       SIZE((potentialFriend)-[:LIVES_IN]->()<-[:LIVES_IN]-(me)) AS sameLocation,
       LABELS(me) = LABELS(potentialFriend) AS gender
==> +--------------------------------------------------------------------------------------------+
==> | me | potentialFriend | sameLocation | gender |
==> +--------------------------------------------------------------------------------------------+
==> | Node[1007]{name:"Adam",age:30} | Node[1006]{name:"Vince",age:40} | 0 | true |
==> | Node[1007]{name:"Adam",age:30} | Node[1005]{name:"Daniela",age:20} | 0 | false |
==> | Node[1007]{name:"Adam",age:30} | Node[1008]{name:"Luanne",age:25} | 0 | false |
==> +--------------------------------------------------------------------------------------------+
==> 3 rows
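One thing worth keeping in mind is that LABELS(me) = LABELS(potentialFriend) compares the complete label lists, so it only works as a gender check while gender is the only label that varies between people. The Java sketch below illustrates the difference on plain lists; the class, the method names and the extra 'Admin' label are all made up for illustration and are not part of the sample graph or of any Neo4j API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Neo4j or GraphAware.
final class GenderLabels {

    // The two labels the sample graph uses to indicate gender.
    static final Set<String> GENDER_LABELS = new HashSet<>(Arrays.asList("Male", "Female"));

    // Mirrors the list comparison in the query: any extra label, or a
    // different order, makes the two lists unequal.
    static boolean sameLabelList(List<String> a, List<String> b) {
        return a.equals(b);
    }

    // Alternative: look only at the gender labels, ignoring order and unrelated labels.
    static boolean sameGender(List<String> a, List<String> b) {
        Set<String> genderA = new HashSet<>(a);
        genderA.retainAll(GENDER_LABELS);
        Set<String> genderB = new HashSet<>(b);
        genderB.retainAll(GENDER_LABELS);
        return !genderA.isEmpty() && genderA.equals(genderB);
    }

    public static void main(String[] args) {
        List<String> adam = Arrays.asList("Person", "Male");
        // "Admin" is an invented label, used only to show the effect of an extra label.
        List<String> vinceWithExtraLabel = Arrays.asList("Person", "Male", "Admin");
        System.out.println(sameLabelList(adam, vinceWithExtraLabel));
        System.out.println(sameGender(adam, vinceWithExtraLabel));
    }
}
```

For the sample graph both checks agree, since every person carries exactly 'Person' plus one gender label.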
Next up, let's calculate the age difference between Adam and his potential friends:
MATCH (me:Person {name: "Adam"})
MATCH (me)-[:FRIEND_OF]-()-[:FRIEND_OF]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
RETURN me,
       potentialFriend,
       SIZE((potentialFriend)-[:LIVES_IN]->()<-[:LIVES_IN]-(me)) AS sameLocation,
       abs( me.age - potentialFriend.age) AS ageDifference,
       LABELS(me) = LABELS(potentialFriend) AS gender,
       friendsInCommon
==> +--------------------------------------------------------------------------------------+
==> | me | potentialFriend | sameLocation | ageDifference | gender | friendsInCommon |
==> +--------------------------------------------------------------------------------------+
==> | Node[1007]{name:"Adam",age:30} | Node[1006]{name:"Vince",age:40} | 0 | 10.0 | true | 1 |
==> | Node[1007]{name:"Adam",age:30} | Node[1005]{name:"Daniela",age:20} | 0 | 10.0 | false | 1 |
==> | Node[1007]{name:"Adam",age:30} | Node[1008]{name:"Luanne",age:25} | 0 | 5.0 | false | 1 |
==> +--------------------------------------------------------------------------------------+
==> 3 rows
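Now let's do some filtering to get rid of people that Adam is already friends with; there wouldn't be much point in recommending those people!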
MATCH (me:Person {name: "Adam"})
MATCH (me)-[:FRIEND_OF]-()-[:FRIEND_OF]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
WITH me,
     potentialFriend,
     SIZE((potentialFriend)-[:LIVES_IN]->()<-[:LIVES_IN]-(me)) AS sameLocation,
     abs( me.age - potentialFriend.age) AS ageDifference,
     LABELS(me) = LABELS(potentialFriend) AS gender,
     friendsInCommon
WHERE NOT (me)-[:FRIEND_OF]-(potentialFriend)
RETURN me,
       potentialFriend,
       SIZE((potentialFriend)-[:LIVES_IN]->()<-[:LIVES_IN]-(me)) AS sameLocation,
       abs( me.age - potentialFriend.age) AS ageDifference,
       LABELS(me) = LABELS(potentialFriend) AS gender,
       friendsInCommon
==> +---------------------------------------------------------------------------------------+
==> | me | potentialFriend | sameLocation | ageDifference | gender | friendsInCommon |
==> +---------------------------------------------------------------------------------------+
==> | Node[1007]{name:"Adam",age:30} | Node[1006]{name:"Vince",age:40} | 0 | 10.0 | true | 1 |
==> | Node[1007]{name:"Adam",age:30} | Node[1005]{name:"Daniela",age:20} | 0 | 10.0 | false | 1 |
==> | Node[1007]{name:"Adam",age:30} | Node[1008]{name:"Luanne",age:25} | 0 | 5.0 | false | 1 |
==> +---------------------------------------------------------------------------------------+
==> 3 rows
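In this case we haven't actually filtered anyone out, but for some of the other people we would see a reduction in the number of potential friends.
Our final step is to come up with a score for each of the features that we've identified as being important for making a friend suggestion.
We'll assign a score of 10 if the people live in the same location or have the same gender as Adam, and 0 if not. For ageDifference and friendsInCommon we'll apply a log curve so that those values don't have a disproportionate effect on our final score. We'll use the formula defined in the ParetoScoreTransformer to do this: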
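To see the filter actually remove someone, here is a small in-memory sketch in Java of the same two steps (two-hop candidates, then dropping existing friends), run over the FRIEND_OF pairs from the sample graph. It is only an illustration of what the pattern computes, not how Neo4j evaluates it, and the class and method names are my own. Cypher never reuses a relationship within one pattern, which for this graph amounts to never walking back to the starting person.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the sample graph; not a Neo4j API.
final class FriendSuggestions {

    // Undirected FRIEND_OF pairs copied from the CREATE statement.
    static final String[][] FRIENDSHIPS = {
        {"Michal", "Daniela"}, {"Michal", "Luanne"}, {"Michal", "Adam"},
        {"Michal", "Vince"}, {"Daniela", "Vince"}, {"Christophe", "Vince"}
    };

    static Map<String, Set<String>> adjacency() {
        Map<String, Set<String>> friends = new TreeMap<>();
        for (String[] pair : FRIENDSHIPS) {
            friends.computeIfAbsent(pair[0], k -> new TreeSet<>()).add(pair[1]);
            friends.computeIfAbsent(pair[1], k -> new TreeSet<>()).add(pair[0]);
        }
        return friends;
    }

    // Two-hop candidates with the number of connecting friends, before filtering.
    static Map<String, Integer> friendsOfFriends(String me) {
        Map<String, Set<String>> friends = adjacency();
        Map<String, Integer> counts = new TreeMap<>();
        for (String friend : friends.getOrDefault(me, Collections.emptySet())) {
            for (String candidate : friends.get(friend)) {
                if (candidate.equals(me)) {
                    continue; // would reuse the same relationship
                }
                counts.merge(candidate, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Drops candidates who are already direct friends, like the WHERE NOT clause.
    static Map<String, Integer> suggestions(String me) {
        Set<String> direct = adjacency().getOrDefault(me, Collections.emptySet());
        Map<String, Integer> result = new TreeMap<>(friendsOfFriends(me));
        result.keySet().removeAll(direct);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Adam:    " + friendsOfFriends("Adam").keySet() + " -> " + suggestions("Adam").keySet());
        System.out.println("Daniela: " + friendsOfFriends("Daniela").keySet() + " -> " + suggestions("Daniela").keySet());
    }
}
```

For Adam the candidate list is unchanged. For Daniela the two-hop step yields five people, and the filter removes Michal and Vince because she is already friends with both.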
public <OUT> float transform(OUT item, float score) {
    if (score < minimumThreshold) {
        return 0;
    }
    double alpha = Math.log((double) 5) / eightyPercentLevel;
    double exp = Math.exp(-alpha * score);
    return new Double(maxScore * (1 - exp)).floatValue();
}
And now for our completed recommendation query:
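Written out as a formula, the method above looks roughly like this. The symbols are my own shorthand: x is the raw score, s_max stands for maxScore and x_80 for eightyPercentLevel; the minimumThreshold branch is left out. The second line shows where the name eightyPercentLevel comes from, and the third plugs in the values the friendsInCommon part of the query uses (maxScore 100, eightyPercentLevel 10, one friend in common).

```latex
\[
f(x) = s_{\max}\,\bigl(1 - e^{-\alpha x}\bigr), \qquad \alpha = \frac{\ln 5}{x_{80}}
\]
\[
f(x_{80}) = s_{\max}\,\bigl(1 - e^{-\ln 5}\bigr) = s_{\max}\,\bigl(1 - \tfrac{1}{5}\bigr) = 0.8\, s_{\max}
\]
\[
f(1)\Big|_{s_{\max}=100,\; x_{80}=10} = 100\,\bigl(1 - 5^{-1/10}\bigr) \approx 14.87
\]
```

So the curve rises quickly for small inputs, reaches 80% of the maximum at x_80, and flattens out towards s_max after that.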
MATCH (me:Person {name: "Adam"})
MATCH (me)-[:FRIEND_OF]-()-[:FRIEND_OF]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
WITH me,
     potentialFriend,
     SIZE((potentialFriend)-[:LIVES_IN]->()<-[:LIVES_IN]-(me)) AS sameLocation,
     abs( me.age - potentialFriend.age) AS ageDifference,
     LABELS(me) = LABELS(potentialFriend) AS gender,
     friendsInCommon
WHERE NOT (me)-[:FRIEND_OF]-(potentialFriend)
WITH potentialFriend,
     // 100 -> maxScore, 10 -> eightyPercentLevel, friendsInCommon -> score (from the formula above)
     100 * (1 - exp((-1.0 * (log(5.0) / 10)) * friendsInCommon)) AS friendsInCommon,
     sameLocation * 10 AS sameLocation,
     -1 * (10 * (1 - exp((-1.0 * (log(5.0) / 20)) * ageDifference))) AS ageDifference,
     CASE WHEN gender THEN 10 ELSE 0 END as sameGender
RETURN potentialFriend,
       {friendsInCommon: friendsInCommon,
        sameLocation: sameLocation,
        ageDifference:ageDifference,
        sameGender: sameGender} AS parts,
       friendsInCommon + sameLocation + ageDifference + sameGender AS score
ORDER BY score DESC
==> +---------------------------------------------------------------------------------------+
==> | potentialFriend | parts | score |
==> +---------------------------------------------------------------------------------------+
==> | Node[1006]{name:"Vince",age:40} | {friendsInCommon -> 14.86600774792154, sameLocation -> 0, ageDifference -> -5.52786404500042, sameGender -> 10} | 19.33814370292112 |
==> | Node[1008]{name:"Luanne",age:25} | {friendsInCommon -> 14.86600774792154, sameLocation -> 0, ageDifference -> -3.312596950235779, sameGender -> 0} | 11.55341079768576 |
==> | Node[1005]{name:"Daniela",age:20} | {friendsInCommon -> 14.86600774792154, sameLocation -> 0, ageDifference -> -5.52786404500042, sameGender -> 0} | 9.33814370292112 |
==> +----------------------------------------------------------------------------------------+
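To check how the four parts add up to the scores above, here is a rough Java restatement of the arithmetic in the query, fed with the feature values from the earlier result tables. The class and method names are mine and are not taken from the GraphAware framework; the constants (100 and 10 for friends in common, 10 and 20 for age difference) are the ones the query uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical restatement of the query's scoring; not the GraphAware API.
final class RecommendationScore {

    // Same curve as the formula quoted earlier, without the minimum threshold.
    static double pareto(double value, double maxScore, double eightyPercentLevel) {
        double alpha = Math.log(5.0) / eightyPercentLevel;
        return maxScore * (1 - Math.exp(-alpha * value));
    }

    // Friends in common contribute up to 100, each shared city adds 10,
    // the same gender adds 10, and the age difference subtracts up to 10.
    static double score(int friendsInCommon, int sharedCities, double ageDifference, boolean sameGender) {
        return pareto(friendsInCommon, 100, 10)
                + sharedCities * 10
                - pareto(ageDifference, 10, 20)
                + (sameGender ? 10 : 0);
    }

    // Names ordered by descending score, like ORDER BY score DESC.
    static List<String> rank(Map<String, Double> scores) {
        List<String> names = new ArrayList<>(scores.keySet());
        names.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return names;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("Vince", score(1, 0, 10, true));
        scores.put("Daniela", score(1, 0, 10, false));
        scores.put("Luanne", score(1, 0, 5, false));
        for (String name : rank(scores)) {
            System.out.println(name + " " + scores.get(name));
        }
    }
}
```

With everyone on one friend in common and no shared city, the ranking is decided by gender first and then by the smaller age penalty, which is why Luanne ends up ahead of Daniela.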
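The final query isn't too bad; the only really complex bit is the log curve calculation. This is where user-defined functions will come into their own in the future.
The nice thing about this approach is that we don't have to step outside of Cypher, so if you're not comfortable with Java you can still do real-time recommendations! On the other hand, the different parts of the recommendation engine all get mixed up, so it's not as easy to see the whole pipeline as it is with the GraphAware framework.
The next step is to apply this to the Twitter graph and come up with follower recommendations there.
Translated from: https://www.javacodegeeks.com/2015/03/neo4j-generating-real-time-recommendations-with-cypher.html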