Recommending Amazon Movies: A Collaborative Approach

Item-based and user-based collaborative filtering for a recommendation system, with simple code.

Overview of Recommendation Systems

There are many ways to build a recommendation system, and each serves a different purpose. My previous article covered simple and content-based recommendation. Those are non-personalised recommenders, but that doesn't mean they are less useful than the others. They are very popular for tasks such as recommending the top music of the week or music of a similar genre.

This article focuses on collaborative filtering. This method compares your taste with that of similar people or items. It then recommends a list of items you are probably interested in, based on similarity in consumption. It relies only on rating data.

There are two main variants of this method: item-based filtering and user-based filtering. Item-based filtering suggests items that are similar to what you have already liked. User-based filtering suggests items that people similar to you have liked but that you have not yet consumed.

Using Amazon movie data, we will apply both item-based and user-based filtering to find similar items to recommend and to identify users with similar taste.

Analysis Overview

For both item-based and user-based filtering, we need to clean the data and arrange it into a matrix so that it can be used for analysis. All ratings must be numeric and normalized. Cosine similarity is then used to calculate the similarity between items and between users.
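For reference, the cosine similarity of two rating vectors is their dot product divided by the product of their lengths. The following is a minimal sketch with NumPy on made-up vectors. The helper name `cosine` and the vectors `a`, `b`, `c` are illustrative and not part of the original analysis, and returning 0.0 for an all-zero vector is a choice made here for convenience.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity of two 1-D vectors; 0.0 if either one is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Toy normalized ratings (hypothetical values)
a = [1.0, 0.0, 0.5]
b = [2.0, 0.0, 1.0]  # points in the same direction as a
c = [0.0, 1.0, 0.0]  # shares no rated position with a

sim_ab = cosine(a, b)
sim_ac = cosine(a, c)
print(round(sim_ab, 3), round(sim_ac, 3))
```

Because only the direction of the vectors matters, `a` and `b` count as identical in taste even though `b` has larger values.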

Data Overview

The dataset contains 4,848 users and 206 movies.

Implementation

Now, let's import the tools we are going to use for the analysis, load the data into a DataFrame, and clean it.

import pandas as pd
import numpy as np
import scipy as sp
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

# Load the ratings and take a first look at the columns and rows
amazon = pd.read_csv('.../Amazon.csv')
amazon.info()
amazon.head()

Next, we rearrange the data into a matrix whose rows are indexed by user_id and whose columns are indexed by name (the movie name).

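# Wide to long: one row per (user_id, name) pair with its rating
amazon = amazon.melt(id_vars=['user_id'], var_name='name', value_name='rating')
# Long to matrix: users as rows, movies as columns
amazon_pivot = amazon.pivot_table(index=['user_id'], columns=['name'], values='rating')
amazon_pivot.head()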
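To see what these two reshaping steps do, here is a small sketch on a made-up table. The frame `wide` and its values are hypothetical; only the `melt` and `pivot_table` calls mirror the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical wide table: one row per user, one column per movie
wide = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "Movie1": [5.0, np.nan, 2.0],
    "Movie2": [np.nan, 4.0, 3.0],
})

# Wide to long: one row per (user, movie) pair
long_df = wide.melt(id_vars=["user_id"], var_name="name", value_name="rating")

# Long to user-by-movie matrix; pairs without a rating stay NaN
matrix = long_df.pivot_table(index="user_id", columns="name", values="rating")
print(long_df.shape)
print(matrix)
```

If the same user and movie pair appeared more than once, `pivot_table` would average the ratings, since its default aggregation is the mean.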

From here, we need to normalize the ratings so that each user's values fall on a comparable scale. Then we replace the NaN values with 0 and keep only the users who have rated at least one movie.

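# Min-max scale each user's ratings (row-wise) to [0, 1], then treat missing ratings as 0
amazon_normalized = amazon_pivot.apply(lambda x: (x-np.min(x))/(np.max(x)-np.min(x)), axis=1)
amazon_normalized.fillna(0, inplace=True)
# Transpose to movies x users, then keep only users (columns) with at least one non-zero value
amazon_normalized = amazon_normalized.T
amazon_normalized = amazon_normalized.loc[:, (amazon_normalized != 0).any(axis=0)]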
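The effect of this normalization is easier to see on a tiny example. The sketch below uses a made-up `ratings` frame and a vectorized form of the same min-max formula, kept in users-by-movies orientation for readability. Two side effects are worth noticing: a user's lowest rating becomes 0, the same value used for unrated movies, and a user whose ratings are all identical ends up with only zeros and is therefore dropped by the filter.

```python
import numpy as np
import pandas as pd

# Hypothetical user x movie ratings; NaN means "not rated"
ratings = pd.DataFrame(
    {
        "Movie1": [5.0, 3.0, np.nan],
        "Movie2": [1.0, 3.0, 4.0],
        "Movie3": [3.0, np.nan, np.nan],
    },
    index=["u1", "u2", "u3"],
)

# Min-max scale each user's ratings to [0, 1]; min/max skip NaN
row_min = ratings.min(axis=1)
row_range = ratings.max(axis=1) - row_min
scaled = ratings.sub(row_min, axis=0).div(row_range, axis=0)

# 0/0 (constant-rating users) gives NaN, which becomes 0 like unrated cells
scaled = scaled.fillna(0)

# Keep only users with at least one non-zero value
kept = scaled.loc[(scaled != 0).any(axis=1)]
print(kept)
```

Here only `u1` survives: `u2` rated both movies 3 and `u3` rated a single movie, so both rows collapse to zeros.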

We are nearly there. Now we need to put the data into a sparse matrix.

amazon_sparse = sp.sparse.csr_matrix(amazon_normalized.values)

Let's look at item-based filtering first.

# Rows of amazon_sparse are movies, so this gives a movie-by-movie similarity matrix
item_similarity = cosine_similarity(amazon_sparse)
item_sim_df = pd.DataFrame(item_similarity, index=amazon_normalized.index, columns=amazon_normalized.index)
item_sim_df.head()

Every row and every column now corresponds to a movie, so the matrix is ready for the recommendation calculation.

def top_movie(movie_name):
    # Sort by similarity to movie_name; position 0 is normally the movie itself, so skip it
    for item in item_sim_df.sort_values(by=movie_name, ascending=False).index[1:11]:
        print('Similar movie:{}'.format(item))

top_movie("Movie102")

These are the ten movies most similar to Movie102.
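The `[1:11]` slice assumes the queried movie sorts first. If another movie ties with it at the top similarity, that may not hold. One possible variant, sketched below, drops the movie by name and returns the scores instead of printing them. The function `similar_movies` and the matrix `toy_sim_df` are hypothetical stand-ins for illustration, not part of the original code.

```python
import pandas as pd

# Hypothetical item-item similarity matrix (symmetric, 1.0 on the diagonal)
movies = ["MovieA", "MovieB", "MovieC", "MovieD"]
toy_sim_df = pd.DataFrame(
    [
        [1.0, 0.8, 0.1, 0.5],
        [0.8, 1.0, 0.3, 0.2],
        [0.1, 0.3, 1.0, 0.0],
        [0.5, 0.2, 0.0, 1.0],
    ],
    index=movies,
    columns=movies,
)

def similar_movies(sim_df, movie_name, n=10):
    """Return the n most similar movies as a Series of similarity scores."""
    # Exclude the queried movie by label rather than by position
    scores = sim_df[movie_name].drop(labels=movie_name)
    return scores.sort_values(ascending=False).head(n)

top2 = similar_movies(toy_sim_df, "MovieA", n=2)
print(top2)
```

Returning a Series keeps the similarity values available, which is handy when the results feed into a later scoring step.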

Now let's look at user-based filtering. Who has similar taste to me?

# Columns of amazon_normalized are users, so transpose to compare users with each other
user_similarity = cosine_similarity(amazon_sparse.T)
user_sim_df = pd.DataFrame(user_similarity, index=amazon_normalized.columns, columns=amazon_normalized.columns)
user_sim_df.head()

def top_users(user):
    # Sort once by similarity to the given user; position 0 is normally the user themselves
    ranked = user_sim_df.sort_values(by=user, ascending=False)
    sim_values = ranked.loc[:, user].tolist()[1:11]
    sim_users = ranked.index[1:11]
    for other, sim in zip(sim_users, sim_values):
        print('User #{0}, Similarity value: {1:.2f}'.format(other, sim))

top_users('A140XH16IKR4B0')
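The code above stops at finding similar users. One way to go from there to actual suggestions, in line with the description of user-based filtering earlier, is to score each movie the target user has not rated by a similarity-weighted average of the neighbours' ratings. This is only a rough sketch under stated assumptions: `recommend_for`, `toy_ratings`, and `toy_user_sim` are hypothetical, similarities are assumed to be positive, and unrated movies are counted as 0, which pulls scores down.

```python
import pandas as pd

# Hypothetical normalized ratings (users x movies); 0 means "not rated"
toy_ratings = pd.DataFrame(
    {
        "MovieA": [1.0, 0.9, 0.0],
        "MovieB": [0.0, 0.8, 0.1],
        "MovieC": [0.5, 0.0, 1.0],
    },
    index=["u1", "u2", "u3"],
)

# Hypothetical user-user similarity matrix
users = toy_ratings.index
toy_user_sim = pd.DataFrame(
    [[1.0, 0.9, 0.2], [0.9, 1.0, 0.1], [0.2, 0.1, 1.0]],
    index=users,
    columns=users,
)

def recommend_for(user, ratings, user_sim, k=2, n=5):
    """Score movies the user has not rated using the k most similar users."""
    # The k nearest neighbours, excluding the user themselves
    neighbours = user_sim[user].drop(labels=user).nlargest(k)
    # Similarity-weighted average of the neighbours' ratings per movie
    weighted = ratings.loc[neighbours.index].mul(neighbours, axis=0).sum()
    weighted = weighted / neighbours.sum()
    # Only keep movies the target user has not rated
    unseen = ratings.loc[user] == 0
    return weighted[unseen].sort_values(ascending=False).head(n)

recs = recommend_for("u1", toy_ratings, toy_user_sim)
print(recs)
```

For `u1`, the only unrated movie is `MovieB`, and its score is dominated by `u2`, the most similar user.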

These examples show how to implement item-based and user-based filtering recommendation systems. Some of the code is from https://www.kaggle.com/ajmichelutti/collaborative-filtering-on-anime-data

Hope you enjoy it!

Translated from: https://medium.com/analytics-vidhya/recommend-amazon-movie-a-collaborative-approach-9b3db8f48ad6
