Recommend Amazon Movie: A Collaborative Approach

Item-based and user-based collaborative filtering for a recommendation system, implemented with simple code.

Overview of Recommendation Systems

There are many approaches to building a recommendation system, and each serves a different purpose. My previous article covered simple and content-based recommenders. Those recommenders are non-personalised, but that does not make them less useful than the others. They are very popular for tasks such as recommending the top music of the week or music of a similar genre.

This article focuses on collaborative filtering. This method compares your taste with that of similar people or items, then recommends a list of items you are probably interested in, based on similarity in consumption. It works only from the ratings.

Collaborative filtering comes in two main variants: item-based filtering and user-based filtering. Item-based filtering suggests items that are similar to what you have already liked. User-based filtering suggests items that people similar to you have liked but that you have not yet consumed.

Using Amazon movie data, we will apply both item-based and user-based filtering to find similar movies to recommend and to identify users with similar taste.

Analysis Overview

For both item-based and user-based filtering, we need to clean the data and arrange it into a matrix that can be used for analysis. All ratings must be numeric and normalized, and cosine similarity is used to measure how similar two items or two users are.
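
To make the similarity measure concrete: the cosine similarity of two rating vectors a and b is their dot product divided by the product of their lengths, a·b / (|a| |b|). The sketch below computes it by hand with NumPy. The function name `cosine_sim`, the zero-vector convention, and the rating vectors are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two rating vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Convention assumed here: an all-zero vector is similar to nothing.
    return float(a @ b / denom) if denom else 0.0

# Hypothetical normalized ratings given by three users to each movie.
movie_a = [1.0, 0.0, 0.5]
movie_b = [1.0, 0.0, 0.5]   # rated exactly like movie_a
movie_c = [0.0, 1.0, 0.0]   # rated only by the user who skipped movie_a

sim_ab = cosine_sim(movie_a, movie_b)  # identical rating pattern
sim_ac = cosine_sim(movie_a, movie_c)  # no user in common
```

Identical rating patterns score close to 1, and movies with no raters in common score 0, which is why unrated cells can safely be stored as 0 later on.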

Data Overview

The dataset contains 4,848 users and 206 movies.

Implementation

Now, let's import the tools we will use for the analysis, load the data into a DataFrame, and clean it.

import pandas as pd
import numpy as np
import scipy as sp
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

amazon = pd.read_csv('.../Amazon.csv')
amazon.info()
amazon.head()

Next, we rearrange the data into a matrix whose rows are indexed by user_id and whose columns are indexed by movie name.

amazon = amazon.melt(id_vars=['user_id'], var_name='name', value_name='rating')
amazon_pivot = amazon.pivot_table(index=['user_id'], columns=['name'], values='rating')
amazon_pivot.head()
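
The reshaping step can be tried on a tiny table first. This is a minimal sketch with made-up data; the frame `wide` and its values are assumptions for illustration, and only the `melt` and `pivot_table` calls mirror the code above.

```python
import pandas as pd

# Hypothetical wide table: one row per user, one column per movie.
wide = pd.DataFrame({
    'user_id': ['u1', 'u2'],
    'Movie1': [5.0, None],   # u2 has not rated Movie1
    'Movie2': [3.0, 4.0],
})

# melt: wide -> long, one (user_id, name, rating) row per cell.
long = wide.melt(id_vars=['user_id'], var_name='name', value_name='rating')

# pivot_table: long -> user x movie matrix; missing ratings stay NaN.
matrix = long.pivot_table(index='user_id', columns='name', values='rating')
```

Note that `pivot_table` averages duplicate (user, movie) pairs by default, so a user who rated the same movie twice ends up with a single mean rating.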

From here, we need to normalize the ratings so that every user's values lie on the same scale (min-max scaling to the range 0 to 1, applied per user). Then we replace the NaN values with 0 and keep only the users who have at least one non-zero normalized rating.

# Min-max scale each user's ratings (one row per user) to the range [0, 1]
amazon_normalized = amazon_pivot.apply(lambda x: (x-np.min(x))/(np.max(x)-np.min(x)), axis=1)
amazon_normalized.fillna(0, inplace=True)
amazon_normalized = amazon_normalized.T
amazon_normalized = amazon_normalized.loc[:, (amazon_normalized !=0).any(axis=0)]
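
A side effect of per-user min-max scaling is worth seeing on a small example: a user whose ratings are all equal, or who rated a single movie, produces 0/0 and therefore NaN everywhere, and is dropped by the non-zero filter. The sketch below uses hypothetical users and ratings and keeps users as rows for readability (the code above transposes first).

```python
import numpy as np
import pandas as pd

# Hypothetical ratings: rows are users, columns are movies, NaN = not rated.
ratings = pd.DataFrame(
    {'Movie1': [5.0, 2.0, 3.0],
     'Movie2': [1.0, 2.0, np.nan],
     'Movie3': [3.0, 2.0, np.nan]},
    index=['u1', 'u2', 'u3'],
)

def min_max(row):
    """Rescale one user's ratings to [0, 1]; NaN is ignored by min/max."""
    return (row - row.min()) / (row.max() - row.min())

normalized = ratings.apply(min_max, axis=1)

# Same clean-up as above: NaN -> 0, then keep users with any non-zero value.
filled = normalized.fillna(0)
kept = filled.loc[(filled != 0).any(axis=1)]
```

Here only `u1` survives: `u2` gave every movie the same rating and `u3` rated one movie, so both rows become all zeros.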

We are nearly there. Now we need to convert the data into a sparse matrix.

amazon_sparse = sp.sparse.csr_matrix(amazon_normalized.values)
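
As a rough illustration of why a sparse matrix helps, a CSR matrix stores only the non-zero entries, which suits a ratings matrix where most users have rated few movies. The small array below is made up for the example.

```python
import numpy as np
from scipy import sparse

# Hypothetical 2 x 3 ratings block with only two non-zero entries.
dense = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.5]])

compact = sparse.csr_matrix(dense)
stored = compact.nnz           # number of explicitly stored values
roundtrip = compact.toarray()  # back to a dense array, unchanged
```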

Let's look at item-based filtering first.

item_similarity = cosine_similarity(amazon_sparse)
item_sim_df = pd.DataFrame(item_similarity, index=amazon_normalized.index, columns=amazon_normalized.index)
item_sim_df.head()
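
The orientation of the matrix decides what is being compared, because `cosine_similarity` compares rows. The sketch below, on a made-up 3 x 4 matrix, shows that a movies-by-users matrix yields movie-to-movie similarities and that its transpose yields user-to-user similarities.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical matrix: 3 movies (rows) x 4 users (columns).
item_user = np.array([[1.0, 0.0, 0.5, 0.0],
                      [1.0, 0.0, 0.5, 0.2],
                      [0.0, 1.0, 0.0, 1.0]])

item_sim = cosine_similarity(item_user)    # rows are movies -> 3 x 3
user_sim = cosine_similarity(item_user.T)  # rows are users  -> 4 x 4
```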

Every row and every column now represents a movie, so the matrix is ready for the recommendation step.

def top_movie(movie_name):
    # Sort movies by similarity to movie_name; position 0 is normally the movie itself, so skip it
    for item in item_sim_df.sort_values(by=movie_name, ascending=False).index[1:11]:
        print('Similar movie:{}'.format(item))

top_movie("Movie102")

These are the ten movies most similar to Movie102.
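
A possible variant, sketched here on a toy similarity matrix, returns the result instead of printing it and removes the query movie by name instead of by position, which stays correct even when another movie ties at similarity 1.0. The helper `similar_movies` and the matrix `toy_sim` are hypothetical names introduced for this example.

```python
import pandas as pd

# Hypothetical item-item similarity matrix (symmetric, 1.0 on the diagonal).
names = ['MovieA', 'MovieB', 'MovieC']
toy_sim = pd.DataFrame(
    [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.4],
     [0.1, 0.4, 1.0]],
    index=names, columns=names,
)

def similar_movies(sim_df, movie_name, n=10):
    """Return the n movies most similar to movie_name, best first."""
    scores = sim_df[movie_name].drop(movie_name)  # exclude the movie itself
    return scores.sort_values(ascending=False).head(n)

result = similar_movies(toy_sim, 'MovieB', n=2)
```

Returning a Series keeps the similarity values alongside the names, so the caller can apply a threshold or print them as needed.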

Now let's look at user-based filtering. Who has similar taste to me?

user_similarity = cosine_similarity(amazon_sparse.T)
user_sim_df = pd.DataFrame(user_similarity, index = amazon_normalized.columns, columns = amazon_normalized.columns)
user_sim_df.head()

def top_users(user):
    # Rank all users by similarity to `user`; position 0 is normally the user itself, so skip it
    ranked = user_sim_df.sort_values(by=user, ascending=False)
    sim_values = ranked.loc[:, user].tolist()[1:11]
    sim_users = ranked.index[1:11]
    for other, sim in zip(sim_users, sim_values):
        print('User #{0}, Similarity value: {1:.2f}'.format(other, sim))

top_users('A140XH16IKR4B0')
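
The code above stops at finding similar users. One common way to turn those neighbours into actual suggestions, sketched here as an assumption and not as the method of the original analysis, is to score each movie by the similarity-weighted average of the neighbours' ratings and keep only movies the target user has not rated. The function `recommend` and all data below are hypothetical.

```python
import pandas as pd

# Hypothetical normalized ratings: rows are users, columns are movies, 0 = not rated.
ratings = pd.DataFrame(
    [[1.0, 0.5, 0.0],
     [1.0, 0.0, 0.8],
     [0.0, 1.0, 0.2]],
    index=['u1', 'u2', 'u3'], columns=['MovieA', 'MovieB', 'MovieC'],
)

# Hypothetical similarities between u1 and the other users.
sim_to_u1 = pd.Series({'u2': 0.9, 'u3': 0.3})

def recommend(ratings, sims, user, n=5):
    """Rank movies unseen by `user` by similarity-weighted neighbour ratings."""
    neighbours = ratings.loc[sims.index]
    # Weight each neighbour's ratings by their similarity, then average per movie.
    scores = neighbours.mul(sims, axis=0).sum() / sims.sum()
    unseen = ratings.loc[user] == 0
    return scores[unseen].sort_values(ascending=False).head(n)

recs = recommend(ratings, sim_to_u1, 'u1')
```

In this toy case the only movie `u1` has not rated is MovieC, so it is the single recommendation, with a score dominated by the more similar neighbour `u2`.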

These examples show how to implement item-based and user-based filtering for a recommendation system. Some of the code comes from https://www.kaggle.com/ajmichelutti/collaborative-filtering-on-anime-data

Hope you enjoy it!

Translated from: https://medium.com/analytics-vidhya/recommend-amazon-movie-a-collaborative-approach-9b3db8f48ad6
