在 Elasticsearch 中使用 Amazon Nova 模型

作者:来自 Elastic Andre Luiz

了解如何在 Elasticsearch 中使用 Amazon Nova 系列模型。

在本文中,我们将讨论 Amazon 的 AI 模型家族——Amazon Nova,并学习如何将其与 Elasticsearch 结合使用。

关于 Amazon Nova

Amazon Nova 是 Amazon 的一系列人工智能模型,可在 Amazon Bedrock 上使用,旨在提供高性能和成本效益。这些模型支持文本、图像和视频输入,生成文本输出,并针对不同的准确性、速度和成本需求进行了优化。

Amazon Nova 主要模型

  • Amazon Nova Micro:专注于文本处理的快速、经济高效模型,适用于翻译、推理、代码补全和数学问题求解。其生成速度超过 200 个 token 每秒,非常适合需要即时响应的应用。

  • Amazon Nova Lite:一种低成本的多模态模型,可快速处理图像、视频和文本。其速度和准确性表现突出,适用于交互式和高数据量的应用,尤其是成本敏感的场景。

  • Amazon Nova Pro:最高级的选择,结合了高准确性、速度和成本效益。适用于视频摘要、问答、软件开发和 AI 代理等复杂任务。专家评测表明,它在文本和视觉理解方面表现卓越,并且能够遵循指令执行自动化工作流。

Amazon Nova 模型适用于多种应用场景,包括内容创作、数据分析、软件开发以及基于 AI 的流程自动化。

我们将展示如何将 Amazon Nova 模型与 Elasticsearch 结合使用,以实现自动化的产品评论分析。

我们将进行以下步骤:
  1. 通过 Inference API 创建一个端点,将 Amazon Bedrock 与 Elasticsearch 集成。

  2. 使用 Inference Processor 创建一个数据处理管道,该管道将调用 Inference API 端点。

  3. 索引产品评论,并使用管道自动生成评论分析。

  4. 分析集成后的结果。

在 Inference API 中创建端点

首先,我们配置 Inference API 以将 Amazon Bedrock 与 Elasticsearch 集成。我们选择 Amazon Nova Lite 作为使用的模型,其 ID 为 amazon.nova-lite-v1:0,因为它在速度、准确性和成本之间提供了良好的平衡。

注意:你需要有效的凭据才能使用 Amazon Bedrock。你可以在此处查看文档以获取访问密钥:

PUT _inference/completion/bedrock_completion_amazon_nova-lite
{"service": "amazonbedrock","service_settings": {"access_key": "#access_key#","secret_key": "#secret_key#","region": "us-east-1","provider": "amazontitan","model": "amazon.nova-lite-v1:0"}
}

创建评论分析 pipeline

现在,我们创建一个处理流水线,该流水线将使用 Inference Processor 来执行评论分析提示(prompt)。该提示会将评论数据发送到 Amazon Nova Lite,并执行以下操作:

  • 情感分类(正面、负面或中立)

  • 评论摘要生成

  • 关键词提取

  • 真实性评估(真实 | 可疑 | 泛化)

PUT /_ingest/pipeline/review_analyzer_ai
{"processors": [{"script": {"source": """ctx.prompt = "Analyze the following product review and return a structured JSON. Task: - Summarize the review concisely. - Detect and classify the sentiment as positive, neutral, or negative.- Generate relevant tags (keywords) based on the review content and detected sentiment. - Evaluate the authenticity of the review (authentic, suspicious, or generic). Review: " + ctx.review + " Respond in JSON format with the following fields: \"review_analyze\": {\"sentiment\": \"<positive | neutral | negative>\", \"authenticity\": \"<authentic | suspicious | generic>\",\"summary\": \"<short review summary>\", \"keywords\": [\"<keyword 1>\", \"<keyword 2>\", \"...\"]}}}""""}},{"inference": {"model_id": "bedrock_completion_amazon_nova-lite","input_output": {"input_field": "prompt","output_field": "result"}}},{"gsub": {"field": "result","pattern": "```json","replacement": ""} },{"json" : {"field" : "result","strict_json_parsing": false,"add_to_root" : true}},{"remove": {"field": "result"}},{"remove": {"field": "prompt"}}]
}

索引评论

现在,我们使用 Bulk API 索引产品评论。之前创建的流水线将自动应用,并将 Nova 模型生成的分析结果添加到索引的文档中。

POST bulk/
{ "index": { "_index" : "products", "_id": 1, "pipeline":"review_analyzer_ai" } }
{ "product": "Pampers Pants Premium Care Fralda", "review": "Best diaper ever! Great material, lots of cotton, without all that plastic. Doesn't leak! My baby is a boy and every diaper leaked around the waist, this model solved the problem. Even on a small baby it's worth the effort of putting on the short diaper. I put it on my baby at 9 pm and only take it off in the morning, without any leaks." }
{ "index": { "_index" : "products", "_id": 2, "pipeline":"review_analyzer_ai" } }
{ "product": "Portable Electric Body Massager", "review": "It broke in three months for no apparent reason, thank goodness I didn't review it before. I don't recommend buying it because it has a short lifespan." }
{ "index": { "_index" : "products", "_id": 3, "pipeline":"review_analyzer_ai" } }
{ "product": "Havit Fuxi-H3 Black Quad-Mode Wired and Wireless Gaming Headset", "review": "The sound is good for the price, but the connectivity is horrible. You always need to be playing audio, otherwise it loses connection (I work from home, and this is very annoying). Sometimes it loses connection and you have to turn it off and on again to get it back on. The microphone is very sensitive, so it loses connection frequently and you have to turn the headset off and on for the microphone to work again. The flexibility of the stem is useless, because if you move it, the microphone can turn off. Sometimes I need to use Linux and the headset simply doesn't work. It's light and comfortable, the sound is adequate, but the connectivity is terrible." }
{ "index": { "_index" : "products", "_id": 4, "pipeline":"review_analyzer_ai" } }
{ "product": "Air Fryer 4L Oil Free Fryer Mondial", "review": "For those looking for value for money, it's a good option, but the tray (which is underneath the perforated basket) is already peeling a lot. My mother has one just like it and said that hers is even rusting, in other words, the material is MUCH inferior. There's also something that bothers me, because it looks like a microwave, it doesn't fry evenly, it's weaker in the middle and stronger on the sides. Buy at your own risk." }

查询和分析结果

最后,我们运行查询以查看 Amazon Nova Lite 模型如何分析和分类评论。通过执行 GET products/_search,我们可以获取已经被评论内容增强的文档。

该模型能够识别主要情感(正面、中立或负面),生成简要摘要,提取相关关键词,并评估每条评论的 真实性。这些字段有助于理解客户的意见,而无需阅读完整文本。

在解释结果时,我们关注以下方面:

  • 情感:指示消费者对产品的整体感受。

  • 摘要:提炼评论中提及的主要观点。

  • 关键词:可用于分组相似评论或识别反馈模式。

  • 真实性:判断评论是否可信,对内容审核或筛选有帮助。

   "hits": [{"_index": "products","_id": "1","_score": 1,"_ignored": ["review.keyword"],"_source": {"product": "Pampers Pants Premium Care Fralda","model_id": "bedrock_completion_amazon_nova-lite","review_analyze": {"summary": "The reviewer praises the diaper for its great material, high cotton content, and leak-proof design, especially highlighting its effectiveness for their baby.","sentiment": "positive","keywords": ["best diaper","great material","cotton","no plastic","leak-proof","baby","effective"],"authenticity": "authentic"},"review": "Best diaper ever! Great material, lots of cotton, without all that plastic. Doesn't leak! My baby is a boy and every diaper leaked around the waist, this model solved the problem. Even on a small baby it's worth the effort of putting on the short diaper. I put it on my baby at 9 pm and only take it off in the morning, without any leaks."}},{"_index": "products","_id": "2","_score": 1,"_source": {"product": "Portable Electric Body Massager","model_id": "bedrock_completion_amazon_nova-lite","review_analyze": {"summary": "The product broke in three months for no apparent reason and the reviewer does not recommend it due to its short lifespan.","sentiment": "negative","keywords": ["broke","short lifespan","not recommend"],"authenticity": "authentic"},"review": "It broke in three months for no apparent reason, thank goodness I didn't review it before. I don't recommend buying it because it has a short lifespan."}},{"_index": "products","_id": "3","_score": 1,"_ignored": ["review.keyword"],"_source": {"product": "Havit Fuxi-H3 Black Quad-Mode Wired and Wireless Gaming Headset","model_id": "bedrock_completion_amazon_nova-lite","review_analyze": {"summary": "The headset has good sound quality for the price but suffers from poor connectivity, especially when using the microphone or moving the headset. It also has compatibility issues with Linux.","sentiment": "negative","keywords": ["sound","connectivity","microphone","compatibility","annoying","turn off and on","Linux","flexible stem","work from home"],"authenticity": "authentic"},"review": "The sound is good for the price, but the connectivity is horrible. You always need to be playing audio, otherwise it loses connection (I work from home, and this is very annoying). Sometimes it loses connection and you have to turn it off and on again to get it back on. The microphone is very sensitive, so it loses connection frequently and you have to turn the headset off and on for the microphone to work again. The flexibility of the stem is useless, because if you move it, the microphone can turn off. Sometimes I need to use Linux and the headset simply doesn't work. It's light and comfortable, the sound is adequate, but the connectivity is terrible."}},{"_index": "products","_id": "4","_score": 1,"_ignored": ["review.keyword"],"_source": {"product": "Air Fryer 4L Oil Free Fryer Mondial","model_id": "bedrock_completion_amazon_nova-lite","review_analyze": {"summary": "The product offers value for money but has issues with peeling, rusting, and uneven frying.","sentiment": "negative","keywords": ["value for money","peeling","rusting","uneven frying","weaker in the middle"],"authenticity": "authentic"},"review": "For those looking for value for money, it's a good option, but the tray (which is underneath the perforated basket) is already peeling a lot. My mother has one just like it and said that hers is even rusting, in other words, the material is MUCH inferior. There's also something that bothers me, because it looks like a microwave, it doesn't fry evenly, it's weaker in the middle and stronger on the sides. Buy at your own risk."}}]

最终想法

Amazon Nova LiteElasticsearch 的集成展示了语言模型如何将原始评论转化为结构化且有价值的信息。通过流水线处理评论,我们能够自动且一致地提取 情感、真实性、摘要关键词

结果表明,该模型能够理解评论的上下文、分类用户的意见,并突出显示每个体验中最相关的点。这使数据集更加丰富,可用于提升搜索能力。

想要获得 Elastic 认证?查看下一次 Elasticsearch Engineer 培训时间!

Elasticsearch 拥有众多新功能,可帮助你构建最佳搜索解决方案。探索我们的示例 notebooks 了解更多信息,开启 免费云试用,或立即在本地机器尝试 Elastic

原文:https://www.elastic.co/search-labs/blog/amazon-nova-models-elasticsearch

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/bicheng/75952.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

MySQL8.0.40编译安装(Mysql8.0.40 Compilation and Installation)

MySQL8.0.40编译安装 近期MySQL发布了8.0.40版本&#xff0c;与之前的版本相比&#xff0c;部分依赖包发生了变化&#xff0c;因此重新编译一版&#xff0c;也便于大家参考。 1. 下载源码 选择对应的版本、选择源码、操作系统 如果没有登录或者没有MySQL官网账号&#xff0…

python中pyside6多个py文件生成exe

网上见到的教程大多数都是pyinstaller安装单个py文件,针对多个py文件的打包,鲜有人提及;有也是部分全而多的解释,让人目不暇接,本次记录自己设置一个声波捕捉界面的打包过程。 1.pycharm中调用pyinstaller打包 参考链接:https://blog.csdn.net/weixin_45793544/articl…

Java中使用Function Call实现AI大模型与业务系统的集成​

这个理念实际上很早就出现了&#xff0c;只不过早期的模型推理理解能力比较差&#xff0c;用户理解深度预测不够&#xff0c;现在每天的迭代有了改进&#xff0c;逐步引入到我们本身的业务系统&#xff0c;让AI大模型集成进来管理自身业务功能。当然现在也不是一个什么难事了。…

id 属性自动创建 js 全局变量

给一个元素设置 id 属性&#xff0c;它会在 js 中创建全局变量&#xff0c;如 <div class"test" click"test" id"idTest">test</div>test() {console.log(idTest:, window.idTest) }.test {height: 50px;width: 200px;background-c…

Android SELinux权限使用

Android SELinux权限使用 一、SELinux开关 adb在线修改seLinux(也可以改配置文件彻底关闭) $ getenforce; //获取当前seLinux状态,Enforcing(表示已打开),Permissive(表示已关闭) $ setenforce 1; //打开seLinux $ setenforce 0; //关闭seLinux二、命令查看sel…

【R语言绘图】圈图绘制代码

绘制代码 rm(list ls())# 加载必要包 library(data.table) library(circlize) library(ComplexHeatmap) library(rtracklayer) library(GenomicRanges) library(BSgenome) library(GenomicFeatures) library(dplyr)### 数据准备阶段 ### # 1. 读取染色体长度信息 df <- re…

vim 编辑器 使用教程

Vim是一款强大的文本&#xff08;代码&#xff09;编辑器&#xff0c;它是由Bram Moolenaar于1991年开发完成。它的前身是Bill Joy开发的vi。名字的意义是Vi IMproved。 打开vim&#xff0c;直接在命令行输入vim即可&#xff0c;或者vim <filename>. Vim分为四种模式&a…

C++20新增内容

C20 是 C 语言的一次重大更新&#xff0c;它引入了许多新特性&#xff0c;使代码更现代化、简洁且高效。以下是 C20 的主要新增内容&#xff1a; 1. 概念&#xff08;Concepts&#xff09; 概念用于约束模板参数&#xff0c;使模板编程更加直观和安全。 #include <concept…

C++中常用的十大排序方法之4——希尔排序

成长路上不孤单&#x1f60a;&#x1f60a;&#x1f60a;&#x1f60a;&#x1f60a;&#x1f60a; 【&#x1f60a;///计算机爱好者&#x1f60a;///持续分享所学&#x1f60a;///如有需要欢迎收藏转发///&#x1f60a;】 今日分享关于C中常用的排序方法之4——希尔排序的相…

详细描述以太坊的gas、gaslimit、gasPrice

目录 一、Gas 是什么? ✅ 简要定义: 🧠 举例理解: 二、Gas Limit 是什么? ✅ 简要定义: 分两种: 举例说明: 三、Gas Price 是什么? ✅ 简要定义: 为什么它重要? 示例: 四、 EIP-1559 后的新机制(伦敦升级) 三个要素: 五、额外技巧(开发实用) 本文…

全国大学生数学建模竞赛赛题深度分析报告(2010-2024)

全国大学生数学建模竞赛赛题深度分析报告&#xff08;2010-2024&#xff09; 全国大学生数学建模竞赛(CUMCM)是中国最具影响力的大学生科技竞赛之一&#xff0c;本报告将对2010-2024年间的赛题进行全面统计分析&#xff0c;包括题目类型、领域分布、模型方法等多个维度&#x…

从奖励到最优决策:动作价值函数与价值学习

从奖励到最优决策&#xff1a;动作价值函数与价值学习 价值学习一、动作价值函数对 U t U_t Ut​求期望得到动作价值函数动作价值函数的意义最优动作价值函数(Optimal Action-Value Function)如何理解 Q ∗ Q^* Q∗函数 二、价值学习的基本思想Deep Q-Network(DQN)DQN玩游戏的具…

智能手表该存什么音频和文本?场景化存储指南

文章目录 为什么需要“场景化存储”&#xff1f;智能手表的定位手机替代不了的场景碎片化的场景存储 音频篇&#xff1a;智能手表该存什么音乐和音频&#xff1f;运动场景通勤场景健康场景 文本篇&#xff1a;哪些文字信息值得放进手表&#xff1f;&#xff08;部分情况可使用图…

液态神经网络技术指南

一、引言 1.从传统神经网络到液态神经网络 神经网络作为深度学习的核心工具&#xff0c;在图像识别、自然语言处理、推荐系统等领域取得了巨大成功。尤其是卷积神经网络&#xff08;CNN&#xff09;、循环神经网络&#xff08;RNN&#xff09;、长短期记忆网络&#xff08;LS…

hive通过元数据库删除分区操作步骤

删除分区失败&#xff1a; alter table proj_60_finance.dwd_fm_ma_kpi_di_mm drop partition(year2025,month0-3,typeADJ); 1、查询分区的DB_ID、TBL_ID – 获取数据库ID-26110 SELECT DB_ID FROM DBS WHERE NAME ‘proj_60_finance’; – 获取表ID-307194 SELECT TBL_ID FR…

1990-2019年各地级市GDP数据

1990-2019年各地级市GDP数据 1、时间&#xff1a;1990-2019年 2、来源&#xff1a;城市年鉴 3、指标&#xff1a;行政区划代码、年份、省份、城市、经度、纬度、地区生产总值(万元) 4、范围&#xff1a;250地级市 5、指标解释&#xff1a;地区生产总值&#xff08;Gross R…

沧州铁狮子

又名“镇海吼”&#xff0c;是中国现存年代最久、形体最大的铸铁狮子&#xff0c;具有深厚的历史文化底蕴和独特的艺术价值。以下是关于沧州铁狮子的详细介绍&#xff1a; 历史背景 • 铸造年代&#xff1a;沧州铁狮子铸造于后周广顺三年&#xff08;953年&#xff09;&#…

《Java八股文の文艺复兴》第十一篇:量子永生架构——对象池的混沌边缘(终极试炼·完全体)

Tags: - Java高并发 - 量子架构 - 混沌工程 - 赛博修真 - 三体防御 目录&#xff1a; 卷首语&#xff1a;蝴蝶振翅引发的量子海啸 第一章&#xff1a;混沌初开——对象池的量子涅槃&#xff08;深度扩展&#xff09; 第二章&#xff1a;混沌计算——对象复活的降维打击&…

Java面试34-Kafka的零拷贝原理

在实际应用中&#xff0c;如果我们需要把磁盘中的某个文件内容发送到远程服务器上&#xff0c;那么它必须要经过几个拷贝的过程&#xff1a; 从磁盘中读取目标文件内容拷贝到内核缓冲区CPU控制器再把内核缓冲区的数据复制到用户空间的缓冲区在应用程序中&#xff0c;调用write…

TF-IDF忽略词序问题思考

自从开始做自然语言处理的业务&#xff0c;TF-IDF就是使用很频繁的文本特征技术&#xff0c;他的优点很多&#xff0c;比如&#xff1a;容易理解&#xff0c;不需要训练&#xff0c;提取效果好&#xff0c;可以给予大规模数据使用&#xff0c;总之用的很顺手&#xff0c;但是人…