ComfyUI - Tutorial: Integrating SAM2 + GroundingDINO into ComfyUI Workflows for Image and Video Processing

Follow me on CSDN: https://spike.blog.csdn.net/
Article URL: https://spike.blog.csdn.net/article/details/143359538

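Disclaimer: this article is based on personal knowledge and publicly available material and is intended for academic exchange only. Discussion is welcome; reposting is not permitted.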



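Combining SAM2 with GroundingDINO brings a clear improvement to image segmentation and object detection: SAM2 provides precise segmentation, while GroundingDINO strengthens detection, giving more accurate and fine-grained object recognition. In practice, the two working together improve results on complex image-processing tasks, speed up processing, and keep accuracy and stability high.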

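The 4 steps for deploying a node in ComfyUI (one of them optional):

  1. Get the node project: git clone it into ComfyUI/custom_nodes
  2. Install the dependencies: enter the project directory and run pip install -r requirements.txt
  3. (Optional) Download the models in advance and place them in the corresponding folders
  4. Restart the service and refresh the page; the node is then ready to run.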

Projects to download: ComfyUI-segment-anything-2, ComfyUI-Florence2, ComfyUI-KJNodes, ComfyUI-SAM2, ComfyUI-VideoHelperSuite

cd ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
git clone https://github.com/kijai/ComfyUI-segment-anything-2.git
git clone https://github.com/kijai/ComfyUI-Florence2.git
git clone https://github.com/kijai/ComfyUI-KJNodes.git
# v1.0 version, superseded by ComfyUI-SAM2
# git clone https://github.com/storyicon/comfyui_segment_anything
git clone https://github.com/neverbiasu/ComfyUI-SAM2.git
# run inside each cloned project directory
pip install -r requirements.txt
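The deployment steps and clone commands above can be sketched as a small Python planner. This is only an illustration: the helper names `repo_dir_name` and `plan_install` are hypothetical, the sketch assumes every listed project ships a requirements.txt, and it only builds the commands rather than running them.

```python
from pathlib import Path

# Repositories listed in this article.
NODE_REPOS = [
    "https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git",
    "https://github.com/kijai/ComfyUI-segment-anything-2.git",
    "https://github.com/kijai/ComfyUI-Florence2.git",
    "https://github.com/kijai/ComfyUI-KJNodes.git",
    "https://github.com/neverbiasu/ComfyUI-SAM2.git",
]


def repo_dir_name(url: str) -> str:
    """Directory name that `git clone` creates by default for a URL."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def plan_install(comfy_root: str, repos=NODE_REPOS):
    """Return (working_dir, command) pairs; nothing is executed here."""
    nodes_dir = Path(comfy_root) / "custom_nodes"
    plan = []
    for url in repos:
        # Step 1: clone into custom_nodes.
        plan.append((nodes_dir, ["git", "clone", url]))
        # Step 2: install dependencies from inside the cloned project.
        plan.append((nodes_dir / repo_dir_name(url),
                     ["pip", "install", "-r", "requirements.txt"]))
    return plan


if __name__ == "__main__":
    for cwd, cmd in plan_install("ComfyUI"):
        print(f"(cd {cwd} && {' '.join(cmd)})")
```

Each pair could be passed to `subprocess.run(cmd, cwd=cwd, check=True)` if you prefer to execute the plan from Python; the pip call should use the same Python environment that runs ComfyUI.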

1. ComfyUI-segment-anything-2

Node: ComfyUI-segment-anything-2

Models to prepare:

  1. SAM2 model, in ComfyUI/models/sam2
  2. Florence-2 model, in ComfyUI/models/LLM; used in place of a detection model such as GroundingDINO, see ComfyUI-Florence2
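If you download the models ahead of time, a quick check that the two folders are populated can save a failed first run. The sketch below is an assumption-laden helper, not part of any of the node packages: `missing_model_dirs` is a hypothetical name, and it only tests that each folder exists and is non-empty, not that the files are valid.

```python
from pathlib import Path

# Folders named in this article, relative to the ComfyUI root.
EXPECTED_MODEL_DIRS = ["models/sam2", "models/LLM"]


def missing_model_dirs(comfy_root, expected=EXPECTED_MODEL_DIRS):
    """Return the expected folders that are absent or empty."""
    root = Path(comfy_root)
    missing = []
    for rel in expected:
        folder = root / rel
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_model_dirs("ComfyUI"):
        print(f"not pre-downloaded: {rel}")
```

A folder reported here is not necessarily an error, since pre-downloading is optional in the steps above.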

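Video workflows are supported, but the overall segmentation quality is mediocre, and point-based prompting is also only fair.

Required nodes: ComfyUI-Florence2, ComfyUI-KJNodes, ComfyUI-VideoHelperSuite

Test examples are at: https://github.com/kijai/ComfyUI-segment-anything-2/tree/main/examples

For example: points_segment_video_example.json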

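  • Load Video (Upload): loads the video
  • Points Editor: point-editing node; use Shift + left/right click to place positive and negative points.
  • (Down)Load SAM2Model: downloads or loads the model, sam2.1_hiera_large-fp16.safetensors; select fp16
  • Sam2Segmentation: segmentation node that accepts the positive and negative points. Note: it must be re-added, because the default workflow has a problem.
  • Preview Animation: displays the animated result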
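The positive and negative points travel between nodes as JSON strings. Judging only from the coordinate strings saved in Workflow1 in this article, each string is a list of `{"x": ..., "y": ...}` objects; the sketch below builds strings in that shape. The helper name `points_to_coords` is hypothetical, and whether the node accepts other layouts has not been checked here.

```python
import json


def points_to_coords(points):
    """Serialize (x, y) pairs as a JSON list of {"x": ..., "y": ...} objects."""
    return json.dumps([{"x": x, "y": y} for x, y in points])


# The same points that are stored in Workflow1.
positive = points_to_coords([(256, 256), (237, 463), (321, 138)])
negative = points_to_coords([(0, 0), (426, 242)])
print(positive)
print(negative)
```

The coordinates are pixel positions in the editor canvas, which is 512 x 512 in the saved workflow.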

The resulting graph is given as Workflow1 at the end of this article.

2. ComfyUI-SAM2

Node: ComfyUI-SAM2

Models to prepare: models/bert-base-uncased, models/grounding-dino, models/sams

GroundingDINO + SAM2: only 3 nodes and a fairly narrow feature set, but detection quality is good.

  • GroundingDinoModelLoader (segment anything2): loads the DINO model
  • SAM2ModelLoader (segment anything2): loads the SAM2 model
  • GroundingDinoSAM2Segment (segment anything2): combines the two; it has only 2 parameters, the prompt and the threshold

To test the model: the prompt supports multiple terms, for example person and book, separated by commas. The graph is given as Workflow2 below.



Workflow1:

{"last_node_id":117,"last_link_id":62,"nodes":[{"id":113,"type":"Note","pos":{"0":56,"1":-415},"size":{"0":309.1065368652344,"1":177.01339721679688},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[],"properties":{"text":""},"widgets_values":["To get the image for the points editor, first create a canvas, then either input image/video (first frame is taken), or copy/paste an image while the node is selected, or drag&drop an image.\n\nWARNING: the image WILL BE SAVED to the node in compressed format, including when saving the workflow!\n\nClick the ? on the node for more information"],"color":"#432","bgcolor":"#653"},{"id":116,"type":"Reroute","pos":{"0":1066,"1":115},"size":[75,26],"flags":{},"order":5,"mode":0,"inputs":[{"name":"","type":"*","link":60,"label":"","widget":{"name":"value"}}],"outputs":[{"name":"","type":"STRING","links":[61],"slot_index":0,"label":""}],"properties":{"showOutputText":false,"horizontal":false}},{"id":112,"type":"ShowText|pysssss","pos":{"0":1166,"1":-429},"size":{"0":315,"1":100},"flags":{},"order":4,"mode":0,"inputs":[{"name":"text","type":"STRING","link":53,"widget":{"name":"text"},"label":"text"}],"outputs":[{"name":"STRING","type":"STRING","links":null,"shape":6,"label":"STRING"}],"properties":{"Node name for S&R":"ShowText|pysssss"},"widgets_values":["","[{\"x\": 256, \"y\": 256}, {\"x\": 237, \"y\": 463}, {\"x\": 321, \"y\": 138}]"]},{"id":117,"type":"ShowText|pysssss","pos":{"0":1163,"1":-277},"size":{"0":315,"1":76},"flags":{},"order":6,"mode":0,"inputs":[{"name":"text","type":"STRING","link":62,"widget":{"name":"text"},"label":"text"}],"outputs":[{"name":"STRING","type":"STRING","links":null,"shape":6,"label":"STRING"}],"properties":{"Node name for S&R":"ShowText|pysssss"},"widgets_values":["","[{\"x\": 0, \"y\": 0}, {\"x\": 426, \"y\": 242}]"]},{"id":102,"type":"VHS_LoadVideo","pos":{"0":14,"1":-59},"size":[363.24957275390625,619.2495727539062],"flags":{},"order":1,"mode":0,"inputs":[{"name":"meta_batch","type":"VHS_BatchManager","link":null,"shape":7,"label":"meta_batch"},{"name":"vae","type":"VAE","link":null,"shape":7,"label":"vae"}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[43,52,57],"slot_index":0,"shape":3,"label":"IMAGE"},{"name":"frame_count","type":"INT","links":null,"shape":3,"label":"frame_count"},{"name":"audio","type":"AUDIO","links":null,"shape":3,"label":"audio"},{"name":"video_info","type":"VHS_VIDEOINFO","links":null,"shape":3,"label":"video_info"}],"properties":{"Node name for S&R":"VHS_LoadVideo"},"widgets_values":{"video":"2851_1708350515(原视频).mp4","force_rate":0,"force_size":"512x?","custom_width":512,"custom_height":512,"frame_load_cap":16,"skip_first_frames":0,"select_every_nth":3,"choose video to upload":"image","videopreview":{"hidden":false,"paused":false,"params":{"frame_load_cap":16,"skip_first_frames":0,"force_rate":0,"filename":"2851_1708350515(原视频).mp4","type":"input","format":"video/mp4","select_every_nth":3,"force_size":"512x?"}}}},{"id":114,"type":"PointsEditor","pos":{"0":439,"1":-477},"size":[557,812],"flags":{"collapsed":false},"order":3,"mode":0,"inputs":[{"name":"bg_image","type":"IMAGE","link":52,"shape":7,"label":"bg_image"}],"outputs":[{"name":"positive_coords","type":"STRING","links":[53,55],"slot_index":0,"shape":3,"label":"positive_coords"},{"name":"negative_coords","type":"STRING","links":[60,62],"slot_index":1,"shape":3,"label":"negative_coords"},{"name":"bbox","type":"BBOX","links":null,"slot_index":2,"shape":3,"label":"bbox"},{"name":"bbox_mask","type":"MASK","links":null,"shape":3,"label":"bbox_mask"},{"name":"cropped_image","type":"IMAGE","links":null,"shape":3,"label":"cropped_image"}],"properties":{"Node name for S&R":"PointsEditor","imgData":{"name":"bg_image","base64":[""]},"points":"PointsEditor","neg_points":"PointsEditor"},"widgets_values":["{\"positive\":[{\"x\":256,\"y\":256},{\"x\":237,\"y\":463},{\"x\":321,\"y\":138}],\"negative\":[{\"x\":0,\"y\":0},{\"x\":426,\"y\":242}]}","[{\"x\":256,\"y\":256},{\"x\":237,\"y\":463},{\"x\":321,\"y\":138}]","[{\"x\":0,\"y\":0},{\"x\":426,\"y\":242}]","[{}]","[{}]","xyxy",512,512,false,null,null,null]},{"id":106,"type":"DownloadAndLoadSAM2Model","pos":{"0":459,"1":393},"size":{"0":315,"1":130},"flags":{},"order":2,"mode":0,"inputs":[],"outputs":[{"name":"sam2_model","type":"SAM2MODEL","links":[56],"shape":3,"label":"sam2_model"}],"properties":{"Node name for S&R":"DownloadAndLoadSAM2Model"},"widgets_values":["sam2.1_hiera_large.safetensors","video","cuda","fp16"]},{"id":115,"type":"Sam2Segmentation","pos":{"0":898,"1":393},"size":{"0":315,"1":190},"flags":{},"order":7,"mode":0,"inputs":[{"name":"sam2_model","type":"SAM2MODEL","link":56,"label":"sam2_model"},{"name":"image","type":"IMAGE","link":57,"label":"image"},{"name":"bboxes","type":"BBOX","link":null,"shape":7,"label":"bboxes"},{"name":"mask","type":"MASK","link":null,"shape":7,"label":"mask"},{"name":"coordinates_positive","type":"STRING","link":55,"widget":{"name":"coordinates_positive"},"shape":7,"label":"coordinates_positive"},{"name":"coordinates_negative","type":"STRING","link":61,"widget":{"name":"coordinates_negative"},"shape":7,"label":"coordinates_negative"}],"outputs":[{"name":"mask","type":"MASK","links":[59],"slot_index":0,"label":"mask"}],"properties":{"Node name for S&R":"Sam2Segmentation"},"widgets_values":[true,"","",false]},{"id":107,"type":"PreviewAnimation","pos":{"0":1340,"1":-59},"size":{"0":514.92431640625,"1":577.3973999023438},"flags":{},"order":8,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":43,"shape":7,"label":"images"},{"name":"masks","type":"MASK","link":59,"slot_index":1,"shape":7,"label":"masks"}],"outputs":[],"title":"Preview Animation 16x512x512","properties":{"Node name for S&R":"PreviewAnimation"},"widgets_values":[16,null]}],"links":[[43,102,0,107,0,"IMAGE"],[52,102,0,114,0,"IMAGE"],[53,114,0,112,0,"STRING"],[55,114,0,115,4,"STRING"],[56,106,0,115,0,"SAM2MODEL"],[57,102,0,115,1,"IMAGE"],[59,115,0,107,1,"MASK"],[60,114,1,116,0,"*"],[61,116,0,115,5,"STRING"],[62,114,1,117,0,"STRING"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.5131581182307067,"offset":[396.07947776523474,760.0658700441401]}},"version":0.4}
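A workflow file like the one above is easier to read once it is reduced to its node order and connections. The sketch below does that for a trimmed stand-in of Workflow1. It relies on two things read off the JSON itself: each node carries an `order` field, and each link entry is laid out as `[link_id, source_node, source_slot, target_node, target_slot, data_type]`. The function name `summarize_workflow` is hypothetical.

```python
import json


def summarize_workflow(workflow):
    """Return node types sorted by `order`, and links as (src, dst, type)."""
    nodes = sorted(workflow["nodes"], key=lambda n: n["order"])
    type_by_id = {n["id"]: n["type"] for n in nodes}
    # Each link: [link_id, src_node, src_slot, dst_node, dst_slot, data_type].
    links = [(type_by_id[l[1]], type_by_id[l[3]], l[5])
             for l in workflow["links"]]
    return [n["type"] for n in nodes], links


# Trimmed-down stand-in for Workflow1 (most nodes and fields omitted).
sample = json.loads("""
{"nodes": [
  {"id": 115, "type": "Sam2Segmentation", "order": 7},
  {"id": 106, "type": "DownloadAndLoadSAM2Model", "order": 2},
  {"id": 102, "type": "VHS_LoadVideo", "order": 1}],
 "links": [[56, 106, 0, 115, 0, "SAM2MODEL"],
           [57, 102, 0, 115, 1, "IMAGE"]]}
""")

order, links = summarize_workflow(sample)
print(order)
for src, dst, kind in links:
    print(f"{src} -> {dst} ({kind})")
```

To run it on the full workflow, save the JSON to a file and pass the result of `json.load` instead of `sample`.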

Workflow2:

{"last_node_id":8,"last_link_id":7,"nodes":[{"id":2,"type":"SAM2ModelLoader (segment anything2)","pos":{"0":109,"1":303},"size":{"0":441,"1":58},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"SAM2_MODEL","type":"SAM2_MODEL","links":[1],"slot_index":0,"label":"SAM2_MODEL"}],"properties":{"Node name for S&R":"SAM2ModelLoader (segment anything2)"},"widgets_values":["sam2_1_hiera_large.pt"]},{"id":3,"type":"LoadImage","pos":{"0":110,"1":427},"size":{"0":315,"1":314},"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[3],"slot_index":0,"label":"IMAGE"},{"name":"MASK","type":"MASK","links":null,"label":"MASK"}],"properties":{"Node name for S&R":"LoadImage"},"widgets_values":["IMG_5539.JPG","image"]},{"id":7,"type":"MaskPreview+","pos":{"0":921,"1":433},"size":[210,246],"flags":{},"order":5,"mode":0,"inputs":[{"name":"mask","type":"MASK","link":7,"label":"mask"}],"outputs":[],"properties":{"Node name for S&R":"MaskPreview+"}},{"id":1,"type":"GroundingDinoModelLoader (segment anything2)","pos":{"0":104,"1":186},"size":{"0":554.4000244140625,"1":58},"flags":{},"order":2,"mode":0,"inputs":[],"outputs":[{"name":"GROUNDING_DINO_MODEL","type":"GROUNDING_DINO_MODEL","links":[2],"slot_index":0,"label":"GROUNDING_DINO_MODEL"}],"properties":{"Node name for S&R":"GroundingDinoModelLoader (segment anything2)"},"widgets_values":["GroundingDINO_SwinB (938MB)"]},{"id":6,"type":"PreviewImage","pos":{"0":575,"1":433},"size":[308.81640625,299.23828125],"flags":{},"order":4,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":6,"label":"images"}],"outputs":[],"properties":{"Node name for S&R":"PreviewImage"}},{"id":4,"type":"GroundingDinoSAM2Segment (segment anything2)","pos":{"0":683,"1":183},"size":{"0":554.4000244140625,"1":122},"flags":{},"order":3,"mode":0,"inputs":[{"name":"sam_model","type":"SAM2_MODEL","link":1,"label":"sam_model"},{"name":"grounding_dino_model","type":"GROUNDING_DINO_MODEL","link":2,"label":"grounding_dino_model"},{"name":"image","type":"IMAGE","link":3,"label":"image"}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[6],"slot_index":0,"label":"IMAGE"},{"name":"MASK","type":"MASK","links":[7],"slot_index":1,"label":"MASK"}],"properties":{"Node name for S&R":"GroundingDinoSAM2Segment (segment anything2)"},"widgets_values":["person,book",0.3]}],"links":[[1,2,0,4,0,"SAM2_MODEL"],[2,1,0,4,1,"GROUNDING_DINO_MODEL"],[3,3,0,4,2,"IMAGE"],[6,4,0,6,0,"IMAGE"],[7,4,1,7,0,"MASK"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.8264462809917354,"offset":[-12.505597656249961,-82.9064101562497]}},"version":0.4}
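In Workflow2 the prompt and threshold are stored as the two entries of `widgets_values` on the GroundingDinoSAM2Segment node (`["person,book", 0.3]`). A minimal sketch for changing them programmatically before loading the file follows; `set_prompt` is a hypothetical helper, the sample is a trimmed stand-in for the full JSON, and the two-element layout is taken only from the workflow shown here.

```python
import copy

SEGMENT_NODE = "GroundingDinoSAM2Segment (segment anything2)"


def set_prompt(workflow, prompt, threshold):
    """Return a copy of the workflow with a new prompt and threshold."""
    updated = copy.deepcopy(workflow)
    for node in updated["nodes"]:
        if node["type"] == SEGMENT_NODE:
            # widgets_values is [prompt, threshold] in the JSON above.
            node["widgets_values"] = [prompt, threshold]
    return updated


# Trimmed-down stand-in for Workflow2 (other nodes and fields omitted).
sample = {"nodes": [{"id": 4, "type": SEGMENT_NODE,
                     "widgets_values": ["person,book", 0.3]}]}

# Multiple terms are joined with commas, as in the original prompt.
updated = set_prompt(sample, ",".join(["person", "book", "cup"]), 0.35)
print(updated["nodes"][0]["widgets_values"])
```

The function copies the workflow first, so the original dictionary can still be saved unchanged.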
