Deploying Large Language Models with vLLM

1 Introduction

vLLM is recent work on LLM inference from LMSYS at UC Berkeley (yes, the same group that created Vicuna). Its biggest highlight is the PagedAttention technique, which, combined with continuous batching, greatly improves throughput and memory usage for real-time LLM serving.

vLLM GitHub repository: https://github.com/vllm-project/vllm

1.1 Installation

Installation command:

pip3 install vllm
# vllm==0.2.7
# transformers==4.36.2
# requests==2.31.0
# gradio==4.14.0

2 Usage

2.1 Offline Batched Inference

Offline batched inference: use vLLM to generate answers for a list of input prompts.

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "6,7"

from vllm import LLM, SamplingParams

llm = LLM('/data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf')

INFO 01-18 08:13:26 llm_engine.py:70] Initializing an LLM engine with config: model='/data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf', tokenizer='/data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=4096, download_dir=None, load_format=auto, tensor_parallel_size=1, quantization=None, enforce_eager=False, seed=0)
INFO 01-18 08:13:37 llm_engine.py:275] # GPU blocks: 3418, # CPU blocks: 327
INFO 01-18 08:13:39 model_runner.py:501] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 01-18 08:13:39 model_runner.py:505] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode.
INFO 01-18 08:13:44 model_runner.py:547] Graph capturing finished in 5 secs.

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Processed prompts: 100%|██████████| 4/4 [00:00<00:00, 11.76it/s]
Prompt: 'Hello, my name is', Generated text: " Sherry and I'm a stay at home mom of three beautiful children."
Prompt: 'The president of the United States is', Generated text: ' one of the most powerful people in the world, and yet, many people do'
Prompt: 'The capital of France is', Generated text: ' Paris. This is a fact that is well known to most people, but there'
Prompt: 'The future of AI is', Generated text: ' likely to be shaped by a combination of technological advancements and soci'
2.2 API Server

vLLM can be deployed as an API service, with FastAPI as the web framework. The API service uses the AsyncLLMEngine class to support asynchronous calls.

Run python -m vllm.entrypoints.api_server --help to see the supported script arguments.

Command to start the API service:

CUDA_VISIBLE_DEVICES=6,7 python -m vllm.entrypoints.api_server --model /data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf

Input:

curl http://localhost:8000/generate \
    -d '{
        "prompt": "San Francisco is a",
        "use_beam_search": true,
        "n": 4,
        "temperature": 0
    }'

Output:

{"text": ["San Francisco is a city of neighborhoods, each with its own unique character and charm. Here are","San Francisco is a city in California that is known for its iconic landmarks, vibrant","San Francisco is a city of neighborhoods, each with its own unique character and charm. From the","San Francisco is a city in California that is known for its vibrant culture, diverse neighborhoods"]
}
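The same call can be made from Python. A minimal sketch with the requests library; it assumes the api_server started above is listening on localhost:8000 and mirrors the request/response shape of the curl example:

import requests

# Query vLLM's /generate endpoint with the same parameters as the curl call above.
response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "San Francisco is a",
        "use_beam_search": True,
        "n": 4,
        "temperature": 0,
    },
)
for text in response.json()["text"]:  # "text" holds the n generated candidates
    print(text)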
2.3 OpenAI-Compatible API Server

Startup command:

CUDA_VISIBLE_DEVICES=6,7 python -m vllm.entrypoints.openai.api_server --model /data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf --served-model-name llama-2-13b-chat-hf

A chat template (chat-template) can also be specified, as shown in the sketch below.
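For example (a sketch; the template path here is purely illustrative, while the --chat-template flag itself appears again in section 3.3):

CUDA_VISIBLE_DEVICES=6,7 python -m vllm.entrypoints.openai.api_server --model /data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf --served-model-name llama-2-13b-chat-hf --chat-template ./my_template.jinja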

2.3.1 Listing Models
curl http://localhost:8000/v1/models

Output:

{"object": "list","data": [{"id": "llama-2-13b-chat-hf","object": "model","created": 1705568412,"owned_by": "vllm","root": "llama-2-13b-chat-hf","parent": null,"permission": [{"id": "modelperm-d7ca4aa0eee44eb4a50e37eba06e520d","object": "model_permission","created": 1705568412,"allow_create_engine": false,"allow_sampling": true,"allow_logprobs": true,"allow_search_indices": false,"allow_view": true,"allow_fine_tuning": false,"organization": "*","group": null,"is_blocking": false}]}]
}
2.3.2 Text Completion

Input:

curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama-2-13b-chat-hf",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
    }' | jq .

Output:

{"id": "cmpl-d1ba6b9f1551443e87d80258a3bedad1","object": "text_completion","created": 19687093,"model": "llama-2-13b-chat-hf","choices": [{"index": 0,"text": " city that is known for its v","logprobs": null,"finish_reason": "length"}],"usage": {"prompt_tokens": 5,"total_tokens": 12,"completion_tokens": 7}
}
2.3.3 Chat Completion

Input:

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama-2-13b-chat-hf",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the world series in 2020?"}
        ]
    }' | jq .

Output:

{"id": "cmpl-94fc8bc170be4c29982a08aa6f01e298","object": "chat.completion","created": 19687353,"model": "llama-2-13b-chat-hf","choices": [{"index": 0,"message": {"role": "assistant","content": "  Hello! I'm happy to help! The Washington Nationals won the World Series in 2020. They defeated the Houston Astros in Game 7 of the series, which was played on October 30, 2020."},"finish_reason": "stop"}],"usage": {"prompt_tokens": 40,"total_tokens": 95,"completion_tokens": 55}
}
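The same chat completion can also be issued from Python. A minimal sketch using the pre-1.0 openai package (consistent with the library versions of this article's era; "EMPTY" is a placeholder key, since vLLM does not validate it by default):

import openai

openai.api_key = "EMPTY"  # placeholder; vLLM does not check the key by default
openai.api_base = "http://localhost:8000/v1"  # point the client at the vLLM server

completion = openai.ChatCompletion.create(
    model="llama-2-13b-chat-hf",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
    ],
)
print(completion.choices[0].message.content)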

3 vLLM in Practice

3.1 Offline Inference

from vllm import LLM

prompts = ["Hello, my name is", "The capital of France is"]  # Sample prompts.
llm = LLM(model="lmsys/vicuna-7b-v1.3")  # Create an LLM.
outputs = llm.generate(prompts)  # Generate texts from the prompts.
for output in outputs:  # Print each prompt with its generated text.
    print(output.prompt, output.outputs[0].text)
3.2 Starting the Online Service
python -m vllm.entrypoints.openai.api_server --model lmsys/vicuna-7b-v1.3
3.2.1 Calling the Online Service
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "lmsys/vicuna-7b-v1.3",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
    }'

3.3 Simple Q&A with LLMs

vLLM does not yet support serving multiple models from a single deployment, but multiple models can be served by running one deployment per model. Here we use llama-2-13b-chat-hf and Baichuan2-13B-Chat.

The deployment commands are as follows:

CUDA_VISIBLE_DEVICES=6 python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 50072 --model /data-ai/model/llama2/llama2_hf/Llama-2-13b-chat-hf --served-model-name llama-2-13b-chat-hf

CUDA_VISIBLE_DEVICES=7 python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 50073 --model /data-ai/model/baichuan2/Baichuan2-13B-Chat --served-model-name Baichuan2-13B-Chat --trust-remote-code --chat-template /data-ai/usr/code/template_baichuan.jinja

Here, template_baichuan.jinja (the chat template) is the file of the same name from the examples folder of the official vLLM GitHub repository.

We use Gradio to build a page implementing LLM Q&A. The Python code is as follows:

# -*- coding: utf-8 -*-
# @place: Pudong, Shanghai
# @file: gradio_for_llm.py
# @time: 2024/1/19 13:30
import gradio as gr
import requests

models = ['llama-2-13b-chat-hf', 'Baichuan2-13B-Chat']


def completion(question):
    model_url_dict = {
        models[0]: "http://localhost:50072/v1/chat/completions",
        models[1]: "http://localhost:50073/v1/chat/completions",
    }
    answers = []
    for model in models:
        headers = {'Content-Type': 'application/json'}
        json_data = {
            'model': model,
            'messages': [
                {'role': 'system', 'content': 'You are a helpful assistant.'},
                {'role': 'user', 'content': question},
            ],
        }
        response = requests.post(model_url_dict[model], headers=headers, json=json_data)
        answer = response.json()["choices"][0]["message"]["content"]
        answers.append(answer)
    return answers


demo = gr.Interface(
    fn=completion,
    inputs=gr.Textbox(lines=5, placeholder="input your question", label="question"),
    outputs=[
        gr.Textbox(lines=5, placeholder="answer", label=models[0]),
        gr.Textbox(lines=5, placeholder="answer", label=models[1]),
    ]
)

demo.launch(server_name='0.0.0.0', share=True)
3.4 LLM Output TPS

One metric for evaluating LLM deployment tools is TPS (Tokens Per Second), i.e., the number of tokens the model outputs per second.
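As a quick illustration, the TPS of a single request can be derived from the usage field of the OpenAI-style response (a minimal sketch; it assumes the llama-2-13b-chat-hf service from section 3.3 is running on port 50072):

import time

import requests

start = time.time()
resp = requests.post(
    "http://localhost:50072/v1/chat/completions",
    json={
        "model": "llama-2-13b-chat-hf",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
elapsed = time.time() - start

# completion_tokens counts only the generated (output) tokens.
completion_tokens = resp.json()["usage"]["completion_tokens"]
print(f"TPS = {completion_tokens / elapsed:.2f} tokens/s")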

We take llama-2-13b-chat-hf as the test model; the test dataset uses the question set from https://modal.com/docs/examples/vllm_inference, 59 questions in total.

The Python code is as follows:

# -*- coding: utf-8 -*-
# @place: Pudong, Shanghai
# @file: gradio_for_throughput.py
# @time: 2024/1/19 16:05
import gradio as gr
import requests
import time

questions = [
    # Coding questions
    "Implement a Python function to compute the Fibonacci numbers.",
    "Write a Rust function that performs binary exponentiation.",
    "How do I allocate memory in C?",
    "What are the differences between Javascript and Python?",
    "How do I find invalid indices in Postgres?",
    "How can you implement a LRU (Least Recently Used) cache in Python?",
    "What approach would you use to detect and prevent race conditions in a multithreaded application?",
    "Can you explain how a decision tree algorithm works in machine learning?",
    "How would you design a simple key-value store database from scratch?",
    "How do you handle deadlock situations in concurrent programming?",
    "What is the logic behind the A* search algorithm, and where is it used?",
    "How can you design an efficient autocomplete system?",
    "What approach would you take to design a secure session management system in a web application?",
    "How would you handle collision in a hash table?",
    "How can you implement a load balancer for a distributed system?",
    # Literature
    "What is the fable involving a fox and grapes?",
    "Write a story in the style of James Joyce about a trip to the Australian outback in 2083, to see robots in the beautiful desert.",
    "Who does Harry turn into a balloon?",
    "Write a tale about a time-traveling historian who's determined to witness the most significant events in human history.",
    "Describe a day in the life of a secret agent who's also a full-time parent.",
    "Create a story about a detective who can communicate with animals.",
    "What is the most unusual thing about living in a city floating in the clouds?",
    "In a world where dreams are shared, what happens when a nightmare invades a peaceful dream?",
    "Describe the adventure of a lifetime for a group of friends who found a map leading to a parallel universe.",
    "Tell a story about a musician who discovers that their music has magical powers.",
    "In a world where people age backwards, describe the life of a 5-year-old man.",
    "Create a tale about a painter whose artwork comes to life every night.",
    "What happens when a poet's verses start to predict future events?",
    "Imagine a world where books can talk. How does a librarian handle them?",
    "Tell a story about an astronaut who discovered a planet populated by plants.",
    "Describe the journey of a letter traveling through the most sophisticated postal service ever.",
    "Write a tale about a chef whose food can evoke memories from the eater's past.",
    # History
    "What were the major contributing factors to the fall of the Roman Empire?",
    "How did the invention of the printing press revolutionize European society?",
    "What are the effects of quantitative easing?",
    "How did the Greek philosophers influence economic thought in the ancient world?",
    "What were the economic and philosophical factors that led to the fall of the Soviet Union?",
    "How did decolonization in the 20th century change the geopolitical map?",
    "What was the influence of the Khmer Empire on Southeast Asia's history and culture?",
    # Thoughtfulness
    "Describe the city of the future, considering advances in technology, environmental changes, and societal shifts.",
    "In a dystopian future where water is the most valuable commodity, how would society function?",
    "If a scientist discovers immortality, how could this impact society, economy, and the environment?",
    "What could be the potential implications of contact with an advanced alien civilization?",
    # Math
    "What is the product of 9 and 8?",
    "If a train travels 120 kilometers in 2 hours, what is its average speed?",
    "Think through this step by step. If the sequence a_n is defined by a_1 = 3, a_2 = 5, and a_n = a_(n-1) + a_(n-2) for n > 2, find a_6.",
    "Think through this step by step. Calculate the sum of an arithmetic series with first term 3, last term 35, and total terms 11.",
    "Think through this step by step. What is the area of a triangle with vertices at the points (1,2), (3,-4), and (-2,5)?",
    "Think through this step by step. Solve the following system of linear equations: 3x + 2y = 14, 5x - y = 15.",
    # Facts
    "Who was Emperor Norton I, and what was his significance in San Francisco's history?",
    "What is the Voynich manuscript, and why has it perplexed scholars for centuries?",
    "What was Project A119 and what were its objectives?",
    "What is the 'Dyatlov Pass incident' and why does it remain a mystery?",
    "What is the 'Emu War' that took place in Australia in the 1930s?",
    "What is the 'Phantom Time Hypothesis' proposed by Heribert Illig?",
    "Who was the 'Green Children of Woolpit' as per 12th-century English legend?",
    "What are 'zombie stars' in the context of astronomy?",
    "Who were the 'Dog-Headed Saint' and the 'Lion-Faced Saint' in medieval Christian traditions?",
    "What is the story of the 'Globsters', unidentified organic masses washed up on the shores?",
]


def chat_completion(question):
    url = "http://localhost:50072/v1/chat/completions"
    headers = {'Content-Type': 'application/json'}
    json_data = {
        'model': "llama-2-13b-chat-hf",
        'messages': [
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            {'role': 'user', 'content': question},
        ],
    }
    response = requests.post(url, headers=headers, json=json_data)
    answer = response.json()["choices"][0]["message"]["content"]
    output_tokens = response.json()["usage"]["completion_tokens"]
    return answer, output_tokens


def slowly_reverse(texts, progress=gr.Progress()):
    total_token_cnt = 0
    progress(0, desc="starting...")
    q_list = texts.split('\n')
    s_time = time.time()
    data_list = []
    for q in progress.tqdm(q_list, desc="generating..."):
        answer, output_token = chat_completion(q)
        total_token_cnt += output_token
        data_list.append([q, answer[:50], total_token_cnt / (time.time() - s_time)])
        print(f"{total_token_cnt / (time.time() - s_time)} TPS")
    return data_list


demo = gr.Interface(
    fn=slowly_reverse,
    # Custom input textbox
    inputs=gr.Textbox(value='\n'.join(questions), label="questions"),
    # Output component
    outputs=gr.DataFrame(label='Table', headers=['question', 'answer', 'TPS'], interactive=True, wrap=True)
)

demo.queue().launch(server_name='0.0.0.0', share=True)

4 vLLM Offline Inference Flow

4.1 Create sampling_params

A sampling-parameter object sampling_params is created from the user's settings. When only temperature=0.8 and top_p=0.95 are specified, the remaining defaults are as follows:

SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0,
               temperature=0.8, top_p=0.95, top_k=-1, use_beam_search=False,
               stop=[], ignore_eos=False, max_tokens=16, logprobs=None)

4.2 Create an LLM

In the constructor of the LLM class, an EngineArgs object engine_args is first created:

EngineArgs(model='/bigdata/shared/models/huggingface/opt-125m',
           tokenizer='/bigdata/shared/models/huggingface/opt-125m',
           tokenizer_mode='auto', trust_remote_code=False, download_dir=None,
           use_np_weights=False, use_dummy_weights=False, dtype='auto', seed=0,
           worker_use_ray=False, pipeline_parallel_size=1, tensor_parallel_size=1,
           block_size=16, swap_space=4, gpu_memory_utilization=0.9,
           max_num_batched_tokens=2560, max_num_seqs=256, disable_log_stats=True,
           quant_mode=None)

Then llm_engine, the core member of the LLM class, is constructed from engine_args, and finally an internal request counter request_counter is added:

self.llm_engine = LLMEngine.from_engine_args(engine_args)
self.request_counter = Counter()

4.3 Generate

The processing in LLM.generate consists of two core steps.

The first step calls LLM._add_request, which adds the user's requests to the request list via LLM.llm_engine.add_request. Once they are added, the waiting list LLM.llm_engine.scheduler.waiting contains the following:

[
    SequenceGroup(request_id=0, sampling_params=SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, temperature=0.8, top_p=0.95, top_k=-1, use_beam_search=False, stop=[], ignore_eos=False, max_tokens=16, logprobs=None), num_seqs=1),
    SequenceGroup(request_id=1, sampling_params=SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, temperature=0.8, top_p=0.95, top_k=-1, use_beam_search=False, stop=[], ignore_eos=False, max_tokens=16, logprobs=None), num_seqs=1),
    SequenceGroup(request_id=2, sampling_params=SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, temperature=0.8, top_p=0.95, top_k=-1, use_beam_search=False, stop=[], ignore_eos=False, max_tokens=16, logprobs=None), num_seqs=1),
    SequenceGroup(request_id=3, sampling_params=SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, temperature=0.8, top_p=0.95, top_k=-1, use_beam_search=False, stop=[], ignore_eos=False, max_tokens=16, logprobs=None), num_seqs=1)
]

The second step calls LLM._run_engine, which drives LLM.llm_engine.step() and from there dispatches into the LLM.llm_engine._run_workers function for processing.

Throughout LLM.generate, the LLMEngine, Scheduler, and Worker cooperate: the LLMEngine is in overall control, the Scheduler handles scheduling, and the Workers do the execution.
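A condensed sketch of that loop (paraphrasing, not quoting, the vLLM 0.2.x sources; has_unfinished_requests() and step() are the engine's entry points, the rest is simplified):

# Simplified view of what LLM._run_engine does.
outputs = []
while llm.llm_engine.has_unfinished_requests():
    # One engine step: the Scheduler decides which sequence groups run in this
    # iteration, the Workers execute one forward pass, and any requests that
    # finished are returned as RequestOutput objects.
    step_outputs = llm.llm_engine.step()
    for output in step_outputs:
        if output.finished:
            outputs.append(output)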

4.4 vLLM Performance Testing

Inference systems usually need to be deployed as online services, so quantifying metrics such as service stability, latency, and throughput is critical. The benchmarks directory of the vLLM project provides test scripts for such experiments.

First, start the service. Unlike in the earlier section, the script does not support the OpenAI-style interface:

python -m vllm.entrypoints.api_server --model /mlx/users/xingzheng.daniel/playground/model/chinese-alpaca-2-7b

Then run the script to get the following output:

(torch2) ➜  benchmarks git:(main) python3 benchmark_serving.py --dataset ShareGPT_V3_unfiltered_cleaned_split.json --tokenizer  /mlx/users/xingzheng.daniel/playground/model/chinese-alpaca-2-7b --request-rate 40
Namespace(backend='vllm', host='localhost', port=8000, dataset='ShareGPT_V3_unfiltered_cleaned_split.json', tokenizer='/mlx/users/xingzheng.daniel/playground/model/chinese-alpaca-2-7b', best_of=1, use_beam_search=False, num_prompts=1000, request_rate=40.0, seed=0, trust_remote_code=False)
Total time: 165.50 s
Throughput: 6.04 requests/s
Average latency: 77.68 s
Average latency per token: 0.27 s
Average latency per output token: 1.03 s
Total output tokens / total time: 1348.35 tokens/s

In short, the script sends N requests via asynchronous IO and records, for each individual request, its latency, prompt token count, and output token count, then computes the summary statistics from these measurements.
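A condensed sketch of that pattern, targeting the /generate endpoint started in section 2.2 (assumes aiohttp is installed; the benchmark script itself uses its own request machinery):

import asyncio
import time

import aiohttp


async def send_request(session, prompt):
    """Send one request and measure its end-to-end latency."""
    start = time.time()
    async with session.post(
        "http://localhost:8000/generate",
        json={"prompt": prompt, "max_tokens": 128},
    ) as resp:
        data = await resp.json()
    return time.time() - start, data["text"][0]


async def benchmark(prompts):
    # Fire all requests concurrently and gather (latency, text) pairs.
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(*(send_request(session, p) for p in prompts))
    latencies = [latency for latency, _ in results]
    print(f"Average latency: {sum(latencies) / len(latencies):.2f} s")


asyncio.run(benchmark(["Hello, my name is"] * 8))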

Deploying a model with vLLM may consume more GPU memory than expected, because vLLM pre-allocates a large block of memory at initialization (controllable via the gpu_memory_utilization parameter). All subsequent memory allocation then happens inside vLLM, which takes over allocating and reclaiming memory.
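For example, the pre-allocated fraction can be lowered when sharing a GPU with other processes (a sketch; 0.8 is an arbitrary illustrative value, the default being 0.9 as shown in the EngineArgs dump above):

from vllm import LLM

# Cap vLLM's pre-allocation at 80% of GPU memory instead of the default 90%.
llm = LLM(model="lmsys/vicuna-7b-v1.3", gpu_memory_utilization=0.8)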
