Integrating a Local Ollama Model with Spring Boot
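Contents
- Project Setup
- Create the Ollama Service Client
- Create the Controller
- Configure Application Properties
- Create the Front-End Page
- Add Static Resource Support
- Complete Project Structure
- Start the Application
- Advanced Extensions
- Deployment Notes
- Performance Optimization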
1. Project Setup
- Create a Spring Boot project with Spring Initializr or your IDE.
- Add the required dependencies to pom.xml:

<dependencies>
    <!-- Spring Boot Starter Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Lombok -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
    <!-- WebFlux, included here for WebClient -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-webflux</artifactId>
    </dependency>
</dependencies>
2. Create the Ollama Service Client

package com.example.ollama.service;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.util.List;
import java.util.Map;

@Slf4j
@Service
public class OllamaService {

    private static final double DEFAULT_TEMPERATURE = 0.7;

    private final WebClient webClient;

    public OllamaService(@Value("${ollama.api.url:http://localhost:11434}") String ollamaApiUrl) {
        this.webClient = WebClient.builder().baseUrl(ollamaApiUrl).build();
    }

    /** Generates text and returns the complete reply. */
    public Mono<String> generateText(String model, String prompt, Double temperature) {
        return webClient.post()
                .uri("/api/generate")
                .bodyValue(generateRequest(model, prompt, temperature, false))
                .retrieve()
                .bodyToMono(Map.class)
                .map(response -> (String) response.get("response"));
    }

    /** Sends a chat conversation and returns the assistant's reply. */
    @SuppressWarnings("unchecked")
    public Mono<String> chat(String model, List<Map<String, String>> messages) {
        // Ollama streams by default; ask for a single JSON object instead.
        Map<String, Object> requestBody = Map.of(
                "model", model,
                "messages", messages,
                "stream", false);
        return webClient.post()
                .uri("/api/chat")
                .bodyValue(requestBody)
                .retrieve()
                .bodyToMono(Map.class)
                .map(response -> {
                    Map<String, Object> message = (Map<String, Object>) response.get("message");
                    return (String) message.get("content");
                });
    }

    /** Generates text and emits it fragment by fragment. */
    public Flux<String> streamGenerateText(String model, String prompt, Double temperature) {
        return webClient.post()
                .uri("/api/generate")
                .bodyValue(generateRequest(model, prompt, temperature, true))
                .retrieve()
                .bodyToFlux(Map.class)
                .map(response -> (String) response.get("response"));
    }

    // Ollama reads sampling parameters such as temperature from "options",
    // and streams unless told otherwise, so "stream" is always set explicitly.
    private Map<String, Object> generateRequest(String model, String prompt,
                                                Double temperature, boolean stream) {
        return Map.of(
                "model", model,
                "prompt", prompt,
                "stream", stream,
                "options", Map.of("temperature",
                        temperature != null ? temperature : DEFAULT_TEMPERATURE));
    }
}
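With streaming enabled, Ollama's /api/generate endpoint answers with a sequence of JSON objects, each carrying a text fragment in "response" and a "done" flag that is true on the last one. The sketch below shows, in plain Java, how such already-decoded chunks might be joined into the full reply. StreamChunks and joinResponses are hypothetical names, and the chunks are assumed to be maps like the ones the service above decodes.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper: joins the "response" fragments of decoded stream chunks. */
final class StreamChunks {

    private StreamChunks() {}

    static String joinResponses(List<Map<String, Object>> chunks) {
        StringBuilder reply = new StringBuilder();
        for (Map<String, Object> chunk : chunks) {
            Object fragment = chunk.get("response");
            if (fragment instanceof String) {
                reply.append((String) fragment);
            }
            // The final chunk is marked with done=true; nothing after it is read.
            if (Boolean.TRUE.equals(chunk.get("done"))) {
                break;
            }
        }
        return reply.toString();
    }
}
```

Missing or non-string fragments are skipped rather than treated as errors, which is a design choice of this sketch, not something the Ollama API requires.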
3. Create the Controller

package com.example.ollama.controller;

import com.example.ollama.service.OllamaService;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.util.List;
import java.util.Map;

@RestController
@RequestMapping("/api/ollama")
@RequiredArgsConstructor
public class OllamaController {

    private final OllamaService ollamaService;

    // Single-shot text generation; model and temperature are optional.
    @PostMapping("/generate")
    public Mono<String> generateText(
            @RequestParam(defaultValue = "llama2") String model,
            @RequestParam String prompt,
            @RequestParam(required = false) Double temperature) {
        return ollamaService.generateText(model, prompt, temperature);
    }

    // Chat: the request body is a JSON array of {"role": ..., "content": ...} messages.
    @PostMapping("/chat")
    public Mono<String> chat(
            @RequestParam(defaultValue = "llama2") String model,
            @RequestBody List<Map<String, String>> messages) {
        return ollamaService.chat(model, messages);
    }

    // Streaming text generation; fragments are emitted as they arrive.
    @PostMapping("/stream")
    public Flux<String> streamGenerateText(
            @RequestParam(defaultValue = "llama2") String model,
            @RequestParam String prompt,
            @RequestParam(required = false) Double temperature) {
        return ollamaService.streamGenerateText(model, prompt, temperature);
    }
}
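To show how these endpoints might be called from another Java program, here is a minimal sketch that builds the requests with the JDK's java.net.http API (Java 11 or later). OllamaApiRequests and its method names are hypothetical, and the base URL is whatever host and port the Spring Boot application listens on.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side helper for the controller endpoints above. */
final class OllamaApiRequests {

    private OllamaApiRequests() {}

    /** Builds POST {baseUrl}/api/ollama/generate with model and prompt as query parameters. */
    static HttpRequest generate(String baseUrl, String model, String prompt) {
        String query = "model=" + encode(model) + "&prompt=" + encode(prompt);
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/ollama/generate?" + query))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Builds POST {baseUrl}/api/ollama/chat with a JSON array of messages as the body. */
    static HttpRequest chat(String baseUrl, String model, String messagesJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/ollama/chat?model=" + encode(model)))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(messagesJson))
                .build();
    }

    // Query values must be percent-encoded; URLEncoder writes spaces as '+'.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

A request built this way could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()).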
4. Configure Application Properties
Add the following to application.properties (or the equivalent keys to application.yml):

# Ollama API configuration
ollama.api.url=http://localhost:11434
# Server configuration
server.port=8080
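Comments in a .properties file must sit on their own line. The sketch below uses the JDK's java.util.Properties (not Spring Boot's own loader) to illustrate why: text after a value, including a '#', stays part of the value. It also mirrors the fallback in the service's ${ollama.api.url:http://localhost:11434} placeholder. PropertiesCommentDemo and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical demo of how .properties text is parsed by java.util.Properties. */
final class PropertiesCommentDemo {

    private PropertiesCommentDemo() {}

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the configured Ollama URL, or the default when the key is absent. */
    static String apiUrl(Properties props) {
        return props.getProperty("ollama.api.url", "http://localhost:11434");
    }
}
```

Parsing a line such as "ollama.api.url=http://localhost:11434# note" with this helper yields a URL that still contains "# note", which is why the comment and the key are kept on separate lines.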
5. Create the Front-End Page
Create a simple HTML page, src/main/resources/static/index.html, for talking to Ollama:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Ollama Chat</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
        }
        .chat-container {
            border: 1px solid #ccc;
            border-radius: 5px;
            padding: 20px;
            margin-bottom: 20px;
            height: 400px;
            overflow-y: auto;
        }
        .message {
            margin-bottom: 10px;
            padding: 10px;
            border-radius: 5px;
        }
        .user-message {
            background-color: #e6f7ff;
            margin-left: 20%;
        }
        .ai-message {
            background-color: #f0f0f0;
            margin-right: 20%;
        }
        .input-container {
            display: flex;
        }
        #message-input {
            flex-grow: 1;
            padding: 10px;
            border: 1px solid #ccc;
            border-radius: 5px;
            margin-right: 10px;
        }
        button {
            padding: 10px 20px;
            background-color: #4CAF50;
            color: white;
            border: none;
            border-radius: 5px;
            cursor: pointer;
        }
        button:hover {
            background-color: #45a049;
        }
    </style>
</head>
<body>
    <h1>Ollama Chat</h1>
    <div class="chat-container" id="chat-container"></div>
    <div class="input-container">
        <input type="text" id="message-input" placeholder="Type a message...">
        <button onclick="sendMessage()">Send</button>
    </div>
    <script>
        const chatContainer = document.getElementById('chat-container');
        const messageInput = document.getElementById('message-input');

        // Send the message when Enter is pressed
        messageInput.addEventListener('keypress', function (e) {
            if (e.key === 'Enter') {
                sendMessage();
            }
        });

        function sendMessage() {
            const message = messageInput.value.trim();
            if (!message) return;

            // Show the user's message in the chat window
            addMessage(message, 'user');
            messageInput.value = '';

            // Send the message to the backend
            fetch('/api/ollama/chat', {
                method: 'POST',
                headers: {'Content-Type': 'application/json'},
                body: JSON.stringify([{role: 'user', content: message}])
            })
                .then(response => {
                    // fetch only rejects on network failure, so check the HTTP status as well
                    if (!response.ok) throw new Error('HTTP ' + response.status);
                    return response.text();
                })
                .then(reply => {
                    // Show the AI reply in the chat window
                    addMessage(reply, 'ai');
                })
                .catch(error => {
                    console.error('Error:', error);
                    addMessage('Something went wrong. Please try again later.', 'ai');
                });
        }

        function addMessage(message, sender) {
            const messageDiv = document.createElement('div');
            messageDiv.classList.add('message');
            messageDiv.classList.add(sender === 'user' ? 'user-message' : 'ai-message');
            messageDiv.textContent = message;
            chatContainer.appendChild(messageDiv);
            chatContainer.scrollTop = chatContainer.scrollHeight;
        }
    </script>
</body>
</html>
6. Add Static Resource Support
Spring Boot already serves files from classpath:/static/ by default, so this configuration is optional; it only makes the mapping explicit:

package com.example.ollama.config;

import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.ResourceHandlerRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {

    // Serve everything under src/main/resources/static from the application root.
    @Override
    public void addResourceHandlers(ResourceHandlerRegistry registry) {
        registry.addResourceHandler("/**")
                .addResourceLocations("classpath:/static/");
    }
}
7. Complete Project Structure

src/
└── main/
    ├── java/
    │   └── com/
    │       └── example/
    │           └── ollama/
    │               ├── OllamaApplication.java
    │               ├── controller/
    │               │   └── OllamaController.java
    │               ├── service/
    │               │   └── OllamaService.java
    │               └── config/
    │                   └── WebConfig.java
    └── resources/
        ├── static/
        │   └── index.html
        └── application.properties
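8. Start the Application
- Make sure the Ollama service is running on its default port (11434) and that the model you request (llama2 by default in the controller) is available locally.
- Run the Spring Boot application.
- Open http://localhost:8080 to view the chat page.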
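Before starting the application it can help to confirm that Ollama is actually answering. The following is a minimal sketch using the JDK's HTTP client against the /api/tags endpoint; OllamaHealthCheck, tagsUri and isReachable are hypothetical names, and treating only HTTP 200 as healthy is an assumption of this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical startup check for a local Ollama server. */
final class OllamaHealthCheck {

    private OllamaHealthCheck() {}

    /** Resolves the model-listing endpoint, tolerating a trailing slash in the base URL. */
    static URI tagsUri(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/api/tags");
    }

    /** Returns true only if GET /api/tags answers with HTTP 200 within the timeout. */
    static boolean isReachable(String baseUrl, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(tagsUri(baseUrl))
                .timeout(timeout)
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() == 200;
        } catch (IOException e) {
            // Connection refused, timeout, and similar failures count as unreachable.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

For example, OllamaHealthCheck.isReachable("http://localhost:11434", Duration.ofSeconds(2)) could be called once at startup and its result logged.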
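9. Advanced Extensions
9.1 Model Selection

// In OllamaService (also needs: import java.util.stream.Collectors;)
@SuppressWarnings("unchecked")
public Mono<List<String>> listModels() {
    return webClient.get()
            .uri("/api/tags")
            .retrieve()
            .bodyToMono(Map.class)
            .map(response -> {
                List<Map<String, Object>> models =
                        (List<Map<String, Object>>) response.get("models");
                return models.stream()
                        .map(model -> (String) model.get("name"))
                        .collect(Collectors.toList());
            });
}

// In OllamaController: the WebClient lives in the service, so the endpoint delegates to it
@GetMapping("/models")
public Mono<List<String>> listModels() {
    return ollamaService.listModels();
}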
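9.2 Streaming Responses

// In OllamaController (also needs: import org.springframework.http.MediaType;
// and import org.springframework.http.codec.ServerSentEvent;)
@GetMapping(value = "/stream-chat", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<String>> streamChat(
        @RequestParam(defaultValue = "llama2") String model,
        @RequestParam String message) {
    // Wrap each generated fragment in a server-sent event.
    return ollamaService.streamGenerateText(model, message, 0.7)
            .map(response -> ServerSentEvent.<String>builder()
                    .data(response)
                    .build());
}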
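For readers unfamiliar with server-sent events, the wire format is plain text: each line of payload is sent as a "data:" field, and a blank line ends the event. The sketch below formats one event by hand according to that rule. SseFrames and dataFrame are hypothetical names, and the output follows the SSE text format in general; it is not claimed to match byte for byte what Spring writes.

```java
/** Hypothetical helper that formats a payload as one server-sent event. */
final class SseFrames {

    private SseFrames() {}

    static String dataFrame(String data) {
        StringBuilder frame = new StringBuilder();
        // A payload containing newlines becomes several data fields;
        // the limit -1 keeps trailing empty lines.
        for (String line : data.split("\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        // A blank line tells the client that the event is complete.
        frame.append('\n');
        return frame.toString();
    }
}
```

On the browser side, an EventSource pointed at /stream-chat would deliver each such event to its message handler.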
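9.3 Session Management

// Needs: import org.springframework.stereotype.Service;
// import java.util.List; import java.util.Map;
// import java.util.concurrent.ConcurrentHashMap;
// import java.util.concurrent.CopyOnWriteArrayList;
@Service
public class ChatSessionService {

    // Session ID -> ordered message history, kept in memory only.
    private final Map<String, List<Map<String, String>>> sessions = new ConcurrentHashMap<>();

    public List<Map<String, String>> getOrCreateSession(String sessionId) {
        // A thread-safe list, since several requests may append to the same session.
        return sessions.computeIfAbsent(sessionId, k -> new CopyOnWriteArrayList<>());
    }

    public void addMessage(String sessionId, String role, String content) {
        List<Map<String, String>> messages = getOrCreateSession(sessionId);
        messages.add(Map.of("role", role, "content", content));
    }

    public void clearSession(String sessionId) {
        sessions.remove(sessionId);
    }
}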
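A history stored this way grows with every turn, while a model can only take a limited amount of context. One possible approach, sketched below under that assumption, is to send only the most recent messages. HistoryWindow and lastMessages are hypothetical names, and the window size is an arbitrary choice for the caller to make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that limits how much chat history is sent to the model. */
final class HistoryWindow {

    private HistoryWindow() {}

    /** Returns a copy of at most maxMessages of the most recent messages, oldest first. */
    static List<Map<String, String>> lastMessages(List<Map<String, String>> history,
                                                  int maxMessages) {
        if (maxMessages < 0) {
            throw new IllegalArgumentException("maxMessages must be >= 0");
        }
        int from = Math.max(0, history.size() - maxMessages);
        // Copy the sublist so later changes to the session do not affect the request.
        return new ArrayList<>(history.subList(from, history.size()));
    }
}
```

A caller could pass HistoryWindow.lastMessages(getOrCreateSession(id), 20) to the chat method instead of the full list.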
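10. Deployment Notes
- Make sure Ollama is installed and running on the server.
- Configure firewall rules so that the Spring Boot application can reach the Ollama service.
- Use HTTPS to protect API traffic in production.
- Consider adding authentication and authorization.
- Monitor the Ollama service's resource usage to avoid overload.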
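As one very small illustration of the authentication point, the sketch below compares a presented API key with the expected one in constant time using the JDK's MessageDigest.isEqual. ApiKeyCheck is a hypothetical name, and a shared API key is only an assumption for this example; a real deployment would more likely rely on a full security framework.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

/** Hypothetical helper for checking a shared API key. */
final class ApiKeyCheck {

    private ApiKeyCheck() {}

    static boolean matches(String expectedKey, String presentedKey) {
        if (expectedKey == null || presentedKey == null) {
            return false;
        }
        // MessageDigest.isEqual avoids leaking, through timing, how many bytes matched.
        return MessageDigest.isEqual(
                expectedKey.getBytes(StandardCharsets.UTF_8),
                presentedKey.getBytes(StandardCharsets.UTF_8));
    }
}
```

Such a check could run before a request reaches the controller, with the expected key read from configuration rather than hard-coded.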
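11. Performance Optimization
- Manage WebClient connections with a connection pool.
- Cache responses to avoid repeating identical requests.
- Use asynchronous processing to improve concurrency.
- Consider a reactive programming model for handling streaming responses.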
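The caching idea can be sketched with nothing more than the JDK: a small least-recently-used cache keyed by model and prompt. PromptCache and its methods are hypothetical names, and the sketch assumes that returning an earlier reply for an identical prompt is acceptable, which mostly holds when generation is configured to be deterministic.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory LRU cache for model replies. */
final class PromptCache {

    private final Map<String, String> entries;

    PromptCache(int capacity) {
        // accessOrder=true keeps the least recently used entry first,
        // and removeEldestEntry evicts it once the capacity is exceeded.
        this.entries = Collections.synchronizedMap(
                new LinkedHashMap<String, String>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                        return size() > capacity;
                    }
                });
    }

    // A newline separates model and prompt so that the pair forms one key.
    private static String key(String model, String prompt) {
        return model + "\n" + prompt;
    }

    /** Returns the cached reply, or null if there is none. */
    String get(String model, String prompt) {
        return entries.get(key(model, prompt));
    }

    void put(String model, String prompt, String reply) {
        entries.put(key(model, prompt), reply);
    }

    int entryCount() {
        return entries.size();
    }
}
```

The service could consult such a cache before calling Ollama and store the reply afterwards; the capacity is a tuning choice that depends on available memory.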