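Voice-controlled city switching with Semantic Kernel (SK), ChatGLM3B, Whisper, and Avalonia
First, create a project from the Avalonia MVVM template and name it GisApp.
Once the project is created, add the following NuGet dependencies: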
<PackageReference Include="Mapsui.Avalonia" Version="4.1.1" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection" Version="8.0.0" />
<PackageReference Include="Microsoft.Extensions.Http" Version="8.0.0" />
<PackageReference Include="Microsoft.SemanticKernel" Version="1.0.0-beta8" />
<PackageReference Include="NAudio" Version="2.2.1" />
<PackageReference Include="Whisper.net" Version="1.5.0" />
<PackageReference Include="Whisper.net.Runtime" Version="1.5.0" />
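- Mapsui.Avalonia is a GIS map component for Avalonia.
- Microsoft.Extensions.DependencyInjection is used to build the DI container.
- Microsoft.Extensions.Http is used to register an HttpClient factory.
- Microsoft.SemanticKernel is SK, used to build the AI plugin.
- NAudio is a toolkit used here to record audio.
- Whisper.net is a .NET wrapper for Whisper, the open-source speech recognition model from OpenAI.
- Whisper.net.Runtime is the Whisper runtime package that accompanies Whisper.net.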
Modify App.cs
Open App.cs and change it to the following code:
public partial class App : Application
{
    public override void Initialize()
    {
        AvaloniaXamlLoader.Load(this);
    }

    public override void OnFrameworkInitializationCompleted()
    {
        if (ApplicationLifetime is IClassicDesktopStyleApplicationLifetime desktop)
        {
            var services = new ServiceCollection();

            // The main window receives the kernel and the Whisper processor from the container
            services.AddSingleton<MainWindow>(sp =>
                new MainWindow(sp.GetRequiredService<IKernel>(), sp.GetRequiredService<WhisperProcessor>())
                {
                    DataContext = new MainWindowViewModel(),
                });

            services.AddHttpClient();

            // Send SK's requests through the custom handler so the address can be rewritten
            var openAIHttpClientHandler = new OpenAIHttpClientHandler();
            var httpClient = new HttpClient(openAIHttpClientHandler);

            services.AddTransient<IKernel>(serviceProvider =>
            {
                return new KernelBuilder()
                    .WithOpenAIChatCompletionService("gpt-3.5-turbo-16k", "fastgpt-zE0ub2ZxvPMwtd6XYgDX8jyn5ubiC", httpClient: httpClient)
                    .Build();
            });

            // AddSingleton needs a factory that takes the service provider
            services.AddSingleton<WhisperProcessor>(_ =>
            {
                // Model file to use (the base GGML model)
                var modelFileName = "ggml-base.bin";
                return WhisperFactory.FromPath(modelFileName)
                    .CreateBuilder()
                    .WithLanguage("auto") // "auto" detects the language automatically
                    .Build();
            });

            var serviceProvider = services.BuildServiceProvider();
            desktop.MainWindow = serviceProvider.GetRequiredService<MainWindow>();
        }

        base.OnFrameworkInitializationCompleted();
    }
}
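OpenAIHttpClientHandler.cs
This file changes the address that SK calls. Out of the box, SK only targets the official OpenAI endpoint and offers no setting to change it, so the handler rewrites the request URI instead.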
public class OpenAIHttpClientHandler : HttpClientHandler
{
    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Redirect chat completion calls to the self-hosted ChatGLM3B endpoint
        if (request.RequestUri?.LocalPath == "/v1/chat/completions")
        {
            var uriBuilder = new UriBuilder("http://您的ChatGLM3B地址/api/v1/chat/completions"); // replace the host with your ChatGLM3B address
            request.RequestUri = uriBuilder.Uri;
        }

        return base.SendAsync(request, cancellationToken);
    }
}
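The rewrite rule in the handler can be pulled out into a pure function, which makes it easy to check without sending any HTTP request. The following is only a sketch: `EndpointRewriter`, `Rewrite`, and the localhost address in the usage note are hypothetical names and values introduced for illustration, not part of SK or of the original project.

```csharp
using System;

public static class EndpointRewriter
{
    // Hypothetical helper: returns the self-hosted endpoint when the request
    // targets the OpenAI chat completions path, otherwise the original URI.
    public static Uri Rewrite(Uri original, Uri selfHostedChatEndpoint)
    {
        return original.LocalPath == "/v1/chat/completions"
            ? selfHostedChatEndpoint
            : original;
    }
}
```

Called with `https://api.openai.com/v1/chat/completions` and a target such as `http://localhost:8000/api/v1/chat/completions`, the helper returns the target; any other path passes through unchanged.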
Modify ViewModels/MainWindowViewModel.cs
public class MainWindowViewModel : ViewModelBase
{
    private string subtitle = string.Empty;

    // Recognized text shown next to the button
    public string Subtitle
    {
        get => subtitle;
        set => this.RaiseAndSetIfChanged(ref subtitle, value);
    }

    private Bitmap butBackground;

    // Microphone icon shown on the button
    public Bitmap ButBackground
    {
        get => butBackground;
        set => this.RaiseAndSetIfChanged(ref butBackground, value);
    }
}
ButBackground holds the microphone icon; it lives in the view model so that the icon can be switched. Subtitle displays the recognized text.
Add the SK plugin
Create the file /plugins/MapPlugin/AcquireLatitudeLongitude/config.json, which holds the plugin's configuration:
{
  "schema": 1,
  "type": "completion",
  "description": "Get coordinates",
  "completion": {
    "max_tokens": 1000,
    "temperature": 0.3,
    "top_p": 0.0,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0
  },
  "input": {
    "parameters": [
      {
        "name": "input",
        "description": "Get coordinates",
        "defaultValue": ""
      }
    ]
  }
}
Create the file /plugins/MapPlugin/AcquireLatitudeLongitude/skprompt.txt. This is the plugin's prompt; it extracts the city from the user's input and returns that city's latitude and longitude:
Return the latitude and longitude of {{$input}} in the following format. Do not reply with anything else, only this format:
{
  "latitude": "",
  "longitude": ""
}
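The prompt's template shows both values as JSON strings, and a chat model may also wrap the JSON in extra words, so it can help to parse the reply defensively before using it. The sketch below is one possible approach, not the article's code: `Coordinates`, `CoordinateParser`, and `TryParse` are hypothetical names, and the range check is an added assumption about what counts as a usable answer.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed class Coordinates
{
    [JsonPropertyName("latitude")]
    public double Latitude { get; set; }

    [JsonPropertyName("longitude")]
    public double Longitude { get; set; }
}

public static class CoordinateParser
{
    // Accept both 22.54 and "22.54", since the prompt template uses strings.
    private static readonly JsonSerializerOptions Options = new JsonSerializerOptions
    {
        NumberHandling = JsonNumberHandling.AllowReadingFromString
    };

    // Hypothetical helper: returns null when the reply holds no usable coordinates.
    public static Coordinates? TryParse(string reply)
    {
        // Keep only the outermost {...} in case the model adds surrounding text.
        int start = reply.IndexOf('{');
        int end = reply.LastIndexOf('}');
        if (start < 0 || end <= start) return null;

        try
        {
            var json = reply.Substring(start, end - start + 1);
            var parsed = JsonSerializer.Deserialize<Coordinates>(json, Options);
            if (parsed is null) return null;

            // Assumption: reject values outside the valid latitude/longitude ranges.
            bool valid = Math.Abs(parsed.Latitude) <= 90 && Math.Abs(parsed.Longitude) <= 180;
            return valid ? parsed : null;
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

A caller can then skip the map update when `TryParse` returns null instead of letting a malformed reply throw inside the message loop.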
Modify Views/MainWindow.axaml, and add the assets (voice.png and open-voice.png, see the Assets section) to the Assets folder:
<Window xmlns="https://github.com/avaloniaui"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:vm="using:GisApp.ViewModels"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d" d:DesignWidth="800" d:DesignHeight="450"
        x:Class="GisApp.Views.MainWindow"
        x:DataType="vm:MainWindowViewModel"
        Icon="/Assets/avalonia-logo.ico"
        Width="800"
        Height="800"
        Title="GisApp">
    <Design.DataContext>
        <vm:MainWindowViewModel />
    </Design.DataContext>
    <Grid>
        <Grid Name="MapStackPanel"></Grid>
        <StackPanel HorizontalAlignment="Right" VerticalAlignment="Bottom" Background="Transparent" Margin="25">
            <TextBlock Foreground="Black" Text="{Binding Subtitle}" Width="80" TextWrapping="WrapWithOverflow" Padding="8"></TextBlock>
            <Button Width="60" Click="Button_OnClick" Background="Transparent" VerticalAlignment="Center" HorizontalAlignment="Center">
                <Image Name="ButBackground" Source="{Binding ButBackground}" Height="40" Width="40"></Image>
            </Button>
        </StackPanel>
    </Grid>
</Window>
Modify Views/MainWindow.axaml.cs:
public partial class MainWindow : Window
{
    private bool openVoice = false;
    private WaveInEvent waveIn;
    private readonly IKernel _kernel;
    private readonly WhisperProcessor _processor;
    private readonly Channel<string> _channel = Channel.CreateUnbounded<string>();
    private MapControl mapControl;

    public MainWindow(IKernel kernel, WhisperProcessor processor)
    {
        _kernel = kernel;
        _processor = processor;
        InitializeComponent();

        mapControl = new MapControl();
        // Center on Shenzhen by default
        mapControl.Map = new Map()
        {
            CRS = "EPSG:3857",
            Home = n =>
            {
                // MPoint takes X = longitude, Y = latitude
                var center = new MPoint(114.06667, 22.61667);
                var sphericalMercatorCoordinate = SphericalMercator.FromLonLat(center.X, center.Y).ToMPoint();
                n.ZoomToLevel(15);
                n.CenterOnAndZoomTo(sphericalMercatorCoordinate, n.Resolutions[15]);
            }
        };
        mapControl.Map?.Layers.Add(Mapsui.Tiling.OpenStreetMap.CreateTileLayer());
        MapStackPanel.Children.Add(mapControl);

        DataContextChanged += (sender, args) =>
        {
            using var voice = AssetLoader.Open(new Uri("avares://GisApp/Assets/voice.png"));
            ViewModel.ButBackground = new Avalonia.Media.Imaging.Bitmap(voice);
        };

        // Start the background loop that consumes recognized text
        Task.Factory.StartNew(ReadMessage);
    }

    private MainWindowViewModel ViewModel => (MainWindowViewModel)DataContext;

    private void Button_OnClick(object? sender, RoutedEventArgs e)
    {
        if (openVoice)
        {
            using var voice = AssetLoader.Open(new Uri("avares://GisApp/Assets/voice.png"));
            ViewModel.ButBackground = new Avalonia.Media.Imaging.Bitmap(voice);
            waveIn.StopRecording();
        }
        else
        {
            using var voice = AssetLoader.Open(new Uri("avares://GisApp/Assets/open-voice.png"));
            ViewModel.ButBackground = new Avalonia.Media.Imaging.Bitmap(voice);

            // Get the current microphone device
            waveIn = new WaveInEvent();
            waveIn.DeviceNumber = 0; // select the microphone; 0 is usually the default device
            WaveFileWriter writer = new WaveFileWriter("recorded.wav", waveIn.WaveFormat);

            // Audio data received
            waveIn.DataAvailable += (s, a) =>
            {
                Console.WriteLine($"Received audio data: {a.BytesRecorded} bytes");
                writer.Write(a.Buffer, 0, a.BytesRecorded);
                // Stop automatically after 30 seconds of audio
                if (writer.Position > waveIn.WaveFormat.AverageBytesPerSecond * 30)
                {
                    waveIn.StopRecording();
                }
            };

            // Recording stopped
            waveIn.RecordingStopped += async (s, args) =>
            {
                writer?.Dispose();
                writer = null;
                waveIn.Dispose();

                // Resample the recording to 16 kHz before passing it to Whisper
                await using var fileStream = File.OpenRead("recorded.wav");
                using var wavStream = new MemoryStream();
                await using var reader = new WaveFileReader(fileStream);
                var resampler = new WdlResamplingSampleProvider(reader.ToSampleProvider(), 16000);
                WaveFileWriter.WriteWavFileToStream(wavStream, resampler.ToWaveProvider16());
                wavStream.Seek(0, SeekOrigin.Begin);

                await Dispatcher.UIThread.InvokeAsync(() => { ViewModel.Subtitle = string.Empty; });

                string text = string.Empty;
                await foreach (var result in _processor.ProcessAsync(wavStream))
                {
                    // Accumulate the segments and show the text so far
                    // (the original "Subtitle += text += result.Text" repeated earlier segments)
                    text += result.Text;
                    var current = text;
                    await Dispatcher.UIThread.InvokeAsync(() => { ViewModel.Subtitle = current; });
                }

                _channel.Writer.TryWrite(text);
            };

            Console.WriteLine("Recording started...");
            waveIn.StartRecording();
        }

        openVoice = !openVoice;
    }

    private async Task ReadMessage()
    {
        try
        {
            var pluginsDirectory = Path.Combine(Directory.GetCurrentDirectory(), "plugins");
            var chatPlugin = _kernel.ImportSemanticFunctionsFromDirectory(pluginsDirectory, "MapPlugin");

            // The prompt template returns the values as strings, so allow numbers written as strings
            var jsonOptions = new JsonSerializerOptions
            {
                NumberHandling = JsonNumberHandling.AllowReadingFromString
            };

            // Keep reading from the channel
            while (await _channel.Reader.WaitToReadAsync())
            {
                // Drain the messages that are currently available
                while (_channel.Reader.TryRead(out var message))
                {
                    // Use the AcquireLatitudeLongitude function to resolve the spoken place to coordinates
                    var value = await _kernel.RunAsync(new ContextVariables
                    {
                        ["input"] = message
                    }, chatPlugin["AcquireLatitudeLongitude"]);

                    // Parse the reply into the model
                    var acquireLatitudeLongitude =
                        JsonSerializer.Deserialize<AcquireLatitudeLongitude>(value.ToString(), jsonOptions);
                    if (acquireLatitudeLongitude is null)
                    {
                        continue;
                    }

                    // Center the map on the place the user asked for
                    var center = new MPoint(acquireLatitudeLongitude.longitude, acquireLatitudeLongitude.latitude);
                    var sphericalMercatorCoordinate = SphericalMercator.FromLonLat(center.X, center.Y).ToMPoint();

                    // Use zoom level 15 by default
                    mapControl.Map.Navigator.ZoomToLevel(15);
                    mapControl.Map.Navigator.CenterOnAndZoomTo(sphericalMercatorCoordinate, mapControl.Map.Navigator.Resolutions[15]);
                }
            }
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
    }

    public class AcquireLatitudeLongitude
    {
        public double latitude { get; set; }
        public double longitude { get; set; }
    }
}
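The map uses the EPSG:3857 CRS, so a longitude/latitude pair in degrees has to be projected to Web Mercator meters before the map can be centered on it; that is what the `SphericalMercator.FromLonLat` call does. The sketch below shows the standard spherical Mercator formula for reference. `WebMercator` is a hypothetical class written for illustration and is not claimed to match Mapsui's implementation line for line.

```csharp
using System;

public static class WebMercator
{
    // Radius of the sphere used by EPSG:3857, in meters.
    private const double EarthRadius = 6378137.0;

    // Projects longitude/latitude in degrees to EPSG:3857 meters.
    // Longitude comes first because it maps to X; latitude maps to Y.
    public static (double X, double Y) FromLonLat(double lon, double lat)
    {
        double x = EarthRadius * lon * Math.PI / 180.0;
        double y = EarthRadius * Math.Log(Math.Tan(Math.PI / 4.0 + lat * Math.PI / 360.0));
        return (x, y);
    }
}
```

The argument order matters: swapping longitude and latitude still yields a valid-looking point, just in the wrong place, which is why the code builds the point as (longitude, latitude).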
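Flow walkthrough:
- The user clicks the record button, which triggers the Button_OnClick event. The handler opens the user's microphone and starts recording. When recording stops, the handler for the recording-stopped event takes the wav file the recording produced and passes it to Whisper for recognition; once recognition finishes, the result is written to _channel.
- ReadMessage keeps listening on _channel. When data is written, it is read here and passed to SK, which runs the AcquireLatitudeLongitude function:
  var value = await _kernel.RunAsync(new ContextVariables{["input"] = message}, chatPlugin["AcquireLatitudeLongitude"]);
- value is then parsed to obtain the latitude and longitude of the user's city.
- mapControl.Map.Navigator moves the map to that latitude and longitude.
That completes the whole flow. Real-world requirements will of course be more complex than this.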
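The hand-off between the recording handler and ReadMessage is a producer/consumer pattern built on System.Threading.Channels. The following self-contained sketch isolates just that pattern; `ChannelDemo` and `RunAsync` are hypothetical names, and the list stands in for the SK call and map update that the article performs at that point.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelDemo
{
    // Writes each recognized text to a channel and returns what the consumer handled.
    public static async Task<List<string>> RunAsync(IEnumerable<string> recognizedTexts)
    {
        var channel = Channel.CreateUnbounded<string>();
        var handled = new List<string>();

        // Consumer: same loop shape as ReadMessage. It waits until data is
        // available, then drains everything that can be read right now.
        var consumer = Task.Run(async () =>
        {
            while (await channel.Reader.WaitToReadAsync())
            {
                while (channel.Reader.TryRead(out var message))
                {
                    handled.Add(message); // stand-in for the SK call and map update
                }
            }
        });

        // Producer: plays the role of the recording-stopped handler.
        foreach (var text in recognizedTexts)
        {
            channel.Writer.TryWrite(text);
        }

        // Completing the writer makes WaitToReadAsync return false, ending the loop.
        channel.Writer.Complete();
        await consumer;
        return handled;
    }
}
```

In the application the writer is never completed, so the consumer loop runs for the lifetime of the window; the sketch completes it only so that the method can return.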
Assets
Technical discussion group: 737776595