ChatUI: Using Gradio.NET to Quickly Create a Large Model Demo Interface for LLamaWorker

Author: opendotnet
Gradio.NET is a .NET port of Gradio: a library that lets you quickly build machine learning model demos, with a simple API for creating an interactive interface in just a few lines of code. In this article, we'll show how to quickly and easily create a large model demo interface for LLamaWorker with the help of Gradio.NET.

1. Background

In a previous article, we introduced the LLamaWorker[1] project, a large language model service designed for .NET developers. LLamaWorker provides an API similar to OpenAI, supporting features such as multi-model switching, streaming response, embedding support, and more. In addition, LLamaWorker provides a Gradio.NET[2]-based UI demo that allows developers to experience and debug models faster.

2. Introduction to Gradio.NET

Gradio.NET is a .NET port of Gradio. Gradio is an open-source Python package that allows you to quickly build demos or web applications for machine learning models, APIs, or any arbitrary Python function, without needing any JavaScript or CSS experience. With Gradio, you can quickly create a user-friendly interface on top of a machine learning model or data science workflow, letting users drag and drop images, paste text, record sound, and interact with the demo through a browser.
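
To give a feel for the API, here is a minimal sketch using the same component and event patterns that appear later in this article. The component names and greeting logic are purely illustrative, and the web-host bootstrap is omitted; see the Gradio.NET samples for that part.

```cs
// A minimal interface: one input, one output, one button.
// The web-host bootstrap is omitted; see the Gradio.NET samples.
using (var blocks = gr.Blocks())
{
    Textbox name = gr.Textbox(label: "Name", placeholder: "Type your name...");
    Textbox greeting = gr.Textbox(label: "Greeting");
    Button greet = gr.Button("Greet", variant: ButtonVariant.Primary);

    // On click: read the input payload and write a greeting to the output.
    greet?.Click(
        i => Task.FromResult(gr.Output($"Hello, {Textbox.Payload(i.Data[0])}!")),
        inputs: [name],
        outputs: [greeting]);
}
```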

3. Why choose Gradio.NET

LLamaWorker is an API service that provides a Swagger page, but the Swagger page does not give an intuitive view of what the model can do. I had been thinking about providing an intuitive demo interface that lets users see the model's effect more quickly, and that also helps developers debug the model faster.

Of course, choosing a technology framework is a critical decision. At first I considered building it from scratch with Vue3, but that would have taken a lot of time and effort. It was around this time that I discovered the community's new open-source project, Gradio.NET.

I tried it out in the spirit of learning a new technology, and also wanted to put the new framework through its paces from a developer's perspective, identify issues, and suggest improvements. For people who are new to Gradio, like me, it can be a bit overwhelming at first; however, if you're familiar with Python's Gradio, Gradio.NET is easy to pick up.

It's important to note that Gradio.NET is still under active development, and many components have not yet been ported. But I believe that as long as everyone works together and actively participates in building it, we can make Gradio.NET more complete and robust.

4. Create a demo interface for LLamaWorker

Next, we'll create a simple demo interface for LLamaWorker. The whole thing is no more than 300 lines of code, comments included, yet it delivers an interactive interface in which we can enter text and click the "Generate" button to get the model's response.

In the ChatUI project, we used a number of Gradio.NET components and related features, and found and submitted several issues to Gradio.NET along the way. For anyone studying Gradio.NET, this real-world use case will be very helpful, especially the refreshing of the `Dropdown`, network requests, and the handling of streaming responses.

4.1. Service Settings

LLamaWorker supports API Keys and exposes an endpoint for retrieving model configuration information, which we will use in the ChatUI project.

At the top of the page, we set up an input box to enter the URL of the LLamaWorker service, an input box to enter the API Key, a button to get the model configuration information, and a drop-down box to select the model.

```cs
gr.Markdown("# LLamaWorker");
Textbox input, token;
Dropdown model;
Button btnset;

using (gr.Row())
{
    input = gr.Textbox("http://localhost:5000", placeholder: "LLamaWorker Server URL", label: "Server");
    token = gr.Textbox(placeholder: "API Key", label: "API Key", maxLines: 1, type: TextboxType.Password);
    btnset = gr.Button("Get Models", variant: ButtonVariant.Primary);
    model = gr.Dropdown(choices: [], label: "Model Select", allowCustomValue: true);
}
```

In the code above, we set up an input box for entering the API Key and set it as a password input box with `TextboxType.Password` to hide what you type. For the `Dropdown` component we don't preset any options, and we allow custom user values with `allowCustomValue: true`, making it convenient for users to enter a custom model name. This also lets the ChatUI project call other services, such as Alibaba's DashScope (Lingji) large model service.

[Figure: Service settings]

The image above shows the interface on mobile; Gradio.NET automatically handles the responsive layout so the interface displays properly on different devices.

After setting up the basic interface, we need to add a click event to the button in order to fetch the model configuration. In Gradio.NET, this is done by binding the `Click` event of the `Button`.

```cs
btnset?.Click(update_models, inputs: [input, token], outputs: [model]);
```

When the button is clicked, the `update_models` method is called. It sends a request to the LLamaWorker service to get the model configuration and updates the options of the drop-down box.

```cs
static async Task<Output> update_models(Input input)
{
    string server = Textbox.Payload(input.Data[0]);
    string token = Textbox.Payload(input.Data[1]);

    if (server == "")
    {
        throw new Exception("Server URL cannot be empty.");
    }

    if (!string.IsNullOrWhiteSpace(token))
    {
        Utils.client.DefaultRequestHeaders.Authorization =
            new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", token);
    }

    var res = await Utils.client.GetFromJsonAsync<ConfigModels>(server + "/models/config");
    if (res?.Models == null || res.Models.Count == 0)
    {
        throw new Exception("Failed to fetch models from the server.");
    }

    Utils.config = res;
    var models = res.Models.Select(x => x.Name).ToList();
    return gr.Output(gr.Dropdown(choices: models, value: models[res.Current], interactive: true));
}
```

In the `update_models` method, we first read the entered service URL and API Key, then request the model configuration from the service. If the request succeeds, we update the options in the drop-down box, setting its default value based on the current model returned by the service.
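The `ConfigModels` type used above is not shown in the snippet. Here is a minimal sketch of its shape, inferred purely from how the code accesses it; the actual definition in LLamaWorker may include more fields.

```cs
// Hypothetical shape of the /models/config response, inferred from the
// usage above (res.Models[i].Name and res.Current); the real type may differ.
public class ConfigModels
{
    // Index of the currently loaded model within Models
    public int Current { get; set; }

    // The models configured on the server
    public List<ModelInfo>? Models { get; set; }
}

public class ModelInfo
{
    public string Name { get; set; } = "";
}
```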

The network request here uses the `HttpClient` from the `Utils` class, so that a single `HttpClient` instance is shared across the project. `HttpClient` instances are designed to be reused across multiple requests, which helps reduce resource consumption and improve application performance.
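A minimal sketch of what such a helper might look like (hypothetical; the actual `Utils` class in LLamaWorker may differ):

```cs
using System.Net.Http;

// One static HttpClient shared by every request in the project, plus the
// cached model configuration. Hypothetical sketch of the Utils helper.
public static class Utils
{
    public static readonly HttpClient client = new HttpClient();

    // Last model configuration fetched from /models/config
    public static ConfigModels? config;
}
```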

4.2. Model Switching for Dropdown Components

After obtaining the model configuration, we need to react to selections in the drop-down box in order to switch models. In Gradio.NET, this is done by binding the `Change` event of the `Dropdown`.

```cs
model?.Change(change_models, inputs: [input, model], outputs: [model]);
```

When a drop-down option is selected, the `change_models` method is called, which sends a request to the LLamaWorker service to switch models.

```cs
static async Task<Output> change_models(Input input)
{
    string server = Textbox.Payload(input.Data[0]);
    string model = Dropdown.Payload(input.Data[1]).Single();

    var models = Utils.config?.Models?.Select(x => x.Name).ToList();

    // No server-side model configuration; allow a custom model name
    if (models == null)
    {
        return gr.Output(gr.Dropdown(choices: [model], value: model, interactive: true, allowCustomValue: true));
    }

    if (server == "")
    {
        throw new Exception("Server URL cannot be empty.");
    }

    // Find the index of the selected model
    var index = models.IndexOf(model);
    if (index == -1)
    {
        throw new Exception("Model not found in the list of available models.");
    }

    if (Utils.config.Current == index)
    {
        // The model has not changed
        return gr.Output(gr.Dropdown(choices: models, value: model, interactive: true));
    }

    var res = await Utils.client.PutAsync($"{server}/models/{index}/switch", null);

    // Request failed
    if (!res.IsSuccessStatusCode)
    {
        // The server did not return an error message
        gr.Warning("Failed to switch model.");
        await Task.Delay(2000);
        return gr.Output(gr.Dropdown(choices: models, value: models[Utils.config.Current], interactive: true));
    }

    Utils.config.Current = index;
    return gr.Output(gr.Dropdown(choices: models, value: model, interactive: true));
}
```

In the `change_models` method, we first read the cached model configuration, then the entered service URL and model name, and send a request to the service to switch the model. If the request succeeds, we update the options in the drop-down box. If no server-side model configuration is available, we allow users to specify a custom model.

One thing to note here: when a switch fails, we display a warning message and restore the drop-down selection after two seconds. However, restoring the selection triggers the `Change` event again, which would otherwise cause the `Warning` toast not to be displayed; that's why the restore is delayed until two seconds after the `Warning` is shown. The repeated `Change` callback itself is not a big problem, since `change_models` detects that the model has not actually changed.

4.3. Model Interaction

After setting up the service and model switching, we add a Tab component to expose the model's different capabilities: chat and text completion.

using (gr.Tab("Chat"))              {              // Chat 交互界面组件              }              using (gr.Tab("Completion"))              {              // Completion 交互界面组件              }           

In the Chat interface, we can directly use the `Chatbot` component to display the list of conversation messages, add an input box for entering text, and provide three buttons for sending text, regenerating, and clearing the conversation.

```cs
Chatbot chatBot = gr.Chatbot(label: "LLamaWorker Chat", showCopyButton: true, placeholder: "Chat history", height: 520);
Textbox userInput = gr.Textbox(label: "Input", placeholder: "Type a message...");

Button sendButton, resetButton, regenerateButton;

using (gr.Row())
{
    sendButton = gr.Button("✉️ Send", variant: ButtonVariant.Primary);
    regenerateButton = gr.Button("🔃 Retry", variant: ButtonVariant.Secondary);
    resetButton = gr.Button("🗑️ Clear", variant: ButtonVariant.Stop);
}
```

Next, let's add click events for the three buttons to send text, regenerate, and clear the conversation.

```cs
sendButton?.Click(streamingFn: i =>
{
    string server = Textbox.Payload(i.Data[0]);
    string token = Textbox.Payload(i.Data[3]);
    string model = Dropdown.Payload(i.Data[4]).Single();
    IList<ChatbotMessagePair> chatHistory = Chatbot.Payload(i.Data[1]);
    string userInput = Textbox.Payload(i.Data[2]);
    return ProcessChatMessages(server, token, model, chatHistory, userInput);
}, inputs: [input, chatBot, userInput, token, model], outputs: [userInput, chatBot]);

regenerateButton?.Click(streamingFn: i =>
{
    string server = Textbox.Payload(i.Data[0]);
    string token = Textbox.Payload(i.Data[2]);
    string model = Dropdown.Payload(i.Data[3]).Single();
    IList<ChatbotMessagePair> chatHistory = Chatbot.Payload(i.Data[1]);
    if (chatHistory.Count == 0)
    {
        throw new Exception("No chat history available for regeneration.");
    }
    string userInput = chatHistory[^1].HumanMessage.TextMessage;
    chatHistory.RemoveAt(chatHistory.Count - 1);
    return ProcessChatMessages(server, token, model, chatHistory, userInput);
}, inputs: [input, chatBot, token, model], outputs: [userInput, chatBot]);

resetButton?.Click(i => Task.FromResult(gr.Output(Array.Empty<ChatbotMessagePair>(), "")), outputs: [chatBot, userInput]);
```

When the send button is clicked, the `ProcessChatMessages` method is called. It sends a request to the LLamaWorker service, receives the model's reply, and updates the list of conversation messages.

```cs
static async IAsyncEnumerable<Output> ProcessChatMessages(string server, string token, string model, IList<ChatbotMessagePair> chatHistory, string message)
{
    if (message == "")
    {
        yield return gr.Output("", chatHistory);
        yield break;
    }

    // SSE request
    var request = new HttpRequestMessage(HttpMethod.Post, $"{server}/v1/chat/completions");
    request.Headers.Accept.Add(new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("text/event-stream"));
    if (!string.IsNullOrWhiteSpace(token))
    {
        Utils.client.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", token);
    }

    // Build the message list from the existing history, then the new input
    var messages = new List<ChatCompletionMessage>();
    foreach (var item in chatHistory)
    {
        messages.Add(new ChatCompletionMessage
        {
            role = "user",
            content = item.HumanMessage.TextMessage
        });
        messages.Add(new ChatCompletionMessage
        {
            role = "assistant",
            content = item.AiMessage.TextMessage
        });
    }
    messages.Add(new ChatCompletionMessage
    {
        role = "user",
        content = message
    });

    // Add the user input to the history; the AI side is filled in while streaming
    chatHistory.Add(new ChatbotMessagePair(message, ""));

    request.Content = new StringContent(JsonSerializer.Serialize(new ChatCompletionRequest
    {
        stream = true,
        messages = messages.ToArray(),
        model = model,
        max_tokens = 1024,
        temperature = 0.9f,
        top_p = 0.9f,
    }), Encoding.UTF8, "application/json");

    using var response = await Utils.client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
    response.EnsureSuccessStatusCode();

    using (var stream = await response.Content.ReadAsStreamAsync())
    using (var reader = new System.IO.StreamReader(stream))
    {
        while (!reader.EndOfStream)
        {
            var line = await reader.ReadLineAsync();
            if (line != null && line.StartsWith("data:"))
            {
                var data = line.Substring(5).Trim();

                // End of the stream
                if (data == "[DONE]")
                {
                    yield break;
                }

                // Parse the returned chunk
                var completionResponse = JsonSerializer.Deserialize<ChatCompletionChunkResponse>(data);
                var text = completionResponse?.choices[0]?.delta?.content;
                if (string.IsNullOrEmpty(text))
                {
                    continue;
                }
                chatHistory[^1].AiMessage.TextMessage += text;
                yield return gr.Output("", chatHistory);
            }
        }
    }
}
```

In the `ProcessChatMessages` method, we first take the service URL, API Key, conversation history, and input text, then send a request to the service to get the model's response. We use an SSE (server-sent events) request to implement the streaming response: each time a chunk arrives from the model, we update the list of conversation messages.
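The request and response types serialized above (`ChatCompletionMessage`, `ChatCompletionRequest`, `ChatCompletionChunkResponse`) are not shown in the snippets. Here is a minimal sketch matching the OpenAI-style JSON fields the code uses; the actual definitions in LLamaWorker may differ.

```cs
// Hypothetical DTOs inferred from the fields used above; the lowercase
// property names map directly to the OpenAI-style wire format.
public class ChatCompletionMessage
{
    public string role { get; set; } = "";
    public string content { get; set; } = "";
}

public class ChatCompletionRequest
{
    public bool stream { get; set; }
    public ChatCompletionMessage[] messages { get; set; } = [];
    public string model { get; set; } = "";
    public int max_tokens { get; set; }
    public float temperature { get; set; }
    public float top_p { get; set; }
}

// Streaming chunk: choices[0].delta.content carries the incremental text.
public class ChatCompletionChunkResponse
{
    public ChunkChoice[] choices { get; set; } = [];
}

public class ChunkChoice
{
    public ChunkDelta delta { get; set; } = new();
}

public class ChunkDelta
{
    public string? content { get; set; }
}
```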

For the text-generation interface, we can directly use a `Textbox` component to enter text and add a button to generate the completion. The event handling and flow are similar to the Chat interface, so I won't go into detail here; the full code can be viewed in the ChatUI project within LLamaWorker[1].
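As a rough illustration, the Completion tab could be laid out as below. The component names are hypothetical; see the ChatUI source for the actual implementation.

```cs
// Hypothetical sketch of the Completion tab layout; the real implementation
// lives in the ChatUI project within LLamaWorker.
Textbox prompt, completionOutput;
Button generateButton;

using (gr.Tab("Completion"))
{
    prompt = gr.Textbox(label: "Prompt", placeholder: "Type a prompt...");
    completionOutput = gr.Textbox(label: "Completion");
    generateButton = gr.Button("Generate", variant: ButtonVariant.Primary);
}
```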

5. Effects

After running the LLamaWorker service, we can enter the service URL and API Key (if configured) in the ChatUI project, and then click the "Get Models" button to get the model configuration information. Next, we can select the model, enter the text in the Chat interface, and click the "Send" button to get the model's response.

Of course, you can also point ChatUI at other services, such as Alibaba's DashScope (Lingji) large model service: simply change the service URL to https://dashscope.aliyuncs.com/compatible-mode, set the API Key, and manually enter the model you want to try, such as "qwen-long".

[Figure: Chatting with qwen-long]

6. Summary

In this article, we explained how to use Gradio.NET to quickly and easily create a large model demo interface for LLamaWorker. With Gradio.NET, we can rapidly build an interactive interface that helps developers understand and experience a model's behavior. We also showed how to use several Gradio.NET components and related features, and how to handle network requests and streaming responses. Hopefully, this real-world use case will help you learn and use Gradio.NET better.

References

[1] LLamaWorker: https://github.com/sangyuxiaowu/LLamaWorker?wt.mc_id=DT-MVP-5005195
[2] Gradio.NET: https://github.com/feiyun0112/Gradio.Net?wt.mc_id=DT-MVP-5005195
