
Java developer LLM in practice - using LangChain4j to build a native RAG system

Author: JD Cloud developer

1. Introduction

The currently popular ChatGPT is a pre-trained model, and training a large model takes a long time (the more parameters, the longer the training; a conservative estimate is several months, though teams with enough budget can throw more GPUs at it to shorten this). As a result, the knowledge it has learned is never up to date: even the latest GPT-4o can only answer based on data from before June 2023, which is almost a year ago. If you want the model to answer questions using knowledge from the past year, you need RAG (Retrieval-Augmented Generation) technology.

In addition, a company's internal private data cannot be published on the Internet, for reasons of data security and commercial interest, so GPT has no knowledge of it. If GPT needs to answer questions based on this private knowledge, RAG technology is again required.

This article walks through practical code examples and is intended to help Java engineers who have no hands-on experience with large models learn to use the LangChain4j framework for large-model development.

2. Basic concepts

2.1 What is RAG?

The core idea of RAG (Retrieval-Augmented Generation) is to combine traditional information retrieval (IR) technology with modern generative large models (such as ChatGPT).

Specifically, the RAG model first retrieves several relevant document fragments from a large document library or knowledge base before generating an answer. These retrieved fragments are then fed into the generative model as additional contextual information, resulting in more accurate and informative text.

The working principle of RAG can be broken down into the following steps:

1. Receive a request: First, the system receives a request from the user (e.g. to ask a question).

2. Information Retrieval (R): The system retrieves the most relevant document fragments from a large document library. The goal of this step is to find those documents that may contain answers or relevant information.

3. Augmentation (A): The retrieved document fragments are fed into the large model (such as ChatGPT) together with the original query, using an appropriate prompt. For example, if the original question is XXX and the retrieved information is YYY, the input to the large model would be something like: "Please answer XXX based on YYY."

4. Output Generation (G): The large model generates the final text answer based on the input query and retrieved document fragments, and returns it to the user.

The information retrieval in step 2 does not necessarily have to use a vector database, which can be a relational database (such as MySQL) or a full-text search engine (such as Elasticsearch, ES).

However, vector databases are widely used in large-model RAG scenarios because the task there is to find a few documents with high similarity to the query, rather than to look up one exact document (which is what MySQL and ES are good at).

Two documents with a high degree of similarity may not contain the same keyword. For example, sentence 1: "He was happy." Sentence 2: "He felt very happy." Although both describe the same happy mood, they do not contain the same keywords.

Conversely, two documents containing the same keyword may be completely unrelated. For example, sentence 1: "He likes apples." Sentence 2: "Apple is a big company." Although both contain the keyword "apple", the similarity between the two sentences is very low.
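Vector databases capture this kind of semantic similarity by comparing embedding vectors, most commonly with cosine similarity. The following is a minimal, self-contained sketch using made-up three-dimensional toy vectors (real embeddings, as shown later in section 3.2.4, have over a thousand dimensions); it only illustrates the distance calculation, not a real embedding model:

public class CosineSimilarityDemo {

    // Cosine similarity: 1.0 means same direction (very similar), values near 0 mean unrelated.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors invented for illustration only.
        double[] happy1  = {0.9, 0.1, 0.0};  // "He was happy."
        double[] happy2  = {0.8, 0.2, 0.1};  // "He felt very happy."
        double[] company = {0.1, 0.2, 0.9};  // "Apple is a big company."
        System.out.println(cosine(happy1, happy2));  // close to 1: semantically similar, no shared keyword needed
        System.out.println(cosine(happy1, company)); // much lower: unrelated despite a shared keyword being possible
    }
}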

2.2 Introduction to LangChain4j

LangChain4j is the Java version of LangChain.

The "Lang" in LangChain comes from Large Language Model.

"Chain" refers to chained execution: the functions of a language-model application are modularized and connected in series to form a complete workflow.

It is a development framework for large language models, which aims to encapsulate the details of interconnection with LLMs, simplify the development process, and improve the efficiency of LLM-based development.

For more information, see: https://github.com/langchain4j/langchain4j/blob/main/README.md
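As a first orientation, here is a minimal sketch of calling an OpenAI chat model through LangChain4j. The API key is read from an environment variable and the model name "gpt-4o" is only an example; both are assumptions, not values taken from this article's project:

import dev.langchain4j.model.openai.OpenAiChatModel;

public class HelloLangChain4j {
    public static void main(String[] args) {
        // Build the chat model; apiKey/modelName are placeholders to be replaced with your own values.
        OpenAiChatModel model = OpenAiChatModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("gpt-4o")
                .build();
        // Send a plain user message and print the model's reply.
        String answer = model.generate("Introduce LangChain4j in one sentence.");
        System.out.println(answer);
    }
}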

2.3 Large model development vs. traditional Java development

Large model development - the large model implements the business logic:

Before development, developers focus on data preparation (for training), model selection, and fine-tuning (to get better results that better match business expectations).

During development (most of the time), the focus is on how to communicate effectively with the large model (LLM) and use the LLM's capabilities to solve specific business problems.

Development centers on how to describe the problem (prompt engineering) so that the model can reason effectively, and on how to integrate the large model into existing business systems.

Traditional Java development - developers implement the business logic:

Before development, developers focus on system architecture choices (high concurrency, high availability), functional decomposition, modularization, and other design concerns.

During development (most of the time), specific algorithms, data storage, and so on are designed to implement the business logic for concrete business problems; coding is the main activity.

3. Hands-on practice

3.1 Environment Setup

3.1.1 Vector store (Chroma)

Windows:

First install Python; see: https://docs.python.org/zh-cn/3/using/windows.html#the-full-installer

Note: remember to configure the environment variables.

To verify, run:

python --version           

Then install Chroma (pip install chromadb); see: https://docs.trychroma.com/getting-started

To verify, run:

chroma run           

Mac:

First install Python:

brew install python           

Or download and install: https://www.python.org/downloads/macos/

To verify, run:

python --version           

Then install Chroma (same as on Windows); see: https://docs.trychroma.com/getting-started

To verify, run:

chroma run           

3.1.2 Integrating LangChain4j

<properties>
    <langchain4j.version>0.31.0</langchain4j.version>
</properties>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-core</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-embeddings</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-chroma</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>io.github.amikos-tech</groupId>
    <artifactId>chromadb-java-client</artifactId>
    <!-- chromadb-java-client is a separate project versioned independently of LangChain4j;
         fill in its latest released version here -->
    <version><!-- latest chromadb-java-client release --></version>
</dependency>

3.2 Programming

3.2.1 Project Structure

LangChain
├── core
│   ├── src
│   │   ├── main
│   │   │   ├── java
│   │   │   │   └── cn.jdl.tech_and_data.ka
│   │   │   │       ├── ChatWithMemory
│   │   │   │       ├── Constants
│   │   │   │       ├── ...
│   │   │   │       └── Utils
│   │   │   └── resources
│   │   │       ├── log4j2.xml
│   │   │       └── 笑话.txt
│   │   └── test
│   │       └── java
│   ├── target
│   └── pom.xml
├── parent [learn.langchain.parent]
└── pom.xml

3.2.2 Knowledge Gathering

Knowledge is generally collected as text files, Word documents, or PDF files obtained from the company's internal knowledge base or from the Internet. Here, 笑话.txt (a file of jokes) in the resources directory is used as the collected knowledge file.

URL docUrl = Main.class.getClassLoader().getResource("笑话.txt");
if(docUrl==null){
    log.error("未获取到文件");
}
Document document = getDocument(docUrl);
if(document==null){
    log.error("加载文件失败");
}           
private static Document getDocument(URL resource) {
    Document document = null;
    try{
        Path path = Paths.get(resource.toURI());
        document = FileSystemDocumentLoader.loadDocument(path);
    }catch (URISyntaxException e){
        log.error("加载文件发生异常", e);
    }
    return document;
}           
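For reference, when the knowledge base consists of many files rather than a single 笑话.txt, FileSystemDocumentLoader can also load a whole directory at once. A short sketch (the directory path is a placeholder, not a path from this project):

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import java.nio.file.Paths;
import java.util.List;

// Load every parseable file in the given directory as a Document.
List<Document> documents = FileSystemDocumentLoader.loadDocuments(Paths.get("/path/to/knowledge-dir"));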

3.2.3 Document Sharding

Use dev.langchain4j.data.document.splitter.DocumentSplitters#recursive.

It takes three parameters: the segment size (the maximum number of tokens in a segment), the overlap (the number of tokens shared with the preceding segment), and a tokenizer (used to split text into tokens so that token counts can be measured).

The overlap exists to reduce the loss of semantic continuity caused by splitting the original text purely by size, keeping each segment as semantically complete as possible.

DocumentSplitter splitter = DocumentSplitters.recursive(150,10,new OpenAiTokenizer());
splitter.split(document);           

About Tokens:

A token is the unit of text produced by tokenization, i.e. the words, subwords, and so on obtained when a piece of text is tokenized; the result depends on the tokenizer.

For example, "我喜欢吃苹果" ("I like to eat apples") may be split into 我 / 喜欢 / 吃 / 苹果 (4 tokens), or into 我 / 喜 / 欢 / 吃 / 苹果 (5 tokens).

ChatGPT uses the BPE (Byte Pair Encoding) algorithm for tokenization; see: https://en.wikipedia.org/wiki/Byte_pair_encoding

The tokenization results for the above text are as follows:

18:17:29.371 [main] INFO  TokenizerTest - 待分词的文本:我喜欢吃苹果
18:17:30.055 [main] INFO  cn.jdl.tech_and_data.ka.Utils - 当前的模型是:gpt-4o
18:17:31.933 [main] INFO  TokenizerTest - 分词结果:我 / 喜欢 / 吃 / 苹果           
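The token count of a piece of text can also be estimated locally with the same tokenizer used for splitting. A small sketch (the model name "gpt-4o" is only an example and matches the model shown in the log above):

import dev.langchain4j.model.openai.OpenAiTokenizer;

OpenAiTokenizer tokenizer = new OpenAiTokenizer("gpt-4o");
// estimateTokenCountInText runs the BPE estimation locally; no HTTP call to OpenAI is made.
int tokenCount = tokenizer.estimateTokenCountInText("我喜欢吃苹果");
System.out.println("approx. token count = " + tokenCount);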

Regarding the relationship between tokens and characters, GPT-4o's own reply was that one Chinese character generally corresponds to one to two tokens.

About the purpose of document splitting:

Since the number of tokens an LLM can accept in a single input is limited, an input that is too long will cause the LLM to not respond or to return an error directly.

Therefore, all the relevant knowledge cannot be handed to the LLM as input at once; the knowledge documents need to be split up and stored in the vector library.

Each time the LLM is invoked, the document fragment that is most relevant to the question raised is identified and fed into the LLM as a reference context.

If the input is too long, the LLM reports an error.

Although the error response indicates that 1,048,576 characters (1024K characters = 1M characters) are allowed, the official documentation gives a limit of 32K tokens, and one Chinese character generally corresponds to one to two tokens, so it is recommended to keep the input string no larger than 64K; in actual use, the input should be kept even shorter to ensure good performance.

The following are the upper limits of token input given by common LLMs:

Model name                          Token input limit (maximum length)
GPT-3 (davinci) 4096 tokens
GPT-3.5 (text-davinci-003) 4096 tokens
GPT-4 (8k context) 8192 tokens
GPT-4 (32k context) 32768 tokens
LLaMA (7B) 2048 tokens
LLaMA (13B) 2048 tokens
LLaMA (30B) 2048 tokens
LLaMA (65B) 2048 tokens
讯飞星火(SparkDesk) 8192 tokens
文心一言(Ernie 3.0) 4096 tokens
智源悟道(WuDao 2.0) 2048 tokens
Alibaba M6 2048 tokens
华为盘古(Pangu-Alpha) 2048 tokens
言犀大模型(ChatJd) 2048 tokens

There are 6 types of document splitting schemes in Langchain4j:

1. Character-based: split character by character (including whitespace characters).

2. Line-based: Split by line break (\n).

3. Paragraph-based: Split by two consecutive line breaks (\n\n).

4. Regex-based: split by a custom regular expression.

5. Sentence-based: uses Apache OpenNLP and supports English only, so it can be ignored here.

6. Word-based: split the text by whitespace characters.

The document-splitting process works as follows; the final output, segments, is of type List<TextSegment>.

First, a paragraph-based scheme cuts the entire document into parts.

Then each part is further split according to a chosen scheme (for example into sentences) while respecting the segment size (the maximum number of tokens per segment); at the same time the overlap is computed, and each segment is supplemented with overlapping content from the preceding segment according to the configured overlap (the number of overlapping tokens). A short sketch of inspecting the result follows below.
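Putting the earlier splitter snippet together with this, the split result can be collected and inspected. This fragment reuses the splitter and document variables from above and assumes log is the class's existing SLF4J logger:

List<TextSegment> segments = splitter.split(document);
// Print each segment so the effect of segment size and overlap can be checked by eye.
segments.forEach(segment ->
        log.info("segment ({} chars): {}", segment.text().length(), segment.text()));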

3.2.4 Text vectorization

The split knowledge fragments must be stored in the vector library so they can be retrieved later, and what a vector library holds are vectors, not raw text.

Text therefore needs to be vectorized, i.e. each string is converted into an N-dimensional array; in natural language processing (NLP) this process is called word embedding.

ChatGPT and OpenAI's embedding models are based on the Transformer architecture and run on OpenAI's servers, so every embedding call has to go through OpenAI's HTTP interface.

From the following example we can see that the OpenAI model used is text-embedding-ada-002 and that the resulting vector has 1536 dimensions.

OpenAiEmbeddingModel embeddingModel = new OpenAiEmbeddingModel.OpenAiEmbeddingModelBuilder().apiKey(API_KEY).baseUrl(BASE_URL).build();
log.info("当前的模型是: {}", embeddingModel.modelName());
String text = "两只眼睛";
Embedding embedding = embeddingModel.embed(text).content();
log.info("文本:{}的嵌入结果是:\n{}", text, embedding.vectorAsList());
log.info("它是{}维的向量", embedding.dimension());           
16:33:34.096 [main] INFO  cn.jdl.tech_and_data.ka.OpenAIEmbeddingTest - 当前的模型是: text-embedding-ada-002
16:33:36.201 [main] INFO  cn.jdl.tech_and_data.ka.OpenAIEmbeddingTest - 文本:两只眼睛的嵌入结果是:
[-0.010065761, -0.0021892355, 0.007137785, -0.004445936, -0.018269492, 0.014046189, 0.010794383, -0.021953074, -0.008244209, -0.027377252, 0.0036970759, 0.028308267, 5.81463E-4, 0.0021504434, -0.0068240734, 0.012217891, 0.020495834, 0.0020880383, 0.021278426, -0.024273867, -0.005393818, 0.0036970759, 0.011563482, -0.0073604193, -0.019861663, 0.0036768364, 0.025069952, 0.0060010017, -0.011023763, -0.021089524, 0.025123924, -0.013189386, -0.025677137, -0.011671426, 0.011239651, 0.013816809, -0.008217223, 0.0146938525, 0.016407462, 0.005839086, 0.008790675, 0.015476446, 8.72827E-4, 0.0318974, -0.011212665, 0.0010473924, -0.027566154, 0.005366832, -0.020306932, 0.022479301, -0.0064192843, -0.0074346308, -0.028740043, -0.012055975, -0.015071657, -9.799275E-4, -0.004068133, -0.011900807, -0.013783077, -0.005191423, -4.3978673E-4, 0.0073199403, -0.023261894, 0.0011991884, -0.030683031, 0.0053330995, 0.0060819597, -0.0063821785, -5.401408E-4, -0.011219411, 0.029927425, 0.020050565, 0.005994255, -0.016717799, 0.04725241, -0.013735851, -0.027984437, -0.037645407, -6.0507574E-4, -0.002486081, 0.017365463, -0.031222751, -0.0086287595, 0.01062572, -0.016960673, 6.588368E-5, -0.007083813, 0.029846467, -0.02215547, -0.025312826, 0.009283168, 0.013533456, -0.007562814, 0.0075898, -0.023046006, 0.031708498, 0.015355009, 0.04325849, -0.011597214, -0.017648814, 0.004995775, -0.015004192, -0.031384666, 0.0016655395, -0.014437486, -0.002553546, 0.018755239, -0.002821719, 0.016461432, -0.015328023, -0.016461432, 0.004800127, 0.022816624, -0.026878012, -0.023032513, 8.2686654E-4, 0.025474742, -0.011806356, -0.04123454, -0.011482524, 0.0049249367, 0.010861848, 0.0069084046, -0.015058164, 0.022182455, 0.020603778, -0.019133043, -2.224233E-4, 0.0052892473, 0.0039669354, 0.04010113, -0.0034322762, 0.02782252, 0.016420955, -0.0103626065, 0.02841621, -0.007900138, 0.01867428, -0.019875156, -0.0020593656, 0.0045336406, 0.033246696, -0.019510847, -0.0082509555, 0.0024405424, 0.022600738, 0.004155837, 0.027134378, 0.0050058947, -0.0032400014, 0.014909741, -0.010227677, 0.022317385, 0.0035149206, -0.0036700899, 0.0026716096, 0.004617972, 0.02202054, -0.030197285, 0.0147613175, -0.01835045, 0.0144105, 0.031384666, -0.015894728, 0.02841621, 0.023315866, -0.01200875, 0.008878379, -7.5349846E-4, 0.0050295074, -0.0030477263, 0.013385034, 8.340347E-4, 0.005464656, 0.023639698, 0.021426849, 0.0020559924, -0.0013518278, -0.0071850107, -0.012319089, -0.04951923, -0.016326504, -0.0025973981, -0.0010473924, 0.0013956799, -0.015867742, -0.008149759, -0.0035081743, 0.009924085, 0.0011915986, 0.014032697, 0.013128667, 0.011064242, -0.0013341182, -0.6079396, -0.014235091, -0.015530418, -0.01925448, 0.0027643738, 0.02628432, 0.012163919, 0.018242506, -0.020239467, -0.03470394, -0.0027930464, -0.008595027, -0.0023983768, -8.711404E-4, -0.013762837, -0.01414064, -0.014073175, -0.009512549, 0.0015761484, -0.022911076, -0.025110431, 0.011475777, 0.011327355, 0.010295142, 0.035999265, -0.0025147537, 0.019632282, -0.0058289664, -0.019794198, 0.044149023, -0.030575087, 0.01907907, -0.0147613175, -0.0038454987, 0.0539989, -0.009330394, -0.011037256, 0.02071172, 0.012845315, 0.041207556, -0.015705826, -0.004182823, 0.0073536728, -0.003724062, -0.00466857, -0.0070163487, 0.012238131, 0.0028099127, -0.008210477, -0.0083588995, -0.011934538, -0.006679024, 0.011266637, -0.013978725, 0.016528897, -0.0016950553, 0.020306932, -0.022128483, 0.01681225, -0.0022314012, 0.02065775, 0.005980762, -0.017324984, -0.013486232, 0.027930465, 0.010092747, 
-0.031735484, 0.013884274, 0.011745637, 4.0204858E-4, 0.013816809, 0.008284688, 0.00772473, 0.024732629, 0.012163919, 0.019200508, 0.018404422, -0.004941803, -0.004041147, -0.014302556, -0.010254663, 0.002772807, -0.028038409, 0.0015213332, 0.002821719, -0.029603593, -0.018930648, 0.0029701418, 9.504116E-4, 0.0036397309, -0.0016545764, 0.009310154, 0.0016579496, -0.011786116, -0.010112987, 0.019052085, -0.037402533, 0.0029667686, 0.026635138, -9.687537E-5, -0.0017228846, 0.0051745567, 0.015503432, -0.008541055, 0.01101027, 0.022101497, -0.019133043, -0.020738706, 0.042071104, -0.026081925, 0.003548653, -0.01694718, 0.008851393, 7.215581E-5, 0.01577329, -0.022830117, -0.004985655, 0.012926273, 0.021926088, 0.020050565, 0.042152062, -0.017136082, 0.02814635, -0.0010060702, 0.034218192, 0.024260374, -0.056346674, -0.020374397, -0.003442396, -0.017149575, -0.01427557, 0.0071175457, 0.004705676, -0.021453835, -0.010086001, -0.00860852, -0.02269519, -0.023099978, 0.013762837, -0.006625052, -0.017230533, -0.0010414892, 0.023342852, -0.020819664, 0.014343035, -0.0022448942, -0.017230533, -0.013492978, 0.006193277, -0.010315382, 0.0029718284, -0.010207438, -0.01617808, 0.0010229364, 0.00531286, 0.0104300715, -0.0050969725, -0.015881235, -0.014896248, -0.008871633, 0.009114507, 0.0147478245, -0.017999632, -0.01984817, -0.025258854, -0.024489755, 0.0029836346, 0.023680177, -0.03456901, -0.038994707, -0.00949231, -0.008541055, -8.083137E-4, 0.019996593, -0.010038775, 0.007913631, -0.0013054456, -0.005781741, 0.0060954527, -0.043717247, 0.0048372326, 0.0014951905, -0.017136082, 0.009735184, -0.0017827597, 0.0034255297, -0.0070365877, -0.014302556, -0.041072626, -0.008156505, -0.021197468, 0.012825076, 9.2848553E-4, 0.018998113, 0.022641217, 0.0025113805, 0.027174857, 0.02165623, 0.028173337, 0.03141165, 0.038859777, -0.025636658, 0.0026581166, 0.0017068617, 0.0022381477, -0.030278241, 0.0028368987, -0.0126901455, -7.189227E-4, 0.014329542, -0.013870781, -0.013047709, -0.0039163367, 0.0045167743, -0.0073604193, 0.031924386, 0.0060414807, 0.037024733, 0.0011393133, -0.009876859, -0.017770251, -0.013769584, 0.008102533, -0.0103491135, -0.0027610005, -0.0238286, 0.018148055, 0.017338477, 0.015193093, -0.0043346193, -0.012055975, -0.010146719, 6.6663744E-4, 0.01994262, -0.03335464, -0.021804651, 0.019915635, -0.0077989413, 0.026513701, 0.02351826, -0.0011671425, 0.023207922, 0.005036254, -0.00466857, -0.005559107, 0.009957817, 0.041126598, 0.0041254777, -0.0063720588, 0.020104537, 0.0047798874, -0.0060516004, -0.022924569, -0.005602959, 0.0016275904, -0.030413171, 0.031843428, 0.017797237, 0.0014074863, 0.046415843, 0.019025099, 0.017756758, -0.0060381074, 0.017014645, -3.012729E-4, 0.023855584, 0.009748677, -0.021264933, -0.025177896, 0.009580014, -0.0071310387, 0.018863183, 0.011867074, 0.0074278843, 0.013992218, -0.00240175, 0.0084128715, 8.652372E-4, 0.034218192, -0.011603961, -0.03629611, -0.027768549, -0.014977206, 0.0050193877, -0.0035958786, -0.03850896, -0.009809394, -0.026810547, -0.009053788, 0.04123454, 0.0071175457, -0.0057648746, -0.026176376, 6.2278524E-4, -0.003629611, -0.0074683633, 0.034757912, -0.012588948, -0.022641217, -0.01076065, -0.0056097056, -0.0082644485, 0.013439006, -0.015651854, -0.001314722, -9.116193E-4, -0.02745821, -0.014963713, 0.005893058, -0.02469215, 0.002415243, -0.0055388673, -0.008966084, -0.011651186, -0.0056468113, -0.011597214, -0.024435783, 0.004442563, 0.0240175, 0.0025096938, -0.012163919, -0.022357864, -0.027566154, 0.0022718802, 0.02782252, 0.015597883, 
-0.005586093, 0.011131707, 0.018593322, -0.029576607, -0.03891375, -0.009256183, 0.013965232, 0.0020391264, -0.0035149206, -1.875313E-4, -0.0044526826, 0.011813102, 0.02909086, 0.0012784597, -0.014397007, -0.015543911, 0.004017534, 0.023221415, -0.026904998, 0.007110799, 0.041585356, -0.00466857, -0.015935207, -0.026432743, 0.019753719, 0.048196916, 0.031114807, 0.0018063724, 0.0045842393, 4.93337E-4, -0.0127845965, -0.021966567, -0.020145016, -0.022600738, -0.005808727, 0.010807876, 0.0051374515, 0.019767212, 0.010922565, 0.035729405, 0.021831637, -0.004391964, -0.0038151394, 0.001276773, 0.01531453, 0.0016537331, -0.014882755, -4.667727E-4, 0.034110248, 0.0048068734, -0.01352671, -0.012764357, 0.0071040527, 0.027080406, -0.01994262, -0.03062906, 0.002472588, -0.013465992, 0.01604315, -0.026243841, 0.008203731, -0.0145859085, 0.009775663, -0.017244026, -0.0023545246, -0.029603593, -0.041585356, -0.005285874, 0.016609855, 0.004483042, -0.01798614, -0.037186645, -0.010180452, 0.013270344, 0.023612712, -0.006662158, -0.01781073, 0.031600554, 0.015247065, -0.0049485494, -0.0072929543, -0.025380291, -0.0013535144, 0.02165623, 2.8419585E-4, -0.027768549, -0.011334102, 0.008723211, 0.039264563, 0.017554363, 0.017662307, -0.0074953493, 0.016609855, 0.0074211378, -0.008966084, 0.007927124, -0.016420955, -0.011563482, 0.008183491, -0.036188167, -0.016555883, -0.023504768, 0.0013965232, 0.0027255814, 0.022074511, 0.030710017, -0.01835045, -0.0047933804, 0.0043953373, -0.02415243, 0.011300369, 0.008136266, 0.015381995, 0.0062641148, -0.009526042, 0.008129519, -0.016366983, -0.028901959, 0.0074683633, -0.023140457, 0.016407462, 0.002857138, -0.003889351, -0.01241354, 0.010099494, -0.038185127, 0.009013309, 0.014370021, -0.008014829, 0.008008082, -0.005548987, -0.011023763, -0.036862817, -0.015233572, -0.026500208, 0.017095603, 0.0070635737, 0.004098492, -0.0125956945, -0.0063282065, 0.015935207, -0.0031101315, 0.05132729, -0.028092379, -0.0060279877, 7.7205134E-4, -0.008642253, 0.020630764, 0.02084665, 0.007697744, -0.026810547, -0.007852913, -0.016434446, -0.06044854, 0.006554214, -0.034595996, -0.0016790325, 0.015543911, 0.0320863, 0.008365646, 0.0125956945, 0.0031860294, -0.012366314, 0.014909741, -0.008210477, 0.002295493, -0.019308452, 0.020819664, -0.0088379, 0.007245729, 0.010268156, 0.0103760995, -0.008891872, 0.0019767212, -0.013762837, 0.015597883, 0.0026716096, -0.048331846, -0.0048777116, -0.027768549, 0.02442229, 0.0038522452, 0.010409832, 5.578503E-4, 0.025812067, -0.008885126, 0.010463804, -0.0029752017, 0.012737371, -0.03643104, 0.03195137, -0.041153584, 0.028767029, -0.006102199, -0.009593507, 0.01604315, 0.007576307, 0.0125822015, 0.02628432, 0.0148287825, 0.01391126, -0.011246397, -0.0014631448, 0.017594842, -0.0495732, 0.0019025098, -0.008183491, -0.014059682, -0.001480011, -0.017257519, -0.030952891, -0.04029003, -0.0021234574, 0.0046989294, -0.0148692615, 0.018890169, -0.02125144, -0.032545064, -0.0015828949, -0.011172186, 0.03443408, -0.012265117, 0.01112496, 0.021858623, 0.004590986, -0.021359384, -0.048115958, 9.1752247E-4, 0.0076640113, 0.0074143913, 0.04622694, -0.007515589, -0.0124607645, 0.0066284253, 0.0024118698, -0.0072659687, -0.026041446, 0.015220079, 0.009074028, -0.0049215634, 0.0028436452, -0.02695897, -0.0033378254, -0.0012852062, -0.018768732, -0.021669723, -0.024449276, 0.0020104537, -0.0077854483, -0.022587245, -0.029576607, 0.00442907, 0.01667732, -0.023369838, 0.017648814, -0.013304076, -0.0036903294, 0.012285356, 0.01291278, 0.031573568, -0.021291919, 
-0.0066419183, -0.0010785949, -0.0075223353, -0.011340848, -0.01848538, -0.020576792, 0.04884458, -0.0030544729, -0.010679692, -0.007448124, 0.021534793, 0.025636658, -0.004290767, 0.0062033967, -0.0027255814, 0.014343035, 0.003862365, 0.019915635, 0.0060144947, -0.031870414, -0.015247065, -0.026257334, -0.010328875, -0.009998296, -0.011131707, 0.006996109, -0.014167626, -0.0106459595, 0.013115174, -0.015301037, 0.013992218, 0.0015120568, 0.006871299, 0.0053567123, 0.009350633, -0.036512, 0.017324984, -0.02269519, 0.0103895925, -0.030683031, 0.001566872, 0.017163068, -0.025663644, 0.008635506, -0.03127672, -0.0025737856, -0.033381626, 0.0025467996, -0.0048372326, 0.013924753, -0.001669756, 0.04109961, -0.012420286, -0.014437486, -0.016744785, -0.020927608, 0.01771628, -6.923584E-4, 0.016731292, -0.009795902, -0.02084665, -0.0046854364, -6.19412E-4, -0.006793714, 0.0048372326, -0.025501728, -0.03575639, -0.012696892, 0.027120885, -0.013263597, -0.02768759, -0.018930648, -0.013263597, -0.011037256, 0.0351627, 0.01785121, 0.031735484, 0.023099978, 0.004692183, 0.011313862, 3.6747282E-4, -0.010160212, -0.017149575, 0.0047529014, -0.013236611, -0.021548286, -0.009053788, -0.0072389827, 0.020455355, -0.011192425, -0.020806171, -0.0026867893, -0.0018873303, -0.007866406, 0.005967269, 0.013722358, 0.016097123, -0.0031016984, 0.017122589, -0.012352821, 0.012238131, -0.007124292, -0.012312342, -0.036862817, -0.010895579, 0.0074953493, -0.010794383, 0.005629945, 0.008048561, -0.021278426, -0.027606633, 0.0031961491, 0.0025248735, 0.0035553996, 0.014302556, 0.0055962126, -0.020091044, 0.011199172, -0.015017685, 0.011158693, 0.0056603043, -0.013074695, -0.01264292, -0.006193277, -0.0037949, 0.020428369, -0.012022243, 0.004722542, 0.009040295, -0.005090226, 0.03216726, -0.040209074, -0.0061190655, 0.004182823, -0.008196984, 0.0103895925, 0.002491141, 0.002696909, 0.03089892, 0.0025602926, -0.010794383, 0.009687958, 0.015530418, -0.011900807, -0.0083049275, 0.01938941, -0.009060535, 0.027660605, 0.007792195, 3.8291597E-5, -0.015071657, 0.011813102, 0.005599586, 0.012433779, -0.031465624, 0.024597699, 0.025245361, -0.033381626, 0.003946696, 2.0935199E-4, -0.027309787, -0.01735197, -0.01343226, 0.014046189, -0.016137602, 0.007906885, -0.0042536613, 0.017176561, -0.030305227, -0.033300668, 0.0035823856, -0.0065643336, -0.008642253, 0.28777823, -0.014302556, -0.010011789, 0.017891688, -0.010288396, 0.017271012, 0.02397702, 0.009377619, 0.016272532, 0.02388257, -0.018957634, -0.012440526, -0.008716464, -0.0017692667, 0.007562814, -0.026027953, -0.018633801, -0.00492831, -0.025407277, -0.035000782, 0.003022427, -0.017621828, -0.014882755, -0.014977206, 0.03748349, -0.016987659, -0.013857288, -0.012150426, -3.0602969E-6, 0.008136266, -0.009013309, -0.008844647, -0.008675985, -0.0151796, -0.006544094, -0.0017911928, 0.017190054, 0.0085343085, 0.03850896, 0.0103356205, -0.020279946, 0.0072389827, 0.004327873, 0.030251255, -0.014181119, 0.00947207, -0.013600921, -0.020819664, -0.012892541, 0.0146938525, -0.021858623, 0.007778702, 0.028821, 0.02501598, -0.017190054, 0.0073604193, 0.040397976, -0.0048473524, -0.007913631, 0.0075493213, 0.013506471, 0.033273682, -0.026702603, 0.016893208, -0.010888833, 0.016151095, -0.031708498, -0.0106594525, -0.013337809, -0.034595996, -0.0022162215, -0.013331062, -0.016259039, 0.0041322242, -0.012109947, -0.015206586, 0.019564819, 7.5349846E-4, 1.8247144E-4, 0.046065025, -0.012224638, -0.00126581, 4.326186E-4, -0.0070635737, -0.016164588, -0.028928945, 0.0077584623, -0.006517108, 
-0.0033951704, 0.013553696, 0.012791343, -0.038401015, 0.019038592, -0.018552843, 0.022263413, 0.011482524, -0.030332213, 0.033786416, -0.012292103, 0.02528584, -0.021062538, -0.034353122, 0.042313978, 0.022749161, -0.0057109026, -0.0146398805, 0.0014445919, 0.009040295, 0.020225974, -0.027566154, -0.015152614, -0.029630579, 0.0037341816, 0.011941285, -0.0106324665, -0.0053297263, 0.007684251, -0.021899102, -0.014370021, -0.0046584504, 0.044337925, -0.011496017, 0.0019902142, 0.00874345, -0.015530418, -0.017824223, -0.017635321, 0.006075213, 0.02411195, -0.043933135, 0.015408981, -0.027201843, 0.030143313, 0.025906518, 2.4160864E-4, -0.005714276, 0.028470183, -0.032895878, -0.022870596, 0.002767747, 0.012035736, -0.03003537, 0.01508515, 0.017419435, 0.017594842, -0.014491458, 0.019429889, 0.0054882686, 0.041288514, -0.0017001152, -0.036350083, 0.0042435415, 0.017905181, 0.0032045823, -0.0040344005, -0.015489939, -0.0401551, -0.026675617, 8.175901E-4, 0.018971127, -0.046388857, -6.396515E-4, 0.040802766, -0.021777665, -0.010011789, 0.007657265, -0.17238629, 0.009890352, 0.0318974, 9.394486E-4, 0.014882755, 0.011968271, -0.0056872903, 0.010092747, -0.0024827078, -0.004590986, 0.03416422, 0.0013299016, -0.019753719, -0.023491275, -0.0057210224, -0.0061224387, -0.019915635, 0.0018940767, 0.023329359, -0.00607184, 0.04123454, -0.028794015, 0.010032029, 0.0029009902, 0.0147208385, 0.013742598, 0.013358048, 0.031006863, -0.0011848521, -0.0042435415, -0.01087534, 0.0148017965, 0.038616903, 0.021804651, 0.024030993, -0.009114507, -0.0032754203, -5.1905797E-4, -0.019767212, -2.1346312E-5, 0.007825927, 0.030979877, 0.015435967, 0.014046189, -0.025798574, 0.0416933, 0.022897583, 0.005191423, -0.0028588246, 0.0024169297, 0.007711237, -0.019834677, 0.017702786, 2.9178563E-4, 0.031249737, -0.0025737856, 9.411352E-4, 0.02224992, -0.0104975365, 0.015867742, -0.01112496, -0.0085747875, 0.0016183141, -0.0021588765, 0.011543242, -0.028011423, -0.01454543, -0.006034734, -0.019497354, 0.010787636, -0.01621856, -0.03521667, 0.029279761, -0.02601446, 0.018822704, 0.011165439, -0.018768732, 0.015867742, 0.011327355, 4.0563266E-4, -0.018256, 0.028605113, -0.0063585658, 0.015004192, 0.011657933, 0.011199172, -0.0027761802, -0.0041322242, 0.006267488, -0.033273682, 0.040397976, -0.036592957, 0.011590468, -0.0037780337, -0.011273383, 0.018890169, 0.008379139, 0.0103356205, 0.02542077, -0.015611376, -0.015516925, -0.0031641033, -0.0022971795, 0.02990044, 0.007906885, 0.013715612, 0.026446236, 0.02424688, 0.014370021, -0.004442563, 0.001862031, -0.010409832, 0.016717799, 0.020495834, 0.023774628, 0.010139973, -0.022236427, 0.0086287595, 0.004995775, -0.0040074144, -0.016029658, -0.013223118, -0.013654893, 0.003250121, -0.0050025214, -0.033516556, -0.06973171, -0.033786416, 0.022776147, 0.02569063, -0.006129185, 0.014127147, 0.012244877, 0.023842093, -0.011536496, 0.033948332, -0.0082509555, -0.016286025, 2.1030071E-4, 0.0026463103, -0.00506324, -0.012366314, 0.027336773, -0.022803131, -0.0085882805, 0.018309971, 0.0050564934, 0.006679024, 0.0054106843, -0.0058998046, 0.010463804, -0.0055928393, -0.018552843, 0.043285474, 0.004709049, 0.015827263, 0.0148017965, -0.018229013, -1.6486732E-4, 3.695811E-4, 0.0015921714, 0.0103626065, -0.024368318, -0.019996593, 0.023086485, -0.032653008, 0.0035857589, 0.0018772106, -0.013331062, -0.023099978, -0.030278241, 2.6922708E-4, 0.0020188869, 0.025029473, -0.016488418, -0.0059976284, -0.018863183, 0.005643438, 0.0066958903, 0.008055308, 0.02828128, -0.018971127, 0.014248584, 
0.004577493, -0.011846835, 0.005785114, 0.019605298, -3.383786E-4, -0.01635349, 0.025299333, 9.4197853E-4, -0.015935207, -0.00885814, -0.0074008983, -0.0013079755, -0.019834677, -0.010531269, 0.044202995, 0.008075547, 0.009951071, -0.022830117, -0.008777182, 0.0026952224, -0.023963528, 0.003017367, -0.043231502, -0.017176561, -0.024179416, 0.010720171, -0.0071040527, 0.006227009, 0.0025653525, 0.008682732, -0.018620308, 0.021831637, -0.02895593, -0.027485196, 0.0053061135, 0.031519596, -0.012872301, -0.012420286, 0.0032433746, 0.004445936, -0.024665164, 0.014923234, -0.011017016, 0.011698412, -0.03276095, -0.0482239, 0.027903479, 0.0084533505, -0.008129519, 0.027336773, 0.014477965, 0.019402903, -0.022303892, 0.006253995, -0.010598734, -0.028821, 0.0028672577, 0.0037105689, -0.0013467679, -0.037726365, -0.044958603, 0.013729105, 0.0041187312, 0.016501911, 0.02828128, 0.009735184, 0.0105380155, 0.0145859085, -0.0031016984, -0.004897951, -0.017567856, -0.013250104, -8.8547665E-4, -0.008662492, 0.011691665, 0.024773108, -0.023437303, -0.0051104655, 0.010828115, -0.01735197, -0.031330694, 0.027876493, 0.003973682, 0.0054241773, -0.0023376583, -0.030817961, -0.026176376, 0.015827263, 0.005272381, -0.029630579, 0.0146398805, -0.035378587, -0.0021858625, 0.060286626, -0.0040883725, 0.0040782527, 0.017999632, -0.0036734631, -0.008878379, -0.007751716, -0.027741563, 0.018741746, -0.009330394, 0.021103017, -0.032005344, 0.017190054, 0.013256851, 0.009316901, -0.0034120367, 0.030251255, -0.0021285173, 0.017203547, 0.0059132976, -5.498388E-4, -0.022506287, 0.009863366, 0.020752199, 0.018498873, 0.022857103, 0.027579647, -0.008999816, 0.026270827, -0.006942137, -6.232069E-4, 0.022870596, 0.0044661756, -0.008797421, -0.013081442, -0.006544094, 0.020779185, 0.018552843, -0.009478817, 0.029873453, -0.008891872, 0.021224454, -0.0025737856, 0.028497169, -0.03643104, -0.002870631, 0.005903178, 0.054187797, -0.0050463737, -0.011219411, 0.028362239, 0.0199831, 0.0024540352, -0.0023325984, -0.0035790124, -0.0084398575, -0.01708211, 0.013573935, -0.01012648, -5.5321207E-4, -0.025987474, 0.010032029, 0.008622013, -0.009384366, 0.0073874053, 0.031006863, 0.00491819, -0.008190238, -0.018539352, -0.030278241, -0.0126091875, -0.0060414807, 0.026581166, 0.011718651, 0.01076065, -0.018282985, 0.023531754, 0.0038050197, 0.011239651, -0.012332582, 0.015570897, 0.023734149, -0.0017287878, -0.013209625, -0.018417915, -0.022438822, -0.005586093, -0.0032956598, -0.007987843, 0.03570242, -0.03783431, 0.019052085, -0.017109096, -0.004843979, 0.01898462, -0.007859659, 2.0534625E-4, -0.009694705, -0.012872301, -0.026108911, -0.009121253, -0.007846166, -0.033138752, 0.0021200841, 0.0017827597, -0.03975031, 0.012346075, -2.6754045E-4, 0.01907907, -0.009451831, -0.01150951, 0.030116327, -0.0021892355, 0.018998113, 0.02768759, -5.869445E-4, -0.0037173154, 0.00935738, 0.010726918, -0.0062573683, -0.008662492, 0.026027953, 0.014221598, -0.006611559, -0.028038409, 0.021602258, 0.010382846, -0.0030072473, 0.0050328807, 0.025542207, -8.2981813E-4, -0.0056333183, 0.02976551, -0.012750864, -0.02038789, 0.034865856, -0.013047709, 2.5362583E-4, -0.00390959, -0.019092564]
16:33:36.203 [main] INFO  cn.jdl.tech_and_data.ka.OpenAIEmbeddingTest - 它是1536维的向量           

3.2.5 Vector Repository Storage

A vector database, also known as a vector store or vector search engine, is a database specifically designed to store and manage vectors (fixed-length lists of numbers) and other data items.

These vectors are mathematical representations of data points in a high-dimensional space, where each dimension corresponds to a feature of the data. The main purpose of vector databases is to achieve efficient similarity search through the Approximate Nearest Neighbor (ANN) algorithm.

Before using the vector library, you need to start chromadb, connect to it through the client SDK, and create a data storage container, i.e. a collection (analogous to a table in MySQL).

Client client = new Client(CHROMA_URL);
EmbeddingFunction embeddingFunction = new OpenAIEmbeddingFunction(API_KEY, OPEN_AI_MODULE_NAME);
client.createCollection(CHROMA_DB_DEFAULT_COLLECTION_NAME,null,true, embeddingFunction);           

After embedding, connect to the vector library through the SDK, bind the vector (Embedding) and the text segment (TextSegment), and store them in the vector library.

EmbeddingStore<TextSegment> embeddingStore = ChromaEmbeddingStore.builder().baseUrl(CHROMA_URL).collectionName(CHROMA_DB_DEFAULT_COLLECTION_NAME).build();
segments.forEach(segment->{
  Embedding embedding = embeddingModel.embed(segment).content();
  embeddingStore.add(embedding, segment);
});           
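Embedding the segments one by one sends one HTTP request per segment. LangChain4j also provides batch methods (embedAll on the embedding model, addAll on the embedding store); a sketch reusing the embeddingModel, embeddingStore, and segments variables from above:

// Embed all segments in one call and store the vectors and their segments as a batch.
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
embeddingStore.addAll(embeddings, segments);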

3.2.6 Vector library retrieval

In order to query similar pieces of knowledge in the vector library, the text used as the query also needs to be vectorized, and the method is the same as above.

Embedding queryEmbedding = embeddingModel.embed(qryText).content();            

During retrieval, the vector library uses the ANN algorithm to find the closest text fragments (depending on the query input parameters) to the queryEmbedding.

EmbeddingSearchRequest embeddingSearchRequest = EmbeddingSearchRequest.builder().queryEmbedding(queryEmbedding).maxResults(1).build();
EmbeddingSearchResult<TextSegment> embeddedEmbeddingSearchResult = embeddingStore.search(embeddingSearchRequest);
List<EmbeddingMatch<TextSegment>> embeddingMatcheList = embeddedEmbeddingSearchResult.matches();
EmbeddingMatch<TextSegment> embeddingMatch = embeddingMatcheList.get(0);
TextSegment textSegment = embeddingMatch.embedded();           

The search request has four parameters: the query embedding (queryEmbedding), the maximum number of results (how many nearest vectors to return), the minimum score (candidates scoring below this value are filtered out), and a metadata filter (candidates are filtered by their metadata).
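For reference, here is a sketch of a request that sets all four parameters. The maxResults and minScore values are arbitrary examples; the metadata key file_name comes from the metadata attached by the document loader (visible in the logs of section 3.3); and whether minScore and metadata filters are honored depends on the concrete embedding-store implementation:

import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
        .queryEmbedding(queryEmbedding)                          // the vectorized query text
        .maxResults(3)                                           // return the 3 nearest segments
        .minScore(0.7)                                           // drop candidates scoring below 0.7
        .filter(metadataKey("file_name").isEqualTo("笑话.txt"))  // keep only segments from this file
        .build();
List<EmbeddingMatch<TextSegment>> matches = embeddingStore.search(request).matches();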

3.2.7 Interacting with LLMs

Define a Prompt template that tells the LLM to answer the question using the given context (in English: "Answer based on the following information:\n{{context}}\nQuestion:\n{{question}}"):

基于如下信息进行回答:\n{{context}}\n提问:\n{{question}}           
PromptTemplate promptTemplate = PromptTemplate.from("基于如下信息进行回答:\n" +
                "{{context}}\n" +
                "提问:\n" +
                "{{question}}");
Map<String, Object> variables = new HashMap<>();
variables.put("context", textSegment.text());
variables.put("question", QUESTION);
prompt = promptTemplate.apply(variables);           

Assemble the Prompt into the input for the request to the LLM:

OpenAiChatModel openAiChatModel =  OpenAiChatModel.builder().apiKey(API_KEY).baseUrl(BASE_URL).modelName(OPEN_AI_MODULE_NAME).temperature(TEMPERATURE_NO_RANDOM).build();
UserMessage userMessage = prompt.toUserMessage();
Response<AiMessage> aiMessageResponse = openAiChatModel.generate(userMessage);
String response = aiMessageResponse.content().text();           

3.3 Testing and Verification

17:47:57.060 [main] INFO  cn.jdl.tech_and_data.ka.Utils - 当前的模型是:gpt-4o
17:47:57.067 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - [without RAG]userMessage=UserMessage { name = null contents = [TextContent { text = "请给我讲一个关于冰淇淋的笑话" }] }
17:48:00.129 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - 不使用RAG,直接询问AI:[请给我讲一个关于冰淇淋的笑话]
 得到的回答是:[当然!这里有一个关于冰淇淋的笑话,希望你会喜欢:

为什么冰淇淋总是很开心?

因为它有很多“甜蜜”的朋友!]
17:48:00.269 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - version of chroma db is : 0.5.0
17:48:02.550 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - text:听说夏天和冰淇淋更配哦?不存在的,冰淇淋在我嘴里早就化了,连味道都没来得及品尝!, embeddingId=bd89a581-e764-47fc-967c-68899d6be904
17:48:03.811 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - text:有一次,一个女孩因为太想要一支冰激凌而得不到,结果她竟然在超市的冰激凌货架前大声呼唤:“天啊,我渴望你!你能把冰激凌赐给我吗?”你猜怎么着?她得到了那支冰激凌,但是当她回家后发现里面全是空的!, embeddingId=6f84ab62-635f-4bd0-a3d2-ff55d00c4b01
17:48:05.534 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - text:冰淇淋的冷,你绝对想象不到,它不仅冷静而且淡定的在我嘴里说着:“我是你的心头痛。”, embeddingId=c9d15213-2146-4f7b-8b23-c95d858b4ded
17:48:06.843 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - text:半价冰淇淋妈妈带着小明去买冰淇淋,小明兴致勃勃地挑选了一个自己喜欢的口味。妈妈结账时,发现价格是双倍的,于是问店员为什么。店员说:“这个冰淇淋是两种口味的双球冰淇淋。” 小明听了,恍然大悟:“难怪它长了两只眼睛!”, embeddingId=042ee87a-bc0c-4ba2-aea5-dbef45c32d33
17:48:08.452 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - text:有一次我家的冰箱坏了,里面的冰激凌都开始融化。我看到了一个很棒的创意——把所有的冰激凌球放在一个碗里,然后放入冰箱冷冻。结果我得到了一个巨大的冰激凌球混合物,这可是一道奇特的甜点!

麒麟飞到北极会变成什么? 答案:冰淇淋。, embeddingId=e7436694-821e-4559-89d3-2f09be2e172a
17:48:08.453 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - 通过关键词:【两只眼睛】查询向量库
17:48:10.252 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - 查询到的文档信息是 matchedTextSegment: TextSegment { text = "半价冰淇淋妈妈带着小明去买冰淇淋,小明兴致勃勃地挑选了一个自己喜欢的口味。妈妈结账时,发现价格是双倍的,于是问店员为什么。店员说:“这个冰淇淋是两种口味的双球冰淇淋。” 小明听了,恍然大悟:“难怪它长了两只眼睛!”" metadata = {absolute_directory_path=/Users/qihaizhi/newJavaEngineerOrientation/LangChain/core/target/classes, file_name=笑话.txt, index=3} }
17:48:10.258 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - [with RAG]userMessage=UserMessage { name = null contents = [TextContent { text = "请基于如下信息进行回答,尽量使用信息中的所有词语:
半价冰淇淋妈妈带着小明去买冰淇淋,小明兴致勃勃地挑选了一个自己喜欢的口味。妈妈结账时,发现价格是双倍的,于是问店员为什么。店员说:“这个冰淇淋是两种口味的双球冰淇淋。” 小明听了,恍然大悟:“难怪它长了两只眼睛!”
提问:
请给我讲一个关于冰淇淋的笑话" }] }
17:48:16.663 [main] INFO  cn.jdl.tech_and_data.ka.RagChat - 使用RAG,询问AI:[请给我讲一个关于冰淇淋的笑话],
 得到的回答是:[有一天,妈妈带着小明去买冰淇淋。小明兴致勃勃地挑选了一个自己喜欢的口味。妈妈结账时,发现价格是双倍的,于是问店员为什么。店员说:“这个冰淇淋是两种口味的双球冰淇淋。” 小明听了,恍然大悟:“难怪它长了两只眼睛!”

这个笑话是不是很有趣呢?冰淇淋不仅美味,还能带来欢乐!]           

4. Summary and outlook

Through the practical examples in this article, we have shown in detail how to use the LangChain4j framework to build a local large-model application based on RAG technology.

Through continuous exploration and optimization, the potential of RAG technology in large-scale model applications will be more fully realized, providing more intelligent and efficient solutions for various business scenarios.

Hopefully, this article will provide Java engineers with a clear practical guide to help you go further on the road of large model development.
