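1. Introduction
ChatGPT, one of the most popular LLMs today, is a pre-trained model, and training a large model takes a long time (the more parameters, the longer the training; a conservative estimate is several months, although a larger GPU budget can shorten this). As a result, what the model knows is never fully up to date. GPT-4o, the latest model at the time of writing, can only answer from data up to June 2023, nearly a year earlier. To have GPT answer questions about that most recent year, you need RAG (Retrieval-Augmented Generation).
In addition, a company's internal private data cannot be published on the internet for reasons of data security and commercial interest, so GPT has no knowledge of it either. If GPT must answer from this private knowledge, RAG is needed here as well.
Through hands-on code examples, this article aims to help Java engineers with no practical LLM experience learn to build LLM applications with the LangChain4j framework.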
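2. Basic Concepts
2.1 What is RAG
The core idea of RAG (Retrieval-Augmented Generation) is to combine traditional information retrieval (IR) with a modern generative large model such as ChatGPT.
Specifically, before generating an answer, a RAG system first retrieves several relevant document fragments from a large document store or knowledge base. It then passes those fragments to the generative model as additional context, so that the model produces more accurate and more informative text.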
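RAG works in the following steps:
1. Receive the request: the system receives the user's request (for example, a question).
2. Retrieval (R): the system retrieves the document fragments most relevant to the query from a large document store. The goal is to find fragments that are likely to contain the answer or related information.
3. Augmentation (A): the retrieved fragments are passed to the LLM (such as ChatGPT) together with the original query, wrapped in a suitable prompt. For example, if the original question is XXX and the retrieved information is YYY, the input to the model should look like: "Answer XXX based on YYY."
4. Generation (G): the LLM generates the final answer from the query and the retrieved fragments, and the answer is returned to the user.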
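The four steps can be sketched in plain Java without any LLM library. This is only an illustration under stated assumptions: every name below (`RagFlowSketch`, `retrieve`, `buildPrompt`, `answer`) is hypothetical, retrieval is a naive word-overlap score rather than a vector search, and the "model" is a function supplied by the caller.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Toy RAG flow. All names are hypothetical; retrieval is naive word overlap
// and the "model" is whatever function the caller passes in.
class RagFlowSketch {

    // Step 2 (R): return the snippet sharing the most distinct words with the question.
    static String retrieve(String question, List<String> snippets) {
        return snippets.stream()
                .max(Comparator.comparingLong((String s) -> overlap(question, s)))
                .orElse("");
    }

    // Number of distinct lower-cased words of a that also occur in b.
    static long overlap(String a, String b) {
        Set<String> wordsOfB = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\W+")));
        return Arrays.stream(a.toLowerCase().split("\\W+"))
                .distinct()
                .filter(wordsOfB::contains)
                .count();
    }

    // Step 3 (A): wrap the retrieved context and the question in a prompt.
    static String buildPrompt(String context, String question) {
        return "Answer the question based on the following information:\n"
                + context + "\nQuestion:\n" + question;
    }

    // Steps 1-4 wired together; model stands in for the LLM call (step 4, G).
    static String answer(String question, List<String> snippets, Function<String, String> model) {
        String context = retrieve(question, snippets);
        return model.apply(buildPrompt(context, question));
    }

    public static void main(String[] args) {
        List<String> knowledge = Arrays.asList(
                "The cafeteria opens at 11:30 on weekdays.",
                "Expense reports are due on the 5th of each month.");
        // Fake model that only echoes its prompt, so the augmented input is visible.
        Function<String, String> fakeModel = prompt -> "[model input]\n" + prompt;
        System.out.println(answer("When are expense reports due?", knowledge, fakeModel));
    }
}
```

Passing the model in as a `Function<String, String>` keeps the flow testable without network access; a real application would put the LLM client call there.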
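The retrieval in step 2 does not have to use a vector database; a relational database (such as MySQL) or a full-text search engine (such as Elasticsearch, ES) can also be used.
Vector databases are nevertheless widely used in LLM applications, because RAG mostly needs the few documents that are most similar in meaning to the query, rather than an exact match on one record (which is what MySQL and ES are good at).
Two documents with similar meaning may share no keywords. For example, sentence 1: "He is happy." and sentence 2: "He feels very joyful." both describe his cheerful mood, yet they have no keyword in common.
Conversely, two documents that share a keyword may be completely unrelated. For example, sentence 1: "He likes apples." and sentence 2: "Apple is a big company." both contain the keyword "apple", but their similarity is low.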
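Similarity between embedded texts is commonly measured with cosine similarity. The sketch below shows the computation; the three-dimensional vectors are made up for illustration (real embeddings come from an embedding model and have hundreds or thousands of dimensions), and the class name is hypothetical.

```java
// Cosine similarity between two vectors: dot(a, b) / (|a| * |b|).
class CosineSimilarityDemo {

    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up "embeddings", chosen so that the two mood sentences point the same way.
        double[] happy = {0.9, 0.1, 0.0};     // "He is happy."
        double[] joyful = {0.8, 0.2, 0.1};    // "He feels very joyful."
        double[] appleCo = {0.0, 0.2, 0.9};   // "Apple is a big company."
        System.out.printf("happy vs joyful:  %.3f%n", cosine(happy, joyful));
        System.out.printf("happy vs appleCo: %.3f%n", cosine(happy, appleCo));
    }
}
```

A value close to 1 means the vectors point in nearly the same direction; a value near 0 means they are unrelated, regardless of whether the original sentences share any words.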
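2.2 Introduction to LangChain4j
LangChain4j brings the ideas of LangChain to Java (it is an independent open-source project inspired by LangChain rather than an official port).
The "Lang" in LangChain comes from Large Language Model.
"Chain" means chained execution: the functions of a language-model application are split into modules and linked together into a complete workflow.
It is a development framework for large language models that hides the details of talking to an LLM, simplifying development and making LLM-based work more efficient.
For more details, see: https://github.com/langchain4j/langchain4j/blob/main/README.md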
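2.3 LLM Development vs. Traditional Java Development
LLM development (the large model implements the business logic):
Before development, developers focus on preparing data (for training) and on selecting and fine-tuning a model (to get better results that match business expectations more closely).
During development (most of the time), the focus is on communicating effectively with the LLM and using its knowledge to solve a specific business problem.
Developers care more about how to describe the problem (prompt engineering) so that the model reasons effectively, and about how to integrate the model into existing business systems.
Traditional Java development (the developer implements the business logic):
Before development, developers focus on choosing the system architecture (high concurrency, high availability) and on design work such as breaking features down into modules.
During development (most of the time), they design specific algorithms, data storage and so on for a specific business problem in order to implement the business logic; the work is mainly coding.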
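3. Hands-on Practice
3.1 Environment Setup
3.1.1 Vector Store (Chroma)
Windows:
First install Python; see: https://docs.python.org/zh-cn/3/using/windows.html#the-full-installer
Note: the environment variables must be configured (Python has to be on the PATH).
Verify by running:
python --version
Then install Chroma; see: https://docs.trychroma.com/getting-started
Verify by running:
chroma run
macOS:
First install Python:
brew install python
or download the installer: https://www.python.org/downloads/macos/
Verify by running (use python3 --version if only python3 is on the PATH):
python --version
Install Chroma (same as above); see: https://docs.trychroma.com/getting-started
Verify by running:
chroma run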
3.1.2 Integrating LangChain4j
<properties>
    <langchain4j.version>0.31.0</langchain4j.version>
</properties>
<dependencies>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-core</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-open-ai</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-embeddings</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-chroma</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <!-- chromadb-java-client is released independently of LangChain4j. If this
         version does not resolve, replace it with one of that project's own releases. -->
    <dependency>
        <groupId>io.github.amikos-tech</groupId>
        <artifactId>chromadb-java-client</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
</dependencies>
3.2 Writing the Program
3.2.1 Project Structure
LangChain
├── core
│   ├── src
│   │   ├── main
│   │   │   ├── java
│   │   │   │   └── cn.jdl.tech_and_data.ka
│   │   │   │       ├── ChatWithMemory
│   │   │   │       ├── Constants
│   │   │   │       ├── Main
│   │   │   │       ├── RagChat
│   │   │   │       └── Utils
│   │   │   ├── resources
│   │   │   │   ├── log4j2.xml
│   │   │   │   └── 笑話.txt
│   │   ├── test
│   │   │   └── java
│   ├── target
├── pom.xml
├── parent [learn.langchain.parent]
├── pom.xml
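3.2.2 Knowledge Collection
Data is usually collected from the company's intranet knowledge base or from the internet, yielding text files, Word documents or PDF files. This article uses 笑話.txt (a text file of jokes) in the resources directory as the result of knowledge collection.
// Imports assumed for LangChain4j 0.31.0
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.loader.FileSystemDocumentLoader;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Locate the knowledge file on the classpath
URL docUrl = Main.class.getClassLoader().getResource("笑話.txt");
if (docUrl == null) {
    log.error("Knowledge file not found");
    return; // nothing to load, so stop here
}
Document document = getDocument(docUrl);
if (document == null) {
    log.error("Failed to load the file");
    return;
}

// Loads the file as a Document; returns null if the URL cannot be converted to a path
private static Document getDocument(URL resource) {
    Document document = null;
    try {
        Path path = Paths.get(resource.toURI());
        document = FileSystemDocumentLoader.loadDocument(path);
    } catch (URISyntaxException e) {
        log.error("Error while loading the file", e);
    }
    return document;
}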
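3.2.3 Document Splitting
Use dev.langchain4j.data.document.splitter.DocumentSplitters#recursive.
It takes three parameters: the segment size (the maximum number of tokens in one segment), the overlap (the number of tokens shared between adjacent segments), and the tokenizer (which splits a piece of text into tokens).
The overlap exists so that cutting by size is less likely to break the meaning of the original text, keeping each segment as complete as possible.
DocumentSplitter splitter = DocumentSplitters.recursive(150, 10, new OpenAiTokenizer());
List<TextSegment> segments = splitter.split(document);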
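About tokens:
A token is the unit of text after tokenization, that is, the words, subwords and so on obtained by tokenizing a text; the exact result depends on the tokenizer.
For example, 我喜歡吃蘋果 ("I like eating apples") can be split into 我/喜歡/吃/蘋果 (4 tokens) or into 我/喜/歡/吃/蘋果 (5 tokens).
ChatGPT tokenizes with the BPE (Byte Pair Encoding) algorithm; see: https://en.wikipedia.org/wiki/Byte_pair_encoding
The tokenization result for the text above is:
18:17:29.371 [main] INFO TokenizerTest - Text to tokenize: 我喜歡吃蘋果
18:17:30.055 [main] INFO cn.jdl.tech_and_data.ka.Utils - Current model: gpt-4o
18:17:31.933 [main] INFO TokenizerTest - Tokenization result: 我 / 喜歡 / 吃 / 蘋果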
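[Figure: GPT-4o's reply on the relationship between tokens and characters]
Why documents are split:
The token length of the text sent to an LLM is limited; if the input is too long, the LLM either does not respond or returns an error.
So we cannot pass all relevant knowledge to the LLM as input. Instead, the knowledge documents are split and stored in the vector store,
and on each call we first find the fragments most relevant to the question and pass them to the LLM as reference context.
[Figure: the error returned by the LLM when the input is too long]
According to that response, up to 1,048,576 characters (1024K, i.e. 1M characters) are allowed as input,
but the official documentation gives a limit of 32K tokens. Since one Chinese character typically maps to 1-2 tokens, 32K tokens hold only about 16K to 32K Chinese characters, so the input string should stay below that range. In practice, inputs should also be kept short for the sake of performance.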
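The character budget follows from simple division: if one character can cost up to k tokens, a limit of T tokens is only guaranteed to hold T / k characters. A minimal sketch of that arithmetic, with hypothetical names and the 1-2 tokens per character figure taken as an assumption from the text:

```java
// Worst-case character budget for a given token limit.
class TokenBudgetSketch {

    // Largest character count guaranteed to fit in tokenLimit tokens
    // when a single character may cost up to maxTokensPerChar tokens.
    static int safeCharLimit(int tokenLimit, int maxTokensPerChar) {
        if (tokenLimit < 0 || maxTokensPerChar <= 0) {
            throw new IllegalArgumentException("tokenLimit must be >= 0 and maxTokensPerChar > 0");
        }
        return tokenLimit / maxTokensPerChar;
    }

    public static void main(String[] args) {
        int tokenLimit = 32 * 1024; // the 32K-token limit quoted in the text
        System.out.println("2 tokens per character: " + safeCharLimit(tokenLimit, 2));
        System.out.println("1 token per character:  " + safeCharLimit(tokenLimit, 1));
    }
}
```

For an exact count, tokenize the actual input with the model's tokenizer instead of estimating from the character count.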
The table below lists the token input limits of common LLMs (limits differ between model versions, so check each vendor's current documentation):
| Model | Max input tokens |
| --- | --- |
| GPT-3 (davinci) | 4096 tokens |
| GPT-3.5 (text-davinci-003) | 4096 tokens |
| GPT-4 (8k context) | 8192 tokens |
| GPT-4 (32k context) | 32768 tokens |
| LLaMA (7B) | 2048 tokens |
| LLaMA (13B) | 2048 tokens |
| LLaMA (30B) | 2048 tokens |
| LLaMA (65B) | 2048 tokens |
| iFlytek Spark (SparkDesk) | 8192 tokens |
| ERNIE Bot (Ernie 3.0) | 4096 tokens |
| BAAI WuDao (WuDao 2.0) | 2048 tokens |
| Alibaba M6 | 2048 tokens |
| Huawei Pangu (Pangu-Alpha) | 2048 tokens |
| Yanxi LLM (ChatJd) | 2048 tokens |
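LangChain4j provides six document-splitting strategies:
1. Character-based: split character by character (whitespace included)
2. Line-based: split on the newline character (\n)
3. Paragraph-based: split on two consecutive newlines (\n\n)
4. Regex-based: split on a custom regular expression
5. Sentence-based (uses Apache OpenNLP, which only supports English here, so it can be ignored)
6. Word-based: split the text on whitespace
The splitting flow is as follows; segments is the final output, of type List<TextSegment>.
First, the paragraph-based strategy cuts the whole document into several parts.
Then each part is split further with one of the other strategies (for example, by sentence) while respecting the segment size (the maximum number of tokens in one segment). Overlapping text is computed at the same time and added according to the overlap (the number of tokens shared between adjacent segments).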
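The flow can be approximated in a few lines of plain Java. This is a simplified sketch and not LangChain4j's implementation: it counts characters instead of tokens, cuts over-long paragraphs into fixed-size slices instead of falling back to sentence splitting, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Paragraph-first splitting with a size limit and overlap, measured in characters.
class OverlapSplitterSketch {

    // Returns segments of at most maxSize characters. Every segment after the first
    // starts with the last `overlap` characters of the previous segment.
    static List<String> split(String text, int maxSize, int overlap) {
        if (overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("need 0 <= overlap < maxSize");
        }
        int budget = maxSize - overlap; // room for new text in each segment
        List<String> segments = new ArrayList<>();
        StringBuilder fresh = new StringBuilder();
        for (String paragraph : text.split("\\n\\s*\\n")) {
            // A paragraph longer than the budget is cut into budget-sized slices.
            for (int i = 0; i < paragraph.length(); i += budget) {
                String piece = paragraph.substring(i, Math.min(i + budget, paragraph.length()));
                // Close the current segment if the next piece would not fit.
                if (fresh.length() > 0 && fresh.length() + 1 + piece.length() > budget) {
                    flush(segments, fresh, overlap);
                }
                if (fresh.length() > 0) {
                    fresh.append('\n');
                }
                fresh.append(piece);
            }
        }
        if (fresh.length() > 0) {
            flush(segments, fresh, overlap);
        }
        return segments;
    }

    // Emits one segment: the tail of the previous segment followed by the new text.
    private static void flush(List<String> segments, StringBuilder fresh, int overlap) {
        String tail = "";
        if (!segments.isEmpty()) {
            String previous = segments.get(segments.size() - 1);
            tail = previous.substring(Math.max(0, previous.length() - overlap));
        }
        segments.add(tail + fresh);
        fresh.setLength(0);
    }

    public static void main(String[] args) {
        String doc = "First paragraph about ice cream.\n\n"
                + "Second paragraph, a bit longer than the first one.\n\n"
                + "Third.";
        for (String segment : split(doc, 40, 8)) {
            System.out.println("[" + segment + "]");
        }
    }
}
```

The repeated tail is what lets a sentence that straddles a cut still appear whole in at least one segment, at the cost of storing some text twice.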
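3.2.4 Text Vectorization
The split knowledge fragments must be stored in the vector store so they can be retrieved later, but a vector store holds vectors, not text.
The text therefore has to be vectorized, that is, each string is converted into an N-dimensional array. In natural language processing (NLP) this process is called text embedding.
Different LLM providers implement text embedding differently. OpenAI's implementation is based on the Transformer architecture and runs on the server side, so every embedding call goes through OpenAI's HTTP API.
The example below shows that the OpenAI model used is text-embedding-ada-002 and that the vector has 1536 dimensions.
OpenAiEmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey(API_KEY)
        .baseUrl(BASE_URL)
        .build();
log.info("Current model: {}", embeddingModel.modelName());
String text = "兩隻眼睛"; // "two eyes"
Embedding embedding = embeddingModel.embed(text).content();
log.info("Embedding of text {} is:\n{}", text, embedding.vectorAsList());
log.info("It is a {}-dimensional vector", embedding.dimension());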
16:33:34.096 [main] INFO cn.jdl.tech_and_data.ka.OpenAIEmbeddingTest - Current model: text-embedding-ada-002
16:33:36.201 [main] INFO cn.jdl.tech_and_data.ka.OpenAIEmbeddingTest - Embedding of text 兩隻眼睛 is:
[-0.010065761, -0.0021892355, 0.007137785, -0.004445936, -0.018269492, 0.014046189, 0.010794383, -0.021953074, -0.008244209, -0.027377252, 0.0036970759, 0.028308267, 5.81463E-4, 0.0021504434, -0.0068240734, 0.012217891, 0.020495834, 0.0020880383, 0.021278426, -0.024273867, -0.005393818, 0.0036970759, 0.011563482, -0.0073604193, -0.019861663, 0.0036768364, 0.025069952, 0.0060010017, -0.011023763, -0.021089524, 0.025123924, -0.013189386, -0.025677137, -0.011671426, 0.011239651, 0.013816809, -0.008217223, 0.0146938525, 0.016407462, 0.005839086, 0.008790675, 0.015476446, 8.72827E-4, 0.0318974, -0.011212665, 0.0010473924, -0.027566154, 0.005366832, -0.020306932, 0.022479301, -0.0064192843, -0.0074346308, -0.028740043, -0.012055975, -0.015071657, -9.799275E-4, -0.004068133, -0.011900807, -0.013783077, -0.005191423, -4.3978673E-4, 0.0073199403, -0.023261894, 0.0011991884, -0.030683031, 0.0053330995, 0.0060819597, -0.0063821785, -5.401408E-4, -0.011219411, 0.029927425, 0.020050565, 0.005994255, -0.016717799, 0.04725241, -0.013735851, -0.027984437, -0.037645407, -6.0507574E-4, -0.002486081, 0.017365463, -0.031222751, -0.0086287595, 0.01062572, -0.016960673, 6.588368E-5, -0.007083813, 0.029846467, -0.02215547, -0.025312826, 0.009283168, 0.013533456, -0.007562814, 0.0075898, -0.023046006, 0.031708498, 0.015355009, 0.04325849, -0.011597214, -0.017648814, 0.004995775, -0.015004192, -0.031384666, 0.0016655395, -0.014437486, -0.002553546, 0.018755239, -0.002821719, 0.016461432, -0.015328023, -0.016461432, 0.004800127, 0.022816624, -0.026878012, -0.023032513, 8.2686654E-4, 0.025474742, -0.011806356, -0.04123454, -0.011482524, 0.0049249367, 0.010861848, 0.0069084046, -0.015058164, 0.022182455, 0.020603778, -0.019133043, -2.224233E-4, 0.0052892473, 0.0039669354, 0.04010113, -0.0034322762, 0.02782252, 0.016420955, -0.0103626065, 0.02841621, -0.007900138, 0.01867428, -0.019875156, -0.0020593656, 0.0045336406, 0.033246696, -0.019510847, -0.0082509555, 0.0024405424, 0.022600738, 
0.004155837, 0.027134378, 0.0050058947, -0.0032400014, 0.014909741, -0.010227677, 0.022317385, 0.0035149206, -0.0036700899, 0.0026716096, 0.004617972, 0.02202054, -0.030197285, 0.0147613175, -0.01835045, 0.0144105, 0.031384666, -0.015894728, 0.02841621, 0.023315866, -0.01200875, 0.008878379, -7.5349846E-4, 0.0050295074, -0.0030477263, 0.013385034, 8.340347E-4, 0.005464656, 0.023639698, 0.021426849, 0.0020559924, -0.0013518278, -0.0071850107, -0.012319089, -0.04951923, -0.016326504, -0.0025973981, -0.0010473924, 0.0013956799, -0.015867742, -0.008149759, -0.0035081743, 0.009924085, 0.0011915986, 0.014032697, 0.013128667, 0.011064242, -0.0013341182, -0.6079396, -0.014235091, -0.015530418, -0.01925448, 0.0027643738, 0.02628432, 0.012163919, 0.018242506, -0.020239467, -0.03470394, -0.0027930464, -0.008595027, -0.0023983768, -8.711404E-4, -0.013762837, -0.01414064, -0.014073175, -0.009512549, 0.0015761484, -0.022911076, -0.025110431, 0.011475777, 0.011327355, 0.010295142, 0.035999265, -0.0025147537, 0.019632282, -0.0058289664, -0.019794198, 0.044149023, -0.030575087, 0.01907907, -0.0147613175, -0.0038454987, 0.0539989, -0.009330394, -0.011037256, 0.02071172, 0.012845315, 0.041207556, -0.015705826, -0.004182823, 0.0073536728, -0.003724062, -0.00466857, -0.0070163487, 0.012238131, 0.0028099127, -0.008210477, -0.0083588995, -0.011934538, -0.006679024, 0.011266637, -0.013978725, 0.016528897, -0.0016950553, 0.020306932, -0.022128483, 0.01681225, -0.0022314012, 0.02065775, 0.005980762, -0.017324984, -0.013486232, 0.027930465, 0.010092747, -0.031735484, 0.013884274, 0.011745637, 4.0204858E-4, 0.013816809, 0.008284688, 0.00772473, 0.024732629, 0.012163919, 0.019200508, 0.018404422, -0.004941803, -0.004041147, -0.014302556, -0.010254663, 0.002772807, -0.028038409, 0.0015213332, 0.002821719, -0.029603593, -0.018930648, 0.0029701418, 9.504116E-4, 0.0036397309, -0.0016545764, 0.009310154, 0.0016579496, -0.011786116, -0.010112987, 0.019052085, -0.037402533, 0.0029667686, 0.026635138, 
-9.687537E-5, -0.0017228846, 0.0051745567, 0.015503432, -0.008541055, 0.01101027, 0.022101497, -0.019133043, -0.020738706, 0.042071104, -0.026081925, 0.003548653, -0.01694718, 0.008851393, 7.215581E-5, 0.01577329, -0.022830117, -0.004985655, 0.012926273, 0.021926088, 0.020050565, 0.042152062, -0.017136082, 0.02814635, -0.0010060702, 0.034218192, 0.024260374, -0.056346674, -0.020374397, -0.003442396, -0.017149575, -0.01427557, 0.0071175457, 0.004705676, -0.021453835, -0.010086001, -0.00860852, -0.02269519, -0.023099978, 0.013762837, -0.006625052, -0.017230533, -0.0010414892, 0.023342852, -0.020819664, 0.014343035, -0.0022448942, -0.017230533, -0.013492978, 0.006193277, -0.010315382, 0.0029718284, -0.010207438, -0.01617808, 0.0010229364, 0.00531286, 0.0104300715, -0.0050969725, -0.015881235, -0.014896248, -0.008871633, 0.009114507, 0.0147478245, -0.017999632, -0.01984817, -0.025258854, -0.024489755, 0.0029836346, 0.023680177, -0.03456901, -0.038994707, -0.00949231, -0.008541055, -8.083137E-4, 0.019996593, -0.010038775, 0.007913631, -0.0013054456, -0.005781741, 0.0060954527, -0.043717247, 0.0048372326, 0.0014951905, -0.017136082, 0.009735184, -0.0017827597, 0.0034255297, -0.0070365877, -0.014302556, -0.041072626, -0.008156505, -0.021197468, 0.012825076, 9.2848553E-4, 0.018998113, 0.022641217, 0.0025113805, 0.027174857, 0.02165623, 0.028173337, 0.03141165, 0.038859777, -0.025636658, 0.0026581166, 0.0017068617, 0.0022381477, -0.030278241, 0.0028368987, -0.0126901455, -7.189227E-4, 0.014329542, -0.013870781, -0.013047709, -0.0039163367, 0.0045167743, -0.0073604193, 0.031924386, 0.0060414807, 0.037024733, 0.0011393133, -0.009876859, -0.017770251, -0.013769584, 0.008102533, -0.0103491135, -0.0027610005, -0.0238286, 0.018148055, 0.017338477, 0.015193093, -0.0043346193, -0.012055975, -0.010146719, 6.6663744E-4, 0.01994262, -0.03335464, -0.021804651, 0.019915635, -0.0077989413, 0.026513701, 0.02351826, -0.0011671425, 0.023207922, 0.005036254, -0.00466857, -0.005559107, 
0.009957817, 0.041126598, 0.0041254777, -0.0063720588, 0.020104537, 0.0047798874, -0.0060516004, -0.022924569, -0.005602959, 0.0016275904, -0.030413171, 0.031843428, 0.017797237, 0.0014074863, 0.046415843, 0.019025099, 0.017756758, -0.0060381074, 0.017014645, -3.012729E-4, 0.023855584, 0.009748677, -0.021264933, -0.025177896, 0.009580014, -0.0071310387, 0.018863183, 0.011867074, 0.0074278843, 0.013992218, -0.00240175, 0.0084128715, 8.652372E-4, 0.034218192, -0.011603961, -0.03629611, -0.027768549, -0.014977206, 0.0050193877, -0.0035958786, -0.03850896, -0.009809394, -0.026810547, -0.009053788, 0.04123454, 0.0071175457, -0.0057648746, -0.026176376, 6.2278524E-4, -0.003629611, -0.0074683633, 0.034757912, -0.012588948, -0.022641217, -0.01076065, -0.0056097056, -0.0082644485, 0.013439006, -0.015651854, -0.001314722, -9.116193E-4, -0.02745821, -0.014963713, 0.005893058, -0.02469215, 0.002415243, -0.0055388673, -0.008966084, -0.011651186, -0.0056468113, -0.011597214, -0.024435783, 0.004442563, 0.0240175, 0.0025096938, -0.012163919, -0.022357864, -0.027566154, 0.0022718802, 0.02782252, 0.015597883, -0.005586093, 0.011131707, 0.018593322, -0.029576607, -0.03891375, -0.009256183, 0.013965232, 0.0020391264, -0.0035149206, -1.875313E-4, -0.0044526826, 0.011813102, 0.02909086, 0.0012784597, -0.014397007, -0.015543911, 0.004017534, 0.023221415, -0.026904998, 0.007110799, 0.041585356, -0.00466857, -0.015935207, -0.026432743, 0.019753719, 0.048196916, 0.031114807, 0.0018063724, 0.0045842393, 4.93337E-4, -0.0127845965, -0.021966567, -0.020145016, -0.022600738, -0.005808727, 0.010807876, 0.0051374515, 0.019767212, 0.010922565, 0.035729405, 0.021831637, -0.004391964, -0.0038151394, 0.001276773, 0.01531453, 0.0016537331, -0.014882755, -4.667727E-4, 0.034110248, 0.0048068734, -0.01352671, -0.012764357, 0.0071040527, 0.027080406, -0.01994262, -0.03062906, 0.002472588, -0.013465992, 0.01604315, -0.026243841, 0.008203731, -0.0145859085, 0.009775663, -0.017244026, -0.0023545246, 
-0.029603593, -0.041585356, -0.005285874, 0.016609855, 0.004483042, -0.01798614, -0.037186645, -0.010180452, 0.013270344, 0.023612712, -0.006662158, -0.01781073, 0.031600554, 0.015247065, -0.0049485494, -0.0072929543, -0.025380291, -0.0013535144, 0.02165623, 2.8419585E-4, -0.027768549, -0.011334102, 0.008723211, 0.039264563, 0.017554363, 0.017662307, -0.0074953493, 0.016609855, 0.0074211378, -0.008966084, 0.007927124, -0.016420955, -0.011563482, 0.008183491, -0.036188167, -0.016555883, -0.023504768, 0.0013965232, 0.0027255814, 0.022074511, 0.030710017, -0.01835045, -0.0047933804, 0.0043953373, -0.02415243, 0.011300369, 0.008136266, 0.015381995, 0.0062641148, -0.009526042, 0.008129519, -0.016366983, -0.028901959, 0.0074683633, -0.023140457, 0.016407462, 0.002857138, -0.003889351, -0.01241354, 0.010099494, -0.038185127, 0.009013309, 0.014370021, -0.008014829, 0.008008082, -0.005548987, -0.011023763, -0.036862817, -0.015233572, -0.026500208, 0.017095603, 0.0070635737, 0.004098492, -0.0125956945, -0.0063282065, 0.015935207, -0.0031101315, 0.05132729, -0.028092379, -0.0060279877, 7.7205134E-4, -0.008642253, 0.020630764, 0.02084665, 0.007697744, -0.026810547, -0.007852913, -0.016434446, -0.06044854, 0.006554214, -0.034595996, -0.0016790325, 0.015543911, 0.0320863, 0.008365646, 0.0125956945, 0.0031860294, -0.012366314, 0.014909741, -0.008210477, 0.002295493, -0.019308452, 0.020819664, -0.0088379, 0.007245729, 0.010268156, 0.0103760995, -0.008891872, 0.0019767212, -0.013762837, 0.015597883, 0.0026716096, -0.048331846, -0.0048777116, -0.027768549, 0.02442229, 0.0038522452, 0.010409832, 5.578503E-4, 0.025812067, -0.008885126, 0.010463804, -0.0029752017, 0.012737371, -0.03643104, 0.03195137, -0.041153584, 0.028767029, -0.006102199, -0.009593507, 0.01604315, 0.007576307, 0.0125822015, 0.02628432, 0.0148287825, 0.01391126, -0.011246397, -0.0014631448, 0.017594842, -0.0495732, 0.0019025098, -0.008183491, -0.014059682, -0.001480011, -0.017257519, -0.030952891, -0.04029003, 
-0.0021234574, 0.0046989294, -0.0148692615, 0.018890169, -0.02125144, -0.032545064, -0.0015828949, -0.011172186, 0.03443408, -0.012265117, 0.01112496, 0.021858623, 0.004590986, -0.021359384, -0.048115958, 9.1752247E-4, 0.0076640113, 0.0074143913, 0.04622694, -0.007515589, -0.0124607645, 0.0066284253, 0.0024118698, -0.0072659687, -0.026041446, 0.015220079, 0.009074028, -0.0049215634, 0.0028436452, -0.02695897, -0.0033378254, -0.0012852062, -0.018768732, -0.021669723, -0.024449276, 0.0020104537, -0.0077854483, -0.022587245, -0.029576607, 0.00442907, 0.01667732, -0.023369838, 0.017648814, -0.013304076, -0.0036903294, 0.012285356, 0.01291278, 0.031573568, -0.021291919, -0.0066419183, -0.0010785949, -0.0075223353, -0.011340848, -0.01848538, -0.020576792, 0.04884458, -0.0030544729, -0.010679692, -0.007448124, 0.021534793, 0.025636658, -0.004290767, 0.0062033967, -0.0027255814, 0.014343035, 0.003862365, 0.019915635, 0.0060144947, -0.031870414, -0.015247065, -0.026257334, -0.010328875, -0.009998296, -0.011131707, 0.006996109, -0.014167626, -0.0106459595, 0.013115174, -0.015301037, 0.013992218, 0.0015120568, 0.006871299, 0.0053567123, 0.009350633, -0.036512, 0.017324984, -0.02269519, 0.0103895925, -0.030683031, 0.001566872, 0.017163068, -0.025663644, 0.008635506, -0.03127672, -0.0025737856, -0.033381626, 0.0025467996, -0.0048372326, 0.013924753, -0.001669756, 0.04109961, -0.012420286, -0.014437486, -0.016744785, -0.020927608, 0.01771628, -6.923584E-4, 0.016731292, -0.009795902, -0.02084665, -0.0046854364, -6.19412E-4, -0.006793714, 0.0048372326, -0.025501728, -0.03575639, -0.012696892, 0.027120885, -0.013263597, -0.02768759, -0.018930648, -0.013263597, -0.011037256, 0.0351627, 0.01785121, 0.031735484, 0.023099978, 0.004692183, 0.011313862, 3.6747282E-4, -0.010160212, -0.017149575, 0.0047529014, -0.013236611, -0.021548286, -0.009053788, -0.0072389827, 0.020455355, -0.011192425, -0.020806171, -0.0026867893, -0.0018873303, -0.007866406, 0.005967269, 0.013722358, 0.016097123, 
-0.0031016984, 0.017122589, -0.012352821, 0.012238131, -0.007124292, -0.012312342, -0.036862817, -0.010895579, 0.0074953493, -0.010794383, 0.005629945, 0.008048561, -0.021278426, -0.027606633, 0.0031961491, 0.0025248735, 0.0035553996, 0.014302556, 0.0055962126, -0.020091044, 0.011199172, -0.015017685, 0.011158693, 0.0056603043, -0.013074695, -0.01264292, -0.006193277, -0.0037949, 0.020428369, -0.012022243, 0.004722542, 0.009040295, -0.005090226, 0.03216726, -0.040209074, -0.0061190655, 0.004182823, -0.008196984, 0.0103895925, 0.002491141, 0.002696909, 0.03089892, 0.0025602926, -0.010794383, 0.009687958, 0.015530418, -0.011900807, -0.0083049275, 0.01938941, -0.009060535, 0.027660605, 0.007792195, 3.8291597E-5, -0.015071657, 0.011813102, 0.005599586, 0.012433779, -0.031465624, 0.024597699, 0.025245361, -0.033381626, 0.003946696, 2.0935199E-4, -0.027309787, -0.01735197, -0.01343226, 0.014046189, -0.016137602, 0.007906885, -0.0042536613, 0.017176561, -0.030305227, -0.033300668, 0.0035823856, -0.0065643336, -0.008642253, 0.28777823, -0.014302556, -0.010011789, 0.017891688, -0.010288396, 0.017271012, 0.02397702, 0.009377619, 0.016272532, 0.02388257, -0.018957634, -0.012440526, -0.008716464, -0.0017692667, 0.007562814, -0.026027953, -0.018633801, -0.00492831, -0.025407277, -0.035000782, 0.003022427, -0.017621828, -0.014882755, -0.014977206, 0.03748349, -0.016987659, -0.013857288, -0.012150426, -3.0602969E-6, 0.008136266, -0.009013309, -0.008844647, -0.008675985, -0.0151796, -0.006544094, -0.0017911928, 0.017190054, 0.0085343085, 0.03850896, 0.0103356205, -0.020279946, 0.0072389827, 0.004327873, 0.030251255, -0.014181119, 0.00947207, -0.013600921, -0.020819664, -0.012892541, 0.0146938525, -0.021858623, 0.007778702, 0.028821, 0.02501598, -0.017190054, 0.0073604193, 0.040397976, -0.0048473524, -0.007913631, 0.0075493213, 0.013506471, 0.033273682, -0.026702603, 0.016893208, -0.010888833, 0.016151095, -0.031708498, -0.0106594525, -0.013337809, -0.034595996, -0.0022162215, 
-0.013331062, -0.016259039, 0.0041322242, -0.012109947, -0.015206586, 0.019564819, 7.5349846E-4, 1.8247144E-4, 0.046065025, -0.012224638, -0.00126581, 4.326186E-4, -0.0070635737, -0.016164588, -0.028928945, 0.0077584623, -0.006517108, -0.0033951704, 0.013553696, 0.012791343, -0.038401015, 0.019038592, -0.018552843, 0.022263413, 0.011482524, -0.030332213, 0.033786416, -0.012292103, 0.02528584, -0.021062538, -0.034353122, 0.042313978, 0.022749161, -0.0057109026, -0.0146398805, 0.0014445919, 0.009040295, 0.020225974, -0.027566154, -0.015152614, -0.029630579, 0.0037341816, 0.011941285, -0.0106324665, -0.0053297263, 0.007684251, -0.021899102, -0.014370021, -0.0046584504, 0.044337925, -0.011496017, 0.0019902142, 0.00874345, -0.015530418, -0.017824223, -0.017635321, 0.006075213, 0.02411195, -0.043933135, 0.015408981, -0.027201843, 0.030143313, 0.025906518, 2.4160864E-4, -0.005714276, 0.028470183, -0.032895878, -0.022870596, 0.002767747, 0.012035736, -0.03003537, 0.01508515, 0.017419435, 0.017594842, -0.014491458, 0.019429889, 0.0054882686, 0.041288514, -0.0017001152, -0.036350083, 0.0042435415, 0.017905181, 0.0032045823, -0.0040344005, -0.015489939, -0.0401551, -0.026675617, 8.175901E-4, 0.018971127, -0.046388857, -6.396515E-4, 0.040802766, -0.021777665, -0.010011789, 0.007657265, -0.17238629, 0.009890352, 0.0318974, 9.394486E-4, 0.014882755, 0.011968271, -0.0056872903, 0.010092747, -0.0024827078, -0.004590986, 0.03416422, 0.0013299016, -0.019753719, -0.023491275, -0.0057210224, -0.0061224387, -0.019915635, 0.0018940767, 0.023329359, -0.00607184, 0.04123454, -0.028794015, 0.010032029, 0.0029009902, 0.0147208385, 0.013742598, 0.013358048, 0.031006863, -0.0011848521, -0.0042435415, -0.01087534, 0.0148017965, 0.038616903, 0.021804651, 0.024030993, -0.009114507, -0.0032754203, -5.1905797E-4, -0.019767212, -2.1346312E-5, 0.007825927, 0.030979877, 0.015435967, 0.014046189, -0.025798574, 0.0416933, 0.022897583, 0.005191423, -0.0028588246, 0.0024169297, 0.007711237, -0.019834677, 
0.017702786, 2.9178563E-4, 0.031249737, -0.0025737856, 9.411352E-4, 0.02224992, -0.0104975365, 0.015867742, -0.01112496, -0.0085747875, 0.0016183141, -0.0021588765, 0.011543242, -0.028011423, -0.01454543, -0.006034734, -0.019497354, 0.010787636, -0.01621856, -0.03521667, 0.029279761, -0.02601446, 0.018822704, 0.011165439, -0.018768732, 0.015867742, 0.011327355, 4.0563266E-4, -0.018256, 0.028605113, -0.0063585658, 0.015004192, 0.011657933, 0.011199172, -0.0027761802, -0.0041322242, 0.006267488, -0.033273682, 0.040397976, -0.036592957, 0.011590468, -0.0037780337, -0.011273383, 0.018890169, 0.008379139, 0.0103356205, 0.02542077, -0.015611376, -0.015516925, -0.0031641033, -0.0022971795, 0.02990044, 0.007906885, 0.013715612, 0.026446236, 0.02424688, 0.014370021, -0.004442563, 0.001862031, -0.010409832, 0.016717799, 0.020495834, 0.023774628, 0.010139973, -0.022236427, 0.0086287595, 0.004995775, -0.0040074144, -0.016029658, -0.013223118, -0.013654893, 0.003250121, -0.0050025214, -0.033516556, -0.06973171, -0.033786416, 0.022776147, 0.02569063, -0.006129185, 0.014127147, 0.012244877, 0.023842093, -0.011536496, 0.033948332, -0.0082509555, -0.016286025, 2.1030071E-4, 0.0026463103, -0.00506324, -0.012366314, 0.027336773, -0.022803131, -0.0085882805, 0.018309971, 0.0050564934, 0.006679024, 0.0054106843, -0.0058998046, 0.010463804, -0.0055928393, -0.018552843, 0.043285474, 0.004709049, 0.015827263, 0.0148017965, -0.018229013, -1.6486732E-4, 3.695811E-4, 0.0015921714, 0.0103626065, -0.024368318, -0.019996593, 0.023086485, -0.032653008, 0.0035857589, 0.0018772106, -0.013331062, -0.023099978, -0.030278241, 2.6922708E-4, 0.0020188869, 0.025029473, -0.016488418, -0.0059976284, -0.018863183, 0.005643438, 0.0066958903, 0.008055308, 0.02828128, -0.018971127, 0.014248584, 0.004577493, -0.011846835, 0.005785114, 0.019605298, -3.383786E-4, -0.01635349, 0.025299333, 9.4197853E-4, -0.015935207, -0.00885814, -0.0074008983, -0.0013079755, -0.019834677, -0.010531269, 0.044202995, 0.008075547, 
0.009951071, -0.022830117, -0.008777182, 0.0026952224, -0.023963528, 0.003017367, -0.043231502, -0.017176561, -0.024179416, 0.010720171, -0.0071040527, 0.006227009, 0.0025653525, 0.008682732, -0.018620308, 0.021831637, -0.02895593, -0.027485196, 0.0053061135, 0.031519596, -0.012872301, -0.012420286, 0.0032433746, 0.004445936, -0.024665164, 0.014923234, -0.011017016, 0.011698412, -0.03276095, -0.0482239, 0.027903479, 0.0084533505, -0.008129519, 0.027336773, 0.014477965, 0.019402903, -0.022303892, 0.006253995, -0.010598734, -0.028821, 0.0028672577, 0.0037105689, -0.0013467679, -0.037726365, -0.044958603, 0.013729105, 0.0041187312, 0.016501911, 0.02828128, 0.009735184, 0.0105380155, 0.0145859085, -0.0031016984, -0.004897951, -0.017567856, -0.013250104, -8.8547665E-4, -0.008662492, 0.011691665, 0.024773108, -0.023437303, -0.0051104655, 0.010828115, -0.01735197, -0.031330694, 0.027876493, 0.003973682, 0.0054241773, -0.0023376583, -0.030817961, -0.026176376, 0.015827263, 0.005272381, -0.029630579, 0.0146398805, -0.035378587, -0.0021858625, 0.060286626, -0.0040883725, 0.0040782527, 0.017999632, -0.0036734631, -0.008878379, -0.007751716, -0.027741563, 0.018741746, -0.009330394, 0.021103017, -0.032005344, 0.017190054, 0.013256851, 0.009316901, -0.0034120367, 0.030251255, -0.0021285173, 0.017203547, 0.0059132976, -5.498388E-4, -0.022506287, 0.009863366, 0.020752199, 0.018498873, 0.022857103, 0.027579647, -0.008999816, 0.026270827, -0.006942137, -6.232069E-4, 0.022870596, 0.0044661756, -0.008797421, -0.013081442, -0.006544094, 0.020779185, 0.018552843, -0.009478817, 0.029873453, -0.008891872, 0.021224454, -0.0025737856, 0.028497169, -0.03643104, -0.002870631, 0.005903178, 0.054187797, -0.0050463737, -0.011219411, 0.028362239, 0.0199831, 0.0024540352, -0.0023325984, -0.0035790124, -0.0084398575, -0.01708211, 0.013573935, -0.01012648, -5.5321207E-4, -0.025987474, 0.010032029, 0.008622013, -0.009384366, 0.0073874053, 0.031006863, 0.00491819, -0.008190238, -0.018539352, 
-0.030278241, -0.0126091875, -0.0060414807, 0.026581166, 0.011718651, 0.01076065, -0.018282985, 0.023531754, 0.0038050197, 0.011239651, -0.012332582, 0.015570897, 0.023734149, -0.0017287878, -0.013209625, -0.018417915, -0.022438822, -0.005586093, -0.0032956598, -0.007987843, 0.03570242, -0.03783431, 0.019052085, -0.017109096, -0.004843979, 0.01898462, -0.007859659, 2.0534625E-4, -0.009694705, -0.012872301, -0.026108911, -0.009121253, -0.007846166, -0.033138752, 0.0021200841, 0.0017827597, -0.03975031, 0.012346075, -2.6754045E-4, 0.01907907, -0.009451831, -0.01150951, 0.030116327, -0.0021892355, 0.018998113, 0.02768759, -5.869445E-4, -0.0037173154, 0.00935738, 0.010726918, -0.0062573683, -0.008662492, 0.026027953, 0.014221598, -0.006611559, -0.028038409, 0.021602258, 0.010382846, -0.0030072473, 0.0050328807, 0.025542207, -8.2981813E-4, -0.0056333183, 0.02976551, -0.012750864, -0.02038789, 0.034865856, -0.013047709, 2.5362583E-4, -0.00390959, -0.019092564]
16:33:36.203 [main] INFO cn.jdl.tech_and_data.ka.OpenAIEmbeddingTest - It is a 1536-dimensional vector
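To make "a string becomes an N-dimensional array" concrete without calling any API, here is a deliberately naive toy: it hashes character bigrams into N buckets and normalizes the result. It is an assumption-laden illustration only. It has nothing to do with how text-embedding-ada-002 works and captures no meaning, and the names are hypothetical.

```java
// Toy "embedding": hashed character-bigram counts, scaled to unit length.
class ToyEmbeddingSketch {

    static double[] embed(String text, int dims) {
        double[] vector = new double[dims];
        // Count each pair of adjacent characters in the bucket chosen by its hash.
        for (int i = 0; i + 1 < text.length(); i++) {
            String bigram = text.substring(i, i + 2);
            int bucket = Math.floorMod(bigram.hashCode(), dims);
            vector[bucket] += 1.0;
        }
        // Normalize so that every non-empty vector has length 1.
        double norm = 0;
        for (double x : vector) {
            norm += x * x;
        }
        norm = Math.sqrt(norm);
        if (norm > 0) {
            for (int i = 0; i < dims; i++) {
                vector[i] /= norm;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        double[] vector = embed("兩隻眼睛", 8);
        System.out.println("dimensions: " + vector.length);
        System.out.println(java.util.Arrays.toString(vector));
    }
}
```

Like a real embedding, the output has a fixed length no matter how long the input is; unlike a real embedding, two texts with similar meaning but different characters get unrelated vectors.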
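3.2.5 Storing in the Vector Store
A vector database, also called a vector store or vector search engine, is a database designed specifically to store and manage vectors (fixed-length lists of numbers) together with other data items.
These vectors are mathematical representations of data points in a high-dimensional space, where each dimension corresponds to one feature of the data. The main purpose of a vector database is efficient similarity search using approximate nearest neighbor (ANN) algorithms.
Before using the vector store, start Chroma (chroma run), then connect to it through the SDK (the chromadb-java-client Client in the code below) and create the container that holds the data, called a collection (comparable to a table in MySQL).
// Client and OpenAIEmbeddingFunction come from chromadb-java-client
Client client = new Client(CHROMA_URL);
EmbeddingFunction embeddingFunction = new OpenAIEmbeddingFunction(API_KEY, OPEN_AI_MODULE_NAME);
// Arguments: collection name, metadata (none), create-or-get flag, embedding function
client.createCollection(CHROMA_DB_DEFAULT_COLLECTION_NAME, null, true, embeddingFunction);
After embedding, connect to the vector store through the LangChain4j SDK, bind each vector (Embedding) to its text segment (TextSegment), and store them together.
EmbeddingStore<TextSegment> embeddingStore = ChromaEmbeddingStore.builder()
        .baseUrl(CHROMA_URL)
        .collectionName(CHROMA_DB_DEFAULT_COLLECTION_NAME)
        .build();
// Embed each segment and store the vector together with its source text
segments.forEach(segment -> {
    Embedding embedding = embeddingModel.embed(segment).content();
    embeddingStore.add(embedding, segment);
});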
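3.2.6 Querying the Vector Store
To find similar knowledge fragments in the vector store, the query text must be vectorized as well, in the same way as above.
Embedding queryEmbedding = embeddingModel.embed(qryText).content();
At query time, the vector store uses an ANN algorithm to find the text fragments closest to the query vector (queryEmbedding); how many are returned depends on the query parameters.
EmbeddingSearchRequest embeddingSearchRequest = EmbeddingSearchRequest.builder()
        .queryEmbedding(queryEmbedding)
        .maxResults(1)
        .build();
EmbeddingSearchResult<TextSegment> embeddingSearchResult = embeddingStore.search(embeddingSearchRequest);
List<EmbeddingMatch<TextSegment>> embeddingMatchList = embeddingSearchResult.matches();
// The list is empty when nothing matched; check it before calling get(0) in real code
EmbeddingMatch<TextSegment> embeddingMatch = embeddingMatchList.get(0);
TextSegment textSegment = embeddingMatch.embedded();
The search request has four parameters: the embedding of the query text (queryEmbedding), the maximum number of results (how many of the nearest vectors to return at most), the minimum score (used to filter out weak candidates), and the metadata filter (used to filter candidates by their metadata).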
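The effect of the maximum-results and minimum-score parameters can be seen in a small in-memory search. This sketch is an assumption, not how Chroma or LangChain4j work internally: it scans every entry (exact search, which an ANN index approximates on large data), uses raw cosine similarity as the score, leaves out the metadata filter, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Exhaustive nearest-neighbor search with maxResults and minScore.
class VectorSearchSketch {

    static final class Entry {
        final String text;
        final double[] vector;
        Entry(String text, double[] vector) { this.text = text; this.vector = vector; }
    }

    static final class Match {
        final String text;
        final double score;
        Match(String text, double score) { this.text = text; this.score = score; }
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores every entry, drops those below minScore, and returns the best maxResults.
    static List<Match> search(List<Entry> store, double[] query, int maxResults, double minScore) {
        List<Match> matches = new ArrayList<>();
        for (Entry entry : store) {
            double score = cosine(query, entry.vector);
            if (score >= minScore) {
                matches.add(new Match(entry.text, score));
            }
        }
        matches.sort(Comparator.comparingDouble((Match m) -> m.score).reversed());
        return new ArrayList<>(matches.subList(0, Math.min(maxResults, matches.size())));
    }

    public static void main(String[] args) {
        // Made-up two-dimensional vectors standing in for stored embeddings.
        List<Entry> store = Arrays.asList(
                new Entry("double-scoop joke", new double[]{1.0, 0.0}),
                new Entry("melting joke", new double[]{0.8, 0.6}),
                new Entry("unrelated text", new double[]{0.0, 1.0}));
        for (Match match : search(store, new double[]{1.0, 0.0}, 2, 0.5)) {
            System.out.printf("%s (score %.2f)%n", match.text, match.score);
        }
    }
}
```

Setting a minimum score is what prevents the search from returning an irrelevant fragment when nothing in the store is close to the query.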
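3.2.7 Interacting with the LLM
Define a prompt template that tells the LLM to answer the question (question) from the given context (context, that is, the relevant knowledge retrieved from the vector store):
Answer based on the following information:\n{{context}}\nQuestion:\n{{question}}
PromptTemplate promptTemplate = PromptTemplate.from("Answer based on the following information:\n" +
        "{{context}}\n" +
        "Question:\n" +
        "{{question}}");
// Fill the two template variables
Map<String, Object> variables = new HashMap<>();
variables.put("context", textSegment.text());
variables.put("question", QUESTION);
Prompt prompt = promptTemplate.apply(variables);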
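What a `{{variable}}` template does can be shown with a small stand-alone renderer. This is a hypothetical sketch of the idea using a regular expression, not LangChain4j's PromptTemplate implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Replaces {{name}} placeholders with values from a map.
class PromptTemplateSketch {

    private static final Pattern VARIABLE = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    static String render(String template, Map<String, Object> variables) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!variables.containsKey(name)) {
                throw new IllegalArgumentException("Missing variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated as regex syntax
            String value = String.valueOf(variables.get(name));
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("context", "The clerk said it was a double-scoop ice cream.");
        variables.put("question", "Tell me a joke about ice cream");
        System.out.println(render(
                "Answer based on the following information:\n{{context}}\nQuestion:\n{{question}}",
                variables));
    }
}
```

Failing fast on a missing variable is a design choice in this sketch: a prompt that silently keeps `{{context}}` in it would reach the model without the retrieved knowledge.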
Put the prompt into the request sent to the LLM:
OpenAiChatModel openAiChatModel = OpenAiChatModel.builder()
        .apiKey(API_KEY).baseUrl(BASE_URL)
        .modelName(OPEN_AI_MODULE_NAME).temperature(TEMPERATURE_NO_RANDOM)
        .build();
UserMessage userMessage = prompt.toUserMessage();
Response<AiMessage> aiMessageResponse = openAiChatModel.generate(userMessage);
// content() is an AiMessage; text() extracts the reply string
String response = aiMessageResponse.content().text();
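3.3 Testing and Verification
The run below uses the Chinese jokes in 笑話.txt; the log messages and joke texts are shown translated into English.
17:47:57.060 [main] INFO cn.jdl.tech_and_data.ka.Utils - Current model: gpt-4o
17:47:57.067 [main] INFO cn.jdl.tech_and_data.ka.RagChat - [without RAG]userMessage=UserMessage { name = null contents = [TextContent { text = "Please tell me a joke about ice cream" }] }
17:48:00.129 [main] INFO cn.jdl.tech_and_data.ka.RagChat - Without RAG, asking the AI directly: [Please tell me a joke about ice cream]
The answer is: [Sure! Here is a joke about ice cream, I hope you like it:
Why is ice cream always so happy?
Because it has lots of "sweet" friends!]
17:48:00.269 [main] INFO cn.jdl.tech_and_data.ka.RagChat - version of chroma db is : 0.5.0
17:48:02.550 [main] INFO cn.jdl.tech_and_data.ka.RagChat - text:They say summer and ice cream go better together? No such thing: the ice cream melted in my mouth long before I could even taste it!, embeddingId=bd89a581-e764-47fc-967c-68899d6be904
17:48:03.811 [main] INFO cn.jdl.tech_and_data.ka.RagChat - text:Once, a girl wanted an ice cream so badly and could not get one that she cried out in front of the supermarket's ice cream shelf: "Oh heavens, I long for you! Will you grant me an ice cream?" Guess what? She got the ice cream, but when she got home she found it was completely empty inside!, embeddingId=6f84ab62-635f-4bd0-a3d2-ff55d00c4b01
17:48:05.534 [main] INFO cn.jdl.tech_and_data.ka.RagChat - text:You cannot imagine how cold ice cream is; calm and composed, it says in my mouth: "I am your heartache.", embeddingId=c9d15213-2146-4f7b-8b23-c95d858b4ded
17:48:06.843 [main] INFO cn.jdl.tech_and_data.ka.RagChat - text:Half-price ice cream. Mom took Xiao Ming to buy ice cream, and Xiao Ming happily picked a flavor he liked. At checkout, Mom found that the price was doubled and asked the clerk why. The clerk said: "This is a double-scoop ice cream with two flavors." Xiao Ming suddenly understood: "No wonder it has two eyes!", embeddingId=042ee87a-bc0c-4ba2-aea5-dbef45c32d33
17:48:08.452 [main] INFO cn.jdl.tech_and_data.ka.RagChat - text:Once the fridge at home broke down and all the ice cream inside began to melt. I came up with a great idea: put all the ice cream scoops in one bowl and put it in the freezer. I ended up with one giant ice cream ball mixture, quite a peculiar dessert!
What does a qilin (麒麟) become when it flies to the North Pole? Answer: ice cream (冰淇淋, "bing-qilin")., embeddingId=e7436694-821e-4559-89d3-2f09be2e172a
17:48:08.453 [main] INFO cn.jdl.tech_and_data.ka.RagChat - Querying the vector store with keyword: [two eyes]
17:48:10.252 [main] INFO cn.jdl.tech_and_data.ka.RagChat - Matched document matchedTextSegment: TextSegment { text = "Half-price ice cream. Mom took Xiao Ming to buy ice cream, and Xiao Ming happily picked a flavor he liked. At checkout, Mom found that the price was doubled and asked the clerk why. The clerk said: "This is a double-scoop ice cream with two flavors." Xiao Ming suddenly understood: "No wonder it has two eyes!"" metadata = {absolute_directory_path=/Users/qihaizhi/newJavaEngineerOrientation/LangChain/core/target/classes, file_name=笑話.txt, index=3} }
17:48:10.258 [main] INFO cn.jdl.tech_and_data.ka.RagChat - [with RAG]userMessage=UserMessage { name = null contents = [TextContent { text = "Please answer based on the following information, using as many of its words as possible:
Half-price ice cream. Mom took Xiao Ming to buy ice cream, and Xiao Ming happily picked a flavor he liked. At checkout, Mom found that the price was doubled and asked the clerk why. The clerk said: "This is a double-scoop ice cream with two flavors." Xiao Ming suddenly understood: "No wonder it has two eyes!"
Question:
Please tell me a joke about ice cream" }] }
17:48:16.663 [main] INFO cn.jdl.tech_and_data.ka.RagChat - With RAG, asking the AI: [Please tell me a joke about ice cream],
The answer is: [One day, Mom took Xiao Ming to buy ice cream. Xiao Ming happily picked a flavor he liked. At checkout, Mom found that the price was doubled and asked the clerk why. The clerk said: "This is a double-scoop ice cream with two flavors." Xiao Ming suddenly understood: "No wonder it has two eyes!"
Isn't this joke fun? Ice cream is not only delicious, it also brings joy!]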
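4. Summary and Outlook
Through hands-on examples, this article has shown in detail how to use the LangChain4j framework to build an LLM application based on RAG over local knowledge.
With continued exploration and optimization, RAG will realize more of its potential in LLM applications and provide smarter, more efficient solutions for a wide range of business scenarios.
We hope this article gives Java engineers a clear practical guide and helps them go further on the road of LLM development.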