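When tuning database query performance, the first optimization to try is usually indexing; more involved measures such as load balancing, read/write splitting, and distributed horizontal/vertical sharding of databases and tables come later, as requirements dictate.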

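An index speeds up retrieval by storing redundant information. It trades space for time and slows down writes, so the choice of fields to index matters a great deal.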
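To make the space-for-time trade-off concrete, here is a minimal sketch in plain Java. It is not Neo4j or Lucene code, and `TinyIndex` and its methods are hypothetical names used only for illustration. The class keeps a redundant map from value to record ids, so a lookup avoids a full scan, while every write has to maintain that map as well.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy exact-match index: value -> ids of the records holding that value. */
class TinyIndex {
    private final Map<Long, String> records = new HashMap<>();      // primary data
    private final Map<String, Set<Long>> byValue = new HashMap<>(); // redundant copy (the index)

    /** Each write pays for an extra update of the index. */
    void put(long id, String value) {
        String old = records.put(id, value);
        if (old != null) {
            Set<Long> ids = byValue.get(old);
            ids.remove(id);
            if (ids.isEmpty()) byValue.remove(old);
        }
        byValue.computeIfAbsent(value, k -> new TreeSet<>()).add(id);
    }

    /** Lookup without the index: every record is inspected. */
    Set<Long> scan(String value) {
        Set<Long> hits = new TreeSet<>();
        for (Map.Entry<Long, String> e : records.entrySet()) {
            if (e.getValue().equals(value)) hits.add(e.getKey());
        }
        return hits;
    }

    /** Lookup with the index: a single hash probe. */
    Set<Long> lookup(String value) {
        return byValue.getOrDefault(value, Collections.emptySet());
    }
}
```

Both `scan` and `lookup` return the same ids; the difference is that `lookup` does not grow with the number of records, at the cost of the extra memory and the extra work in `put`.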

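Neo4j can create an index on nodes with a given label; when a property of a matching node is added or updated, the index is updated automatically. Neo4j indexes are implemented with Lucene by default (this is customizable: the Spatial Index, for example, uses its own RTree index). An index created with the default settings supports only exact matching (get). Fuzzy queries (query) need a full-text index, which lets you control how Lucene tokenizes text behind the scenes.
The default analyzers for Neo4j full-text indexes target Western languages: exact queries use Lucene's KeywordAnalyzer by default, and fulltext queries use a white-space tokenizer. Whitespace splitting and case folding mean little for Chinese, so Chinese text needs a Chinese tokenizer plugged in, such as IK Analyzer or Ansj. Deep-learning-based segmentation systems such as 梁廠長's pullword are more capable still.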
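The following sketch imitates the two default behaviours with the Java standard library only, to show why they fall short for Chinese. It does not use Lucene's classes; `DefaultTokenizers`, `keyword`, and `whitespace` are hypothetical names, and the lower-casing step is an assumption about what "case handling" amounts to here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Stdlib imitation of the two default tokenization behaviours (not Lucene's classes). */
class DefaultTokenizers {
    /** "exact" style: the whole value is a single token. */
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    /** "fulltext" style: lower-case the text, then split on whitespace. */
    static List<String> whitespace(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) return Collections.emptyList();
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(whitespace("Suzhou Education Co"));
        System.out.println(whitespace("蘇州教育公司"));
    }
}
```

The English phrase splits into three tokens, while the Chinese phrase contains no whitespace and stays a single token, so a term query for 教育 alone would not match it. That is the gap a Chinese tokenizer fills.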

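This article uses the widely used IK Analyzer as an example to show how to build a full-text index on a property in Neo4j and run fuzzy queries against it.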

The IKAnalyzer tokenizer

IKAnalyzer is an open-source, lightweight Chinese word segmentation toolkit written in Java.

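IKAnalyzer 3.0 features:

Uses its own "forward-iteration finest-granularity segmentation algorithm", supports two segmentation modes (fine-grained and maximum word length), and processes text at 830,000 characters per second (1600 KB/s).
Uses a multi-sub-processor analysis mode that handles English letters, digits, and Chinese vocabulary, and is compatible with Korean and Japanese characters.
Optimized dictionary storage with a smaller memory footprint; supports user-defined dictionary extensions.
IKQueryParser, a query parser optimized for Lucene full-text search (strongly recommended by its author), introduces simple search expressions and uses an ambiguity-analysis algorithm to optimize how query keywords are combined, which can greatly improve Lucene's hit rate.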
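The difference between the fine-grained and maximum-word-length modes can be sketched with a toy dictionary segmenter. This is a deliberately simplified illustration and not IKAnalyzer's implementation (which adds sub-processors, ambiguity handling, and more); `ToySegmenter` and the sample dictionary are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy dictionary segmenter; a simplified illustration, not IKAnalyzer's algorithm. */
class ToySegmenter {
    private final Set<String> dict;
    private final int maxLen; // length of the longest dictionary word

    ToySegmenter(Collection<String> words) {
        dict = new HashSet<>(words);
        int m = 1;
        for (String w : words) m = Math.max(m, w.length());
        maxLen = m;
    }

    /** Fine-grained mode: at each position, emit every dictionary word that starts there. */
    List<String> fineGrained(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (int len = 1; len <= maxLen && i + len <= text.length(); len++) {
                String cand = text.substring(i, i + len);
                if (dict.contains(cand)) out.add(cand);
            }
        }
        return out;
    }

    /** Max-word-length mode: greedily take the longest dictionary word, else one character. */
    List<String> maxWordLength(String text) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int take = 1; // fall back to a single character
            for (int len = Math.min(maxLen, text.length() - i); len > 1; len--) {
                if (dict.contains(text.substring(i, i + len))) {
                    take = len;
                    break;
                }
            }
            out.add(text.substring(i, i + take));
            i += take;
        }
        return out;
    }
}
```

With the dictionary {蘇州, 教育, 公司, 教育公司} and the input 蘇州教育公司, `fineGrained` yields 蘇州, 教育, 教育公司, 公司, while `maxWordLength` yields 蘇州, 教育公司. Fine-grained output favours recall when indexing; longer words favour precision.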

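IK Analyzer is not yet published to a Maven repository, so for now it has to be downloaded by hand and installed into the local repository. When I have time I plan to set up a private Maven repository on GitHub and upload toolkits like this one that are missing from Maven Central.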

Custom user dictionaries for IKAnalyzer

Dictionary file

A custom dictionary is a file with the .dic extension, and it must be saved as UTF-8 without a BOM.
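A small helper can guard against the BOM problem. The sketch below uses only the Java standard library; `DicFiles` and its methods are hypothetical names, and the one-word-per-line layout is an assumption about the dictionary format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Helpers for writing and checking .dic files (hypothetical utility, stdlib only). */
class DicFiles {
    /** Writes one word per line as UTF-8; Files.write does not emit a BOM. */
    static void writeDictionary(Path file, List<String> words) throws IOException {
        Files.write(file, words, StandardCharsets.UTF_8);
    }

    /** Returns true if the file starts with the UTF-8 BOM bytes EF BB BF. */
    static boolean hasUtf8Bom(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file); // fine for small dictionary files
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }
}
```

Running `hasUtf8Bom` on a dictionary saved by a Windows editor is a quick way to find out whether the file needs to be re-saved.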

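Dictionary configuration

A note on where the dictionaries and the IKAnalyzer.cfg.xml configuration file go: IKAnalyzer.cfg.xml must sit in the src root directory. The dictionaries can be placed anywhere, as long as their paths are configured correctly in IKAnalyzer.cfg.xml. With the configuration below, ext.dic and stopword.dic should be in the same directory.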

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <entry key="ext_dict">/ext.dic</entry>
    <entry key="ext_stopwords">/stopword.dic</entry>
</properties>
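As its DOCTYPE shows, the configuration file is a standard Java XML properties document, so its contents can be checked with `java.util.Properties.loadFromXML` before handing it to the analyzer. The sketch below is only a sanity-check helper, not part of IKAnalyzer; `IkConfigReader` is a hypothetical name, and the `ext_dict` key and the semicolon-separated list of paths are assumptions about the configuration format.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Reads dictionary paths out of an XML properties document (hypothetical helper). */
class IkConfigReader {
    /** Returns the non-empty, trimmed paths listed under the given key. */
    static List<String> paths(String xml, String key) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> out = new ArrayList<>();
        // Assumption: multiple paths are separated by semicolons.
        for (String part : props.getProperty(key, "").split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) out.add(trimmed);
        }
        return out;
    }
}
```

Calling `paths(xml, "ext_stopwords")` on the configuration above would return the single entry `/stopword.dic`; a missing key simply yields an empty list.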

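Building the Neo4j full-text index

Specify IKAnalyzer as the analyzer Lucene uses for tokenization, then build a full-text index on the chosen property of all the nodes.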

@Override
public void createAddressNodeFullTextIndex() {
    try (Transaction tx = graphDBService.beginTx()) {
        IndexManager index = graphDBService.index();
        // Legacy Lucene index configured to use IKAnalyzer as its analyzer
        Index<Node> addressNodeFullTextIndex = index.forNodes("addressNodeFullTextIndex",
                MapUtil.stringMap(IndexManager.PROVIDER, "lucene", "analyzer", IKAnalyzer.class.getName()));
        ResourceIterator<Node> nodes = graphDBService.findNodes(DynamicLabel.label("AddressNode"));
        while (nodes.hasNext()) {
            Node node = nodes.next();
            // Build the full-text index on the text property
            Object text = node.getProperty("text", null);
            if (text != null) { // skip nodes without a text property; a null value cannot be indexed
                addressNodeFullTextIndex.add(node, "text", text);
            }
        }
        tx.success();
    }
}

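Testing the Neo4j full-text index

Single-keyword queries (e.g. '有限公司') and multi-keyword fuzzy queries (e.g. '蘇州教育公司') both work out of the box, and the results come back already sorted by relevance.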

package uadb.tr.neodao.test;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.index.IndexHits;
import org.neo4j.graphdb.index.IndexManager;
import org.neo4j.helpers.collection.MapUtil;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import org.wltea.analyzer.lucene.IKAnalyzer;
import com.lt.uadb.tr.entity.adtree.AddressNode;
import com.lt.util.serialize.JsonUtil;

/**
 * AddressNodeNeoDaoTest
 *
 * @author geosmart
 */
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath:app.neo4j.cfg.xml" })
public class AddressNodeNeoDaoTest {

    @Autowired
    GraphDatabaseService graphDBService;

    @Test
    public void test_selectAddressNodeByFullTextIndex() {
        try (Transaction tx = graphDBService.beginTx()) {
            IndexManager index = graphDBService.index();
            // Open the index with the same analyzer configuration used to build it
            Index<Node> addressNodeFullTextIndex = index.forNodes("addressNodeFullTextIndex",
                    MapUtil.stringMap(IndexManager.PROVIDER, "lucene", "analyzer", IKAnalyzer.class.getName()));
            // Multi-keyword query against the text property
            IndexHits<Node> foundNodes = addressNodeFullTextIndex.query("text", "蘇州教育公司");
            for (Node node : foundNodes) {
                AddressNode entity = JsonUtil.ConvertMap2POJO(node.getAllProperties(), AddressNode.class, false, true);
                System.out.println(entity.getAll位址實全稱());
            }
            tx.success();
        }
    }
}
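To give a feel for why a multi-keyword query returns hits ordered by relevance, here is a toy ranking sketch in plain Java. It is an assumption-laden simplification: documents are scored by how many distinct query terms they contain, which is not Lucene's actual scoring formula, and `ToyRanker` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy multi-term search: ranks documents by how many query terms they contain. */
class ToyRanker {
    private final Map<String, Set<Integer>> postings = new HashMap<>(); // term -> doc ids

    /** Indexes a document that has already been split into terms. */
    void add(int docId, List<String> terms) {
        for (String t : terms) {
            postings.computeIfAbsent(t, k -> new HashSet<>()).add(docId);
        }
    }

    /** Returns doc ids by descending match count; ties are broken by ascending id. */
    List<Integer> search(List<String> queryTerms) {
        Map<Integer, Integer> score = new HashMap<>();
        for (String t : new HashSet<>(queryTerms)) {
            Set<Integer> docs = postings.getOrDefault(t, Collections.emptySet());
            for (int doc : docs) {
                score.merge(doc, 1, Integer::sum);
            }
        }
        List<Integer> ids = new ArrayList<>(score.keySet());
        ids.sort((a, b) -> {
            int byScore = Integer.compare(score.get(b), score.get(a)); // higher score first
            return byScore != 0 ? byScore : Integer.compare(a, b);     // then lower id first
        });
        return ids;
    }
}
```

A document containing all of 蘇州, 教育, and 公司 ranks ahead of one containing only two of them, and documents sharing no term with the query are not returned at all.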

Querying the custom full-text index from Cypher

Regular-expression query

profile
match (a:AddressNode{ruleabbr:'TOW',text:'唯亭鎮'})<-[r:BELONGTO]-(b:AddressNode{ruleabbr:'STR'})
where b.text=~ '金陵.*'
return a,b

Full-text index query

profile
START b=node:addressNodeFullTextIndex("text:金陵*")
match (a:AddressNode{ruleabbr:'TOW',text:'唯亭鎮'})<-[r:BELONGTO]-(b:AddressNode)
where b.ruleabbr='STR'
return a,b
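One reason a prefix query against an index can be cheaper than a regular-expression filter is sketched below in plain Java. This is only an illustration of the general idea (a sorted term dictionary turns "starts with" into a range lookup) and makes no claim about how Neo4j or Lucene execute these two queries internally; `PrefixVsRegex` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.NavigableSet;
import java.util.regex.Pattern;

/** Contrasts a regex scan over all values with a prefix range lookup in sorted terms. */
class PrefixVsRegex {
    /** Regex approach: every value has to be tested against the pattern. */
    static List<String> regexScan(Collection<String> values, String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String v : values) {
            if (p.matcher(v).matches()) hits.add(v); // full match, like =~ in Cypher
        }
        return hits;
    }

    /** Index approach: in a sorted set, "starts with prefix" is a contiguous range. */
    static List<String> prefixLookup(NavigableSet<String> sortedTerms, String prefix) {
        // Terms starting with prefix lie in [prefix, prefix + '\uffff'),
        // assuming no term contains U+FFFF right after the prefix.
        String upper = prefix + Character.MAX_VALUE;
        return new ArrayList<>(sortedTerms.subSet(prefix, true, upper, false));
    }
}
```

For a `TreeSet` of street names, `prefixLookup(terms, "金陵")` and `regexScan(terms, "金陵.*")` return the same names, but the range lookup only touches the matching slice of the sorted set.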

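Combining exact and fulltext indexes with LegacyIndex

For nodes labeled AddressNode, the text property is indexed differently depending on the category given by the node's ruleabbr property: addressnode_fulltext_index covers province -> city -> district/county -> township/subdistrict -> street, road, or lane / residential estate, while addressnode_exact_index covers house number -> building number -> unit number -> floor number -> room number.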

START a=node:addressnode_fulltext_index("text:商業街"),b=node:addressnode_exact_index("text:二期19")
match (a:AddressNode{ruleabbr:'STR'})-[r:BELONGTO]-(b:AddressNode{ruleabbr:'TAB'})
return a,b limit 10

Original article: http://neo4j.com.cn/topic/58184ea2cdf6c5bf145675c3

