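I suspect everyone's biggest question is this: there are already so many Query implementations, so why design yet another one called FunctionQuery? What was it designed for, or put differently, what problem does it solve? Let's see how the source code itself describes FunctionQuery: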
![](https://img.laitimes.com/img/_0nNw4CM6IyYiwiM6ICdiwiInBnauI2Y2QTOxATN4MzY10iNklTYtMTNjNTL1cjM30SN1YmNmVmYy8CX4QjN08CX3ATMw8CX05WZth2YhRHdh9CXkF2bsBXdvwVbvNmLllXZ0lmLywGZvw1LcpDc0RHaiojIsJye.jpg)
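In other words, it returns a score for each document based on a ValueSource (the value-source score). So what is a ValueSource? Here is what the comment in the ValueSource source says:
A ValueSource is used to instantiate FunctionValues for a given IndexReader. And what is FunctionValues?
From the methods it declares, FunctionValues provides ways to fetch, by document id, the value of a DocValues field as various types. What are those returned values used for? Look through the FunctionQuery source and you will find:
From the figures above we can see that a FunctionQuery must be constructed with a ValueSource. The inner class AllScorer of FunctionQuery then uses that ValueSource to instantiate FunctionValues, and when the FunctionQuery score is computed, the DocValues field value is fetched through FunctionValues and multiplied by the FunctionQuery weight to produce the score:
float score = qWeight * vals.floatVal(doc);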
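The scoring step can be reduced to a few lines. The sketch below is not Lucene code: `FunctionScoreSketch` and `scoreAll` are hypothetical names, and an `IntToDoubleFunction` stands in for FunctionValues. It only illustrates the idea that each document's score is the query weight multiplied by a per-document value.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical stand-in for the FunctionQuery scoring loop; not Lucene code.
class FunctionScoreSketch {

    // Scores every document as queryWeight * value(doc), mirroring
    // "float score = qWeight * vals.floatVal(doc)".
    static float[] scoreAll(float queryWeight, IntToDoubleFunction values, int maxDoc) {
        float[] scores = new float[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            scores[doc] = queryWeight * (float) values.applyAsDouble(doc);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up column of per-document values (what doc values would supply).
        float[] column = {0.5f, 1.0f, 2.0f};
        float[] scores = scoreAll(2.0f, doc -> column[doc], column.length);
        for (int doc = 0; doc < scores.length; doc++) {
            System.out.println("doc " + doc + " -> " + scores[doc]);
        }
    }
}
```

Because the text score plays no part here, a FunctionQuery used on its own ranks documents purely by the field-derived value.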
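So what role does ValueSource play here? Why not let FunctionQuery build the FunctionValues directly, instead of introducing ValueSource as an intermediary?
Because FunctionQuery should be thread-safe, that is, several searches should be able to share the same FunctionQuery instance. If FunctionValues depended directly on FunctionQuery, a DocValues field value that one thread obtained through FunctionValues could be modified by another thread. So ValueSource was introduced: each FunctionQuery holds one ValueSource, and the ValueSource creates the FunctionValues (a fresh one per index segment for each search), because the correctness of the DocValues field values affects the final score. Caching is the other reason: every time FunctionValues loads a DocValues field value it still reads through the IndexReader, which means disk I/O, and the number of disk I/O operations is a real performance killer. That is why CachingDoubleValueSource was designed as a wrapper around ValueSource. CachingDoubleValueSource still seems to live in a contrib module, though, and I don't know whether the next version will consider adding caching to ValueSource itself.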
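The caching idea can be sketched as a small decorator. This is not CachingDoubleValueSource: `CachingValuesSketch`, `slowRead` and `cached` are hypothetical names, the "slow" read is simulated, and the cache is a plain `HashMap`, so the sketch assumes one wrapper per search thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch of a caching decorator; it is not CachingDoubleValueSource.
class CachingValuesSketch {

    // Counts how many times the "expensive" underlying lookup ran.
    static int reads = 0;

    // Simulates a costly per-document read (for example one that touches disk).
    static double slowRead(int doc) {
        reads++;
        return doc * 1.5;
    }

    // Wraps a lookup so each document's value is computed at most once.
    // The HashMap is not thread-safe: create one wrapper per search.
    static IntToDoubleFunction cached(IntToDoubleFunction delegate) {
        Map<Integer, Double> cache = new HashMap<>();
        return doc -> cache.computeIfAbsent(doc, d -> delegate.applyAsDouble(d));
    }

    public static void main(String[] args) {
        IntToDoubleFunction values = cached(CachingValuesSketch::slowRead);
        values.applyAsDouble(4);
        values.applyAsDouble(4); // second call is served from the cache
        System.out.println("underlying reads: " + reads);
    }
}
```

Keeping the cache inside the object returned by `cached` follows the same reasoning as the ValueSource/FunctionValues split: the shared, reusable part holds no per-search state.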
Constructing a ValueSource is simple:
public DoubleFieldSource(String field) {
    super(field);
}
You only need to supply a field name. Note, however, that the field must be a DocValues field; it cannot be an ordinary StringField, TextField, IntField, FloatField or LongField.
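So what problems can FunctionQuery solve? An example: suppose you have indexed N products, and when users search by keyword you want the most recently listed products shown first, and then products ordered by how well they match the keyword, highest first. In short, recently listed items should rank near the top, and so should high-scoring ones.
Below is a FunctionQuery usage example that simulates a scenario of this kind:
The older a book's publication date, the more its weight factor decays, day by day, so that new books automatically rise to the top.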
package com.yida.framework.lucene5.function;

import java.io.IOException;
import java.util.Map;

import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.queries.function.FunctionValues;
import org.apache.lucene.queries.function.valuesource.FieldCacheSource;

import com.yida.framework.lucene5.util.score.ScoreUtils;

/**
 * Custom ValueSource: computes the weight factor for date decay
 * (the more recent the date, the higher the weight).
 * @author lanxiaowei
 */
public class DateDampingValueSouce extends FieldCacheSource {
    // Current time, captured per instance so that instances do not overwrite each other
    private final long now;

    public DateDampingValueSouce(String field) {
        super(field);
        // Initialize the current time
        now = System.currentTimeMillis();
    }

    /**
     * The context map holds the IndexSearcher; fetch it with context.get("searcher").
     */
    @Override
    public FunctionValues getValues(Map context, LeafReaderContext leafReaderContext)
            throws IOException {
        // Per-segment doc values of the date field (milliseconds since the epoch)
        final NumericDocValues numericDocValues = DocValues.getNumeric(leafReaderContext.reader(), field);
        return new FunctionValues() {
            @Override
            public float floatVal(int doc) {
                return ScoreUtils.getNewsScoreFactor(now, numericDocValues, doc);
            }

            @Override
            public int intVal(int doc) {
                return (int) ScoreUtils.getNewsScoreFactor(now, numericDocValues, doc);
            }

            @Override
            public String toString(int doc) {
                return description() + '=' + intVal(doc);
            }
        };
    }
}
package com.yida.framework.lucene5.util.score;

import org.apache.lucene.index.NumericDocValues;

import com.yida.framework.lucene5.util.Constans;

/**
 * Computes decay factors (in units of days).
 */
public class ScoreUtils {
    /** Decay factors, indexed by age in days */
    private static float[] daysDampingFactor = new float[120];
    /** Demotion multiplier applied at each step */
    private static float demoteBoost = 0.9f;

    static {
        daysDampingFactor[0] = 1;
        // First week: demote the weight once per day
        for (int i = 1; i < 7; i++) {
            daysDampingFactor[i] = daysDampingFactor[i - 1] * demoteBoost;
        }
        // Days 7 to 30: demote once per week
        for (int i = 7; i < 31; i++) {
            daysDampingFactor[i] = daysDampingFactor[i / 7 * 7 - 1]
                    * demoteBoost;
        }
        // Day 31 onward: demote once per 31-day block
        for (int i = 31; i < daysDampingFactor.length; i++) {
            daysDampingFactor[i] = daysDampingFactor[i / 31 * 31 - 1]
                    * demoteBoost;
        }
    }

    // Looks up the decay factor for the given age in days
    private static float dayDamping(int delta) {
        float factor = delta < daysDampingFactor.length ? daysDampingFactor[delta]
                : daysDampingFactor[daysDampingFactor.length - 1];
        System.out.println("delta:" + delta + "-->" + "factor:" + factor);
        return factor;
    }

    public static float getNewsScoreFactor(long now, NumericDocValues numericDocValues, int docId) {
        long time = numericDocValues.get(docId);
        return getNewsScoreFactor(now, time);
    }

    public static float getNewsScoreFactor(long now, long time) {
        float factor = 1;
        int day = (int) (time / Constans.DAY_MILLIS);
        int nowDay = (int) (now / Constans.DAY_MILLIS);
        System.out.println(day + ":" + nowDay + ":" + (nowDay - day));
        // If the given date is earlier than today, compute the age in days and pass it to dayDamping
        if (day < nowDay) {
            factor = dayDamping(nowDay - day);
        } else if (day > nowDay) {
            // The given date is later than today, i.e. a date in the future
            factor = Float.MIN_VALUE;
        } else if (now - time <= Constans.HALF_HOUR_MILLIS && now >= time) {
            // Same day and within the past half hour: the weight factor is set to 2
            factor = 2;
        }
        return factor;
    }

    public static float getNewsScoreFactor(long time) {
        long now = System.currentTimeMillis();
        return getNewsScoreFactor(now, time);
    }
}
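To see what the decay table looks like, here is a standalone re-creation that needs no Lucene classes. `DecayTableSketch`, `buildTable` and `factor` are names made up for this sketch, and it assumes that the loops whose bodies are cut short in the listing multiply by the same 0.9 demotion value as the complete ones.

```java
// Standalone re-creation of the decay table, for illustration only.
// Assumption: every step multiplies by the same 0.9 demotion value.
class DecayTableSketch {

    static final float DEMOTE_BOOST = 0.9f;
    static final float[] TABLE = buildTable(120);

    static float[] buildTable(int days) {
        float[] table = new float[days];
        table[0] = 1f;
        // Days 1 to 6: one demotion per day.
        for (int i = 1; i < 7 && i < days; i++) {
            table[i] = table[i - 1] * DEMOTE_BOOST;
        }
        // Days 7 to 30: one demotion per week, flat within the week.
        for (int i = 7; i < 31 && i < days; i++) {
            table[i] = table[i / 7 * 7 - 1] * DEMOTE_BOOST;
        }
        // Day 31 onward: one demotion per 31-day block.
        for (int i = 31; i < days; i++) {
            table[i] = table[i / 31 * 31 - 1] * DEMOTE_BOOST;
        }
        return table;
    }

    // Ages beyond the table reuse the last entry; ageInDays must be >= 0.
    static float factor(int ageInDays) {
        return TABLE[Math.min(ageInDays, TABLE.length - 1)];
    }

    public static void main(String[] args) {
        int[] ages = {0, 1, 6, 7, 13, 14, 31, 119, 500};
        for (int age : ages) {
            System.out.println(age + " days -> " + factor(age));
        }
    }
}
```

The table is a step function: it drops daily in the first week, weekly for the rest of the first month and roughly monthly afterwards, and it never reaches zero, so old documents are demoted rather than excluded.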
package com.yida.framework.lucene5.function;

import java.io.IOException;
import java.nio.file.Paths;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.LongField;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.Term;
import org.apache.lucene.queries.CustomScoreQuery;
import org.apache.lucene.queries.function.FunctionQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * FunctionQuery test
 * @author lanxiaowei
 */
public class FunctionQueryTest {
    private static final DateFormat formate = new SimpleDateFormat("yyyy-MM-dd");

    public static void main(String[] args) throws Exception {
        String indexDir = "c:/lucenedir-functionquery";
        Directory directory = FSDirectory.open(Paths.get(indexDir));
        //System.out.println(0.001953125f * 100000000 * 0.001953125f / 100000000);
        // Create the test index [note: create it only once; comment this line out before the second run]
        //createIndex(directory);
        IndexReader reader = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);
        // Create an ordinary TermQuery
        TermQuery termQuery = new TermQuery(new Term("title", "solr"));
        // Create a FunctionQuery from the custom ValueSource that computes the date-decay factor
        FunctionQuery functionQuery = new FunctionQuery(new DateDampingValueSouce("publishdate"));
        // Custom score query [CustomScoreQuery combines an ordinary Query with a FunctionQuery;
        // how the two scores are combined into the final score is up to the user, by overriding]
        // The default implementation multiplies the ordinary query score by the FunctionQuery score;
        // you can override the default implementation yourself
        CustomScoreQuery customScoreQuery = new CustomScoreQuery(termQuery, functionQuery);
        // Create the sort [by score, descending]
        Sort sort = new Sort(new SortField[] {SortField.FIELD_SCORE});
        TopDocs topDocs = searcher.search(customScoreQuery, null, Integer.MAX_VALUE, sort, true, false);
        ScoreDoc[] docs = topDocs.scoreDocs;

        for (ScoreDoc scoreDoc : docs) {
            int docID = scoreDoc.doc;
            Document document = searcher.doc(docID);
            String title = document.get("title");
            String publishDateString = document.get("publishdate");
            System.out.println(publishDateString);

            long publishMills = Long.valueOf(publishDateString);
            Date date = new Date(publishMills);
            publishDateString = formate.format(date);
            float score = scoreDoc.score;
            System.out.println(docID + " " + title + " " +
                    publishDateString + " " + score);
        }

        reader.close();
        directory.close();
    }

    /**
     * Creates a Document object
     * @param title book title
     * @param publishDateString publication date of the book
     * @return
     * @throws ParseException
     */
    public static Document createDocument(String title, String publishDateString) throws ParseException {
        Date publishDate = formate.parse(publishDateString);
        Document doc = new Document();
        doc.add(new TextField("title", title, Field.Store.YES));
        // Stored copy of the date, used to print it in the results
        doc.add(new LongField("publishdate", publishDate.getTime(), Store.YES));
        // Doc-values copy of the date, read by the custom ValueSource
        doc.add(new NumericDocValuesField("publishdate", publishDate.getTime()));
        return doc;
    }

    // Create the test index
    public static void createIndex(Directory directory) throws ParseException, IOException {
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
        indexWriterConfig.setOpenMode(OpenMode.CREATE_OR_APPEND);
        IndexWriter writer = new IndexWriter(directory, indexWriterConfig);

        // Create the test documents
        Document doc1 = createDocument("lucene in action 2th edition", "2010-05-05");
        Document doc2 = createDocument("lucene progamming", "2008-07-11");
        Document doc3 = createDocument("lucene user guide", "2014-11-24");
        Document doc4 = createDocument("lucene5 cookbook", "2015-01-09");
        Document doc5 = createDocument("apache lucene api 5.0.0", "2015-02-25");
        Document doc6 = createDocument("apache solr 4 cookbook", "2013-10-22");
        Document doc7 = createDocument("administrating solr", "2015-01-20");
        Document doc8 = createDocument("apache solr essentials", "2013-08-16");
        Document doc9 = createDocument("apache solr high performance", "2014-06-28");
        Document doc10 = createDocument("apache solr api 5.0.0", "2015-03-02");

        writer.addDocument(doc1);
        writer.addDocument(doc2);
        writer.addDocument(doc3);
        writer.addDocument(doc4);
        writer.addDocument(doc5);
        writer.addDocument(doc6);
        writer.addDocument(doc7);
        writer.addDocument(doc8);
        writer.addDocument(doc9);
        writer.addDocument(doc10);
        writer.close();
    }
}
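The effect of the default combination (text score multiplied by the function score) on ranking can be simulated without an index. Everything in this sketch is hypothetical: `CombinedRankingSketch`, `Hit` and the scores are invented for illustration and are not output of the test above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Simulates "final score = text score * decay factor" followed by a descending sort.
// All names and numbers are made up for illustration.
class CombinedRankingSketch {

    static final class Hit {
        final String title;
        final float textScore;   // relevance from the ordinary query
        final float decayFactor; // value from the function query

        Hit(String title, float textScore, float decayFactor) {
            this.title = title;
            this.textScore = textScore;
            this.decayFactor = decayFactor;
        }

        // Product of the two scores, the combination described for the default case.
        float combined() {
            return textScore * decayFactor;
        }
    }

    // Returns a new list ordered by combined score, highest first.
    static List<Hit> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.combined()).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("older, strong match", 1.0f, 0.25f),
                new Hit("newer, weaker match", 0.5f, 1.0f));
        for (Hit hit : rank(hits)) {
            System.out.println(hit.title + " " + hit.combined());
        }
    }
}
```

With a product, a recent document with a weaker text match can outrank an older one with a stronger match; a different combination (for example a weighted sum) would need the scoring to be overridden, as the comments in the test mention.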
The result of running the test is shown in the figure:
Reposted from: http://iamyida.iteye.com/blog/2201291