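I suspect the biggest question everyone has is this: there are already so many Query implementations, so why design yet another one called FunctionQuery? What was it designed for, or put differently, what problem does it solve? Let's see how the source code itself describes FunctionQuery: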
![](https://img.laitimes.com/img/_0nNw4CM6IyYiwiM6ICdiwiInBnauI2Y2QTOxATN4MzY10iNklTYtMTNjNTL1cjM30SN1YmNmVmYy8CX4QjN08CX3ATMw8CX05WZth2YhRHdh9CXkF2bsBXdvwVbvNmLllXZ0lmLywGZvw1LcpDc0RHaiojIsJye.jpg)
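In other words, FunctionQuery returns a score for each document based on a ValueSource, i.e. the ValueSource score. So what is a ValueSource? Let's look at the Javadoc comment in the ValueSource source:
A ValueSource is used to instantiate FunctionValues for a given IndexReader. So what is FunctionValues, then?
From the methods it declares, we can see that FunctionValues provides methods for fetching values of various types from a DocValues field by document id. What are those returned values used for? Read through the FunctionQuery source and you will find:
From the screenshots above we can see that a FunctionQuery must be given a ValueSource at construction time. Its inner class AllScorer then uses the ValueSource to instantiate FunctionValues, and when the FunctionQuery score is computed, the DocValues field value is fetched through FunctionValues and multiplied by the FunctionQuery's weight to give the FunctionQuery score: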
    float score = qWeight * vals.floatVal(doc);
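To make this flow concrete, here is a minimal, dependency-free sketch of the same chain. It does not use Lucene: `SimpleValueSource`, `SimpleFunctionValues`, and `scoreAll` are hypothetical stand-ins for ValueSource, FunctionValues, and the scoring loop in AllScorer, and the per-document values come from a plain array rather than a DocValues field.

```java
final class FunctionScoreSketch {
    /** Hypothetical stand-in for FunctionValues: one value per document id. */
    interface SimpleFunctionValues {
        float floatVal(int doc);
    }

    /** Hypothetical stand-in for ValueSource: a factory for per-reader values. */
    interface SimpleValueSource {
        SimpleFunctionValues getValues(float[] segmentValues);
    }

    /** Mirrors the formula above: score = qWeight * vals.floatVal(doc). */
    static float[] scoreAll(SimpleValueSource source, float[] segmentValues, float qWeight) {
        // The source is asked once for the values of this "segment".
        SimpleFunctionValues vals = source.getValues(segmentValues);
        float[] scores = new float[segmentValues.length];
        for (int doc = 0; doc < segmentValues.length; doc++) {
            scores[doc] = qWeight * vals.floatVal(doc);
        }
        return scores;
    }

    public static void main(String[] args) {
        // A source that simply exposes the stored value of each document.
        SimpleValueSource source = values -> doc -> values[doc];
        float[] scores = scoreAll(source, new float[] {1.0f, 0.5f, 0.25f}, 2.0f);
        for (int doc = 0; doc < scores.length; doc++) {
            System.out.println("doc " + doc + " score " + scores[doc]);
        }
    }
}
```

The point of the sketch is only the division of labour: the query holds a source, the source hands out a values object, and the values object is what gets consulted per document.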
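So what role does ValueSource play here? Why not let FunctionQuery build the FunctionValues directly, instead of introducing ValueSource as an intermediary?
Because a FunctionQuery should be thread-safe, i.e. multiple searches should be able to share the same FunctionQuery instance. If FunctionValues were tied directly to the FunctionQuery, a DocValues field value that one thread obtained through FunctionValues might be modified by another thread, and the correctness of those values affects the final score. So a ValueSource is introduced: each FunctionQuery holds one ValueSource, and the ValueSource produces a FunctionValues on request (its getValues method takes a LeafReaderContext, so the values object is created per index segment rather than stored in the query). Caching is the other reason: every time a DocValues field value is loaded through FunctionValues, it is still read through the IndexReader, which can mean disk I/O, and the number of disk I/O operations is a real performance killer. That is why CachingDoubleValueSource was designed to wrap a ValueSource. CachingDoubleValueSource, however, still seems to live in a contrib module, and I don't know whether the next version will consider adding caching to ValueSource itself.
Constructing a ValueSource is simple: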
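The caching idea mentioned above can be illustrated without Lucene. The following is only a sketch of the general memoization pattern, not the actual CachingDoubleValueSource code: `DocValueReader`, `CountingReader`, and `CachingReader` are hypothetical names, and the expensive read is simulated by a counter.

```java
import java.util.HashMap;
import java.util.Map;

final class CachingValuesSketch {
    /** Hypothetical stand-in for a per-document value lookup. */
    interface DocValueReader {
        double doubleVal(int doc);
    }

    /** Simulates an expensive reader and counts how often it is hit. */
    static final class CountingReader implements DocValueReader {
        int reads = 0;

        @Override
        public double doubleVal(int doc) {
            reads++;
            return doc * 1.5; // placeholder value, not real index data
        }
    }

    /** Wraps another reader and remembers every value it has already returned. */
    static final class CachingReader implements DocValueReader {
        private final DocValueReader delegate;
        private final Map<Integer, Double> cache = new HashMap<>();

        CachingReader(DocValueReader delegate) {
            this.delegate = delegate;
        }

        @Override
        public double doubleVal(int doc) {
            // Only the first request for a document reaches the delegate.
            return cache.computeIfAbsent(doc, delegate::doubleVal);
        }
    }

    public static void main(String[] args) {
        CountingReader raw = new CountingReader();
        DocValueReader cached = new CachingReader(raw);
        cached.doubleVal(4);
        cached.doubleVal(4);
        System.out.println("underlying reads: " + raw.reads);
    }
}
```

Note that a plain HashMap cache like this one is not thread-safe, so in this sketch each search would need its own `CachingReader`, which is consistent with the idea of creating the values object per search rather than storing it in the shared query.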
    public DoubleFieldSource(String field) {
        super(field);
    }
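All you need to provide is a field name. Note, however, that the field must be a DocValues field (for example a NumericDocValuesField); it cannot be an ordinary StringField, TextField, IntField, FloatField, or LongField.
What kinds of problems can FunctionQuery solve? Here is an example: suppose you have indexed N products, and when users search by keyword you want the most recently listed products to be shown first, and then products ordered by how well they match the keyword, in descending order. In other words, recently listed products should rank higher, and so should products with higher relevance scores.
Below is a FunctionQuery usage example that simulates a scenario like this:
The older a book's publication date, the more its weight factor decays, day by day, so that newer books automatically rank higher.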
    package com.yida.framework.lucene5.function;

    import java.io.IOException;
    import java.util.Map;

    import org.apache.lucene.index.DocValues;
    import org.apache.lucene.index.LeafReaderContext;
    import org.apache.lucene.index.NumericDocValues;
    import org.apache.lucene.queries.function.FunctionValues;
    import org.apache.lucene.queries.function.valuesource.FieldCacheSource;

    import com.yida.framework.lucene5.util.score.ScoreUtils;

    /**
     * Custom ValueSource [computes a weight factor that decays with the date; the more recent the date, the higher the weight]
     * @author Lanxiaowei
     *
     */
    public class DateDampingValueSouce extends FieldCacheSource {
        // Current time
        private static long now;

        public DateDampingValueSouce(String field) {
            super(field);
            // Initialize the current time
            now = System.currentTimeMillis();
        }

        /**
         * The context map holds the IndexSearcher; retrieve it with context.get("searcher")
         */
        @Override
        public FunctionValues getValues(Map context, LeafReaderContext leafReaderContext)
                throws IOException {
            final NumericDocValues numericDocValues = DocValues.getNumeric(leafReaderContext.reader(), field);
            return new FunctionValues() {
                @Override
                public float floatVal(int doc) {
                    return ScoreUtils.getNewsScoreFactor(now, numericDocValues, doc);
                }

                @Override
                public int intVal(int doc) {
                    return (int) ScoreUtils.getNewsScoreFactor(now, numericDocValues, doc);
                }

                @Override
                public String toString(int doc) {
                    return description() + '=' + intVal(doc);
                }
            };
        }
    }
    package com.yida.framework.lucene5.util.score;

    import org.apache.lucene.index.NumericDocValues;

    import com.yida.framework.lucene5.util.Constans;

    /**
     * Computes the decay factor [in units of days]
     */
    public class ScoreUtils {
        /** Decay factors, indexed by age in days */
        private static float[] daysDampingFactor = new float[120];
        /** Demotion multiplier */
        private static float demoteBoost = 0.9f;

        static {
            daysDampingFactor[0] = 1;
            // First week: demote the weight once per day
            for (int i = 1; i < 7; i++) {
                daysDampingFactor[i] = daysDampingFactor[i - 1] * demoteBoost;
            }
            // Days 7 to 30: demote once per week
            for (int i = 7; i < 31; i++) {
                daysDampingFactor[i] = daysDampingFactor[i / 7 * 7 - 1]
                        * demoteBoost;
            }
            // Day 31 onward: demote once per 31 days
            for (int i = 31; i < daysDampingFactor.length; i++) {
                daysDampingFactor[i] = daysDampingFactor[i / 31 * 31 - 1]
                        * demoteBoost;
            }
        }

        // Look up the weight decay factor for the given difference in days
        private static float dayDamping(int delta) {
            float factor = delta < daysDampingFactor.length ? daysDampingFactor[delta]
                    : daysDampingFactor[daysDampingFactor.length - 1];
            System.out.println("delta:" + delta + "-->" + "factor:" + factor);
            return factor;
        }

        public static float getNewsScoreFactor(long now, NumericDocValues numericDocValues, int docId) {
            long time = numericDocValues.get(docId);
            return getNewsScoreFactor(now, time);
        }

        public static float getNewsScoreFactor(long now, long time) {
            float factor = 1;
            int day = (int) (time / Constans.DAY_MILLIS);
            int nowDay = (int) (now / Constans.DAY_MILLIS);
            System.out.println(day + ":" + nowDay + ":" + (nowDay - day));
            // If the given date is earlier than today, compute the difference in days and pass it to dayDamping
            if (day < nowDay) {
                factor = dayDamping(nowDay - day);
            } else if (day > nowDay) {
                // The given date is later than today, i.e. a date in the future
                factor = Float.MIN_VALUE;
            } else if (now - time <= Constans.HALF_HOUR_MILLIS && now >= time) {
                // Same day and the given time is within the past half hour: the weight factor is doubled (set to 2)
                factor = 2;
            }
            return factor;
        }

        public static float getNewsScoreFactor(long time) {
            long now = System.currentTimeMillis();
            return getNewsScoreFactor(now, time);
        }
    }
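To see what the decay table built in the static initializer looks like, the sketch below rebuilds it in a standalone class and prints a few entries. `DecayTableSketch` and `buildTable` are hypothetical names; the array size (120) and the multiplier (0.9) are taken from the code above, and the loops are copied from it.

```java
final class DecayTableSketch {
    /** Rebuilds the per-day decay table with the same three loops as ScoreUtils. */
    static float[] buildTable(int size, float demoteBoost) {
        float[] table = new float[size];
        table[0] = 1;
        // Days 1 to 6: one decay step per day.
        for (int i = 1; i < Math.min(7, size); i++) {
            table[i] = table[i - 1] * demoteBoost;
        }
        // Days 7 to 30: one decay step per week.
        for (int i = 7; i < Math.min(31, size); i++) {
            table[i] = table[i / 7 * 7 - 1] * demoteBoost;
        }
        // Day 31 onward: one decay step per 31 days.
        for (int i = 31; i < size; i++) {
            table[i] = table[i / 31 * 31 - 1] * demoteBoost;
        }
        return table;
    }

    public static void main(String[] args) {
        float[] table = buildTable(120, 0.9f);
        int[] sampleDays = {0, 1, 6, 7, 13, 14, 30, 31, 119};
        for (int day : sampleDays) {
            System.out.println("day " + day + " -> factor " + table[day]);
        }
    }
}
```

With these parameters the factor drops to roughly 0.53 after six days, stays flat within each week after that (day 7 and day 13 share the same value, about 0.48), and ends at roughly 0.25 for anything 93 days or older.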
    package com.yida.framework.lucene5.function;

    import java.io.IOException;
    import java.nio.file.Paths;
    import java.text.DateFormat;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.LongField;
    import org.apache.lucene.document.NumericDocValuesField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queries.CustomScoreQuery;
    import org.apache.lucene.queries.function.FunctionQuery;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    /**
     * FunctionQuery test
     */
    public class FunctionQueryTest {
        // Note the uppercase MM (month); lowercase mm would mean minutes
        private static final DateFormat formate = new SimpleDateFormat("yyyy-MM-dd");

        public static void main(String[] args) throws Exception {
            String indexDir = "c:/lucenedir-functionquery";
            Directory directory = FSDirectory.open(Paths.get(indexDir));
            //System.out.println(0.001953125f * 100000000 * 0.001953125f / 100000000);
            // Create the test index [note: it only needs to be created once; comment this line out before the second run]
            //createIndex(directory);
            IndexReader reader = DirectoryReader.open(directory);
            IndexSearcher searcher = new IndexSearcher(reader);
            // Create an ordinary TermQuery
            TermQuery termQuery = new TermQuery(new Term("title", "solr"));
            // Create a FunctionQuery from the custom ValueSource that computes the date decay factor
            FunctionQuery functionQuery = new FunctionQuery(new DateDampingValueSouce("publishdate"));
            // Custom score query [CustomScoreQuery combines the ordinary Query with the FunctionQuery;
            // how the two scores are combined into the final score is up to you, by overriding the scoring]
            // The default implementation multiplies the ordinary query score by the FunctionQuery score; you can override that default
            CustomScoreQuery customScoreQuery = new CustomScoreQuery(termQuery, functionQuery);
            // Create the sort [by score, descending]
            Sort sort = new Sort(new SortField[] {SortField.FIELD_SCORE});
            TopDocs topDocs = searcher.search(customScoreQuery, null, Integer.MAX_VALUE, sort, true, false);
            ScoreDoc[] docs = topDocs.scoreDocs;
            for (ScoreDoc scoreDoc : docs) {
                int docID = scoreDoc.doc;
                Document document = searcher.doc(docID);
                String title = document.get("title");
                String publishDateString = document.get("publishdate");
                System.out.println(publishDateString);
                long publishMills = Long.valueOf(publishDateString);
                Date date = new Date(publishMills);
                publishDateString = formate.format(date);
                float score = scoreDoc.score;
                System.out.println(docID + " " + title + " " +
                        publishDateString + " " + score);
            }
            reader.close();
            directory.close();
        }
        /**
         * Create a Document object
         * @param title book title
         * @param publishDateString book publication date
         * @return
         * @throws ParseException
         */
        public static Document createDocument(String title, String publishDateString) throws ParseException {
            Date publishDate = formate.parse(publishDateString);
            Document doc = new Document();
            doc.add(new TextField("title", title, Field.Store.YES));
            // Stored copy of the date, used only for display
            doc.add(new LongField("publishdate", publishDate.getTime(), Store.YES));
            // DocValues copy of the date, read by DateDampingValueSouce
            doc.add(new NumericDocValuesField("publishdate", publishDate.getTime()));
            return doc;
        }

        // Create the test index
        public static void createIndex(Directory directory) throws ParseException, IOException {
            Analyzer analyzer = new StandardAnalyzer();
            IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
            indexWriterConfig.setOpenMode(OpenMode.CREATE_OR_APPEND);
            IndexWriter writer = new IndexWriter(directory, indexWriterConfig);
            // Create the test documents
            Document doc1 = createDocument("lucene in action 2th edition", "2010-05-05");
            Document doc2 = createDocument("lucene progamming", "2008-07-11");
            Document doc3 = createDocument("lucene user guide", "2014-11-24");
            Document doc4 = createDocument("lucene5 cookbook", "2015-01-09");
            Document doc5 = createDocument("apache lucene api 5.0.0", "2015-02-25");
            Document doc6 = createDocument("apache solr 4 cookbook", "2013-10-22");
            Document doc7 = createDocument("administrating solr", "2015-01-20");
            Document doc8 = createDocument("apache solr essentials", "2013-08-16");
            Document doc9 = createDocument("apache solr high performance", "2014-06-28");
            Document doc10 = createDocument("apache solr api 5.0.0", "2015-03-02");
            writer.addDocument(doc1);
            writer.addDocument(doc2);
            writer.addDocument(doc3);
            writer.addDocument(doc4);
            writer.addDocument(doc5);
            writer.addDocument(doc6);
            writer.addDocument(doc7);
            writer.addDocument(doc8);
            writer.addDocument(doc9);
            writer.addDocument(doc10);
            writer.close();
        }
    }
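The effect of multiplying the two scores can be seen in a small standalone sketch. It does not call Lucene: `Hit`, `combine`, and `rank` are hypothetical names, the relevance scores and date factors are made-up illustration values, and `combine` only mirrors the default product of the ordinary query score and the FunctionQuery score described in the code comments above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CombinedScoreSketch {
    /** One search hit with the two partial scores that get combined. */
    static final class Hit {
        final String title;
        final float relevance;   // score of the ordinary query (illustrative)
        final float dateFactor;  // score of the function query (illustrative)

        Hit(String title, float relevance, float dateFactor) {
            this.title = title;
            this.relevance = relevance;
            this.dateFactor = dateFactor;
        }

        float combined() {
            return combine(relevance, dateFactor);
        }
    }

    /** Default combination assumed here: the product of both scores. */
    static float combine(float subQueryScore, float valSrcScore) {
        return subQueryScore * valSrcScore;
    }

    /** Returns a copy of the hits sorted by combined score, highest first. */
    static List<Hit> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.combined()).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("old, best match", 1.0f, 0.25f),
                new Hit("new, weaker match", 0.5f, 1.0f),
                new Hit("in between", 0.75f, 0.5f));
        for (Hit hit : rank(hits)) {
            System.out.println(hit.title + " -> " + hit.combined());
        }
    }
}
```

With a pure product, a strong date factor can outweigh a better text match, which is exactly the "newer books first" behaviour the example is after. If that is too aggressive for your data, a different `combine` (for instance a weighted sum) is the kind of change that overriding the default scoring of CustomScoreQuery is meant for.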
The output of the test run is shown in the figure:
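If you need the demo code, you can download it from the attachment at the very bottom. OK, that's a wrap!
If you still have questions, add me on QQ: 7-3-6-0-3-1-3-0-5, or join the QQ group so we can learn together!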
Reposted from: http://iamyida.iteye.com/blog/2201291