主頁 > 知識(shí)庫 > java Lucene 中自定義排序的實(shí)現(xiàn)

java Lucene 中自定義排序的實(shí)現(xiàn)

熱門標(biāo)簽:開封便宜外呼系統(tǒng)報(bào)價(jià) 黃石智能營(yíng)銷電銷機(jī)器人效果 怎樣把地圖標(biāo)注出來 淮南騰訊地圖標(biāo)注 騰訊地圖標(biāo)注商戶改名注冊(cè)入駐 電話機(jī)器人的特色和創(chuàng)新 漯河辦理400電話 地圖標(biāo)注人員兼職 商丘百應(yīng)電話機(jī)器人有沒有效果
Lucene中的自定義排序功能和Java集合中的自定義排序的實(shí)現(xiàn)方法差不多,都要實(shí)現(xiàn)一下比較接口. 在Java中只要實(shí)現(xiàn)Comparable接口就可以了.但是在Lucene中要實(shí)現(xiàn)SortComparatorSource接口和ScoreDocComparator接口.在了解具體實(shí)現(xiàn)方法之前先來看看這兩個(gè)接口的定義吧.
SortComparatorSource接口的功能是返回一個(gè)用來排序ScoreDocs的comparator(Expert: returns a comparator for sorting ScoreDocs).該接口只定義了一個(gè)方法.如下:
Java代碼
/**
* Creates a comparator for the field in the given index.
* @param reader - Index to create comparator for.
* @param fieldname - Field to create comparator for.
* @return Comparator of ScoreDoc objects.
* @throws IOException - If an error occurs reading the index.
*/
public ScoreDocComparator newComparator(IndexReader reader,String fieldname) throws IOException
view plaincopy to clipboardprint?
/**
* Creates a comparator for the field in the given index.
* @param reader - Index to create comparator for.
* @param fieldname - Field to create comparator for.
* @return Comparator of ScoreDoc objects.
* @throws IOException - If an error occurs reading the index.
*/
public ScoreDocComparator newComparator(IndexReader reader,String fieldname) throws IOException
/**
* Creates a comparator for the field in the given index.
* @param reader - Index to create comparator for.
* @param fieldname - Field to create comparator for.
* @return Comparator of ScoreDoc objects.
* @throws IOException - If an error occurs reading the index.
*/
public ScoreDocComparator newComparator(IndexReader reader,String fieldname) throws IOException
該方法只是創(chuàng)造一個(gè)ScoreDocComparator 實(shí)例用來實(shí)現(xiàn)排序.所以我們還要實(shí)現(xiàn)ScoreDocComparator 接口.來看看ScoreDocComparator 接口.功能是比較來兩個(gè)ScoreDoc 對(duì)象來排序(Compares two ScoreDoc objects for sorting) 里面定義了兩個(gè)Lucene實(shí)現(xiàn)的靜態(tài)實(shí)例.如下:
Java代碼
//Special comparator for sorting hits according to computed relevance (document score).
public static final ScoreDocComparator RELEVANCE;
//Special comparator for sorting hits according to index order (document number).
public static final ScoreDocComparator INDEXORDER;
view plaincopy to clipboardprint?
//Special comparator for sorting hits according to computed relevance (document score).
public static final ScoreDocComparator RELEVANCE;
//Special comparator for sorting hits according to index order (document number).
public static final ScoreDocComparator INDEXORDER;
//Special comparator for sorting hits according to computed relevance (document score).
public static final ScoreDocComparator RELEVANCE;

//Special comparator for sorting hits according to index order (document number).
public static final ScoreDocComparator INDEXORDER;
有3個(gè)方法與排序相關(guān),需要我們實(shí)現(xiàn) 分別如下:
Java代碼
/**
* Compares two ScoreDoc objects and returns a result indicating their sort order.
* @param i First ScoreDoc
* @param j Second ScoreDoc
* @return -1 if i should come before j;
* 1 if i should come after j;
* 0 if they are equal
*/
public int compare(ScoreDoc i,ScoreDoc j);
/**
* Returns the value used to sort the given document. The object returned must implement the java.io.Serializable interface. This is used by multisearchers to determine how to collate results from their searchers.
* @param i Document
* @return Serializable object
*/
public Comparable sortValue(ScoreDoc i);
/**
* Returns the type of sort. Should return SortField.SCORE, SortField.DOC, SortField.STRING, SortField.INTEGER, SortField.FLOAT or SortField.CUSTOM. It is not valid to return SortField.AUTO. This is used by multisearchers to determine how to collate results from their searchers.
* @return One of the constants in SortField.
*/
public int sortType();
view plaincopy to clipboardprint?
/**
* Compares two ScoreDoc objects and returns a result indicating their sort order.
* @param i First ScoreDoc
* @param j Second ScoreDoc
* @return -1 if i should come before j;
* 1 if i should come after j;
* 0 if they are equal
*/
public int compare(ScoreDoc i,ScoreDoc j);
/**
* Returns the value used to sort the given document. The object returned must implement the java.io.Serializable interface. This is used by multisearchers to determine how to collate results from their searchers.
* @param i Document
* @return Serializable object
*/
public Comparable sortValue(ScoreDoc i);
/**
* Returns the type of sort. Should return SortField.SCORE, SortField.DOC, SortField.STRING, SortField.INTEGER, SortField.FLOAT or SortField.CUSTOM. It is not valid to return SortField.AUTO. This is used by multisearchers to determine how to collate results from their searchers.
* @return One of the constants in SortField.
*/
public int sortType();
/**
     * Compares two ScoreDoc objects and returns a result indicating their sort order.
     * @param i First ScoreDoc
     * @param j Second ScoreDoc
     * @return -1 if i should come before j;
     * 1 if i should come after j;
     * 0 if they are equal
     */
    public int compare(ScoreDoc i,ScoreDoc j);
    /**
     * Returns the value used to sort the given document. The object returned must implement the java.io.Serializable interface. This is used by multisearchers to determine how to collate results from their searchers.
     * @param i Document
     * @return Serializable object
     */
    public Comparable sortValue(ScoreDoc i);
    /**
     * Returns the type of sort. Should return SortField.SCORE, SortField.DOC, SortField.STRING, SortField.INTEGER, SortField.FLOAT or SortField.CUSTOM. It is not valid to return SortField.AUTO. This is used by multisearchers to determine how to collate results from their searchers.
     * @return One of the constants in SortField.
     */
    public int sortType();
看個(gè)例子吧!
該例子為L(zhǎng)ucene in Action中的一個(gè)實(shí)現(xiàn),用來搜索距你最近的餐館的名字. 餐館坐標(biāo)用字符串"x,y"來存儲(chǔ).
Java代碼
package com.nikee.lucene;
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.ScoreDocComparator;
import org.apache.lucene.search.SortComparatorSource;
import org.apache.lucene.search.SortField;
//實(shí)現(xiàn)了搜索距你最近的餐館的名字. 餐館坐標(biāo)用字符串"x,y"來存儲(chǔ)
//DistanceComparatorSource 實(shí)現(xiàn)了SortComparatorSource接口
public class DistanceComparatorSource implements SortComparatorSource {
private static final long serialVersionUID = 1L;
// x y 用來保存 坐標(biāo)位置
private int x;
private int y;
public DistanceComparatorSource(int x, int y) {
this.x = x;
this.y = y;
}
// 返回ScoreDocComparator 用來實(shí)現(xiàn)排序功能
public ScoreDocComparator newComparator(IndexReader reader, String fieldname) throws IOException {
return new DistanceScoreDocLookupComparator(reader, fieldname, x, y);
}
//DistanceScoreDocLookupComparator 實(shí)現(xiàn)了ScoreDocComparator 用來排序
private static class DistanceScoreDocLookupComparator implements ScoreDocComparator {
private float[] distances; // 保存每個(gè)餐館到指定點(diǎn)的距離
// 構(gòu)造函數(shù) , 構(gòu)造函數(shù)在這里幾乎完成所有的準(zhǔn)備工作.
public DistanceScoreDocLookupComparator(IndexReader reader, String fieldname, int x, int y) throws IOException {
System.out.println("fieldName2="+fieldname);
final TermEnum enumerator = reader.terms(new Term(fieldname, ""));
System.out.println("maxDoc="+reader.maxDoc());
distances = new float[reader.maxDoc()]; // 初始化distances
if (distances.length > 0) {
TermDocs termDocs = reader.termDocs();
try {
if (enumerator.term() == null) {
throw new RuntimeException("no terms in field " + fieldname);
}
int i = 0,j = 0;
do {
System.out.println("in do-while :" + i ++);
Term term = enumerator.term(); // 取出每一個(gè)Term
if (term.field() != fieldname) // 與給定的域不符合則比較下一個(gè)
break;
//Sets this to the data for the current term in a TermEnum.
//This may be optimized in some implementations.
termDocs.seek(enumerator); //參考TermDocs Doc
while (termDocs.next()) {
System.out.println(" in while :" + j ++);
System.out.println(" in while ,Term :" + term.toString());
String[] xy = term.text().split(","); // 去處x y
int deltax = Integer.parseInt(xy[0]) - x;
int deltay = Integer.parseInt(xy[1]) - y;
// 計(jì)算距離
distances[termDocs.doc()] = (float) Math.sqrt(deltax * deltax + deltay * deltay);
}
}
while (enumerator.next());
} finally {
termDocs.close();
}
}
}
//有上面的構(gòu)造函數(shù)的準(zhǔn)備 這里就比較簡(jiǎn)單了
public int compare(ScoreDoc i, ScoreDoc j) {
if (distances[i.doc] distances[j.doc])
return -1;
if (distances[i.doc] > distances[j.doc])
return 1;
return 0;
}
// 返回距離
public Comparable sortValue(ScoreDoc i) {
return new Float(distances[i.doc]);
}
//指定SortType
public int sortType() {
return SortField.FLOAT;
}
}
public String toString() {
return "Distance from (" + x + "," + y + ")";
}
}
view plaincopy to clipboardprint?
package com.nikee.lucene;
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.ScoreDocComparator;
import org.apache.lucene.search.SortComparatorSource;
import org.apache.lucene.search.SortField;
//實(shí)現(xiàn)了搜索距你最近的餐館的名字. 餐館坐標(biāo)用字符串"x,y"來存儲(chǔ)
//DistanceComparatorSource 實(shí)現(xiàn)了SortComparatorSource接口
public class DistanceComparatorSource implements SortComparatorSource {
private static final long serialVersionUID = 1L;
// x y 用來保存 坐標(biāo)位置
private int x;
private int y;
public DistanceComparatorSource(int x, int y) {
this.x = x;
this.y = y;
}
// 返回ScoreDocComparator 用來實(shí)現(xiàn)排序功能
public ScoreDocComparator newComparator(IndexReader reader, String fieldname) throws IOException {
return new DistanceScoreDocLookupComparator(reader, fieldname, x, y);
}
//DistanceScoreDocLookupComparator 實(shí)現(xiàn)了ScoreDocComparator 用來排序
private static class DistanceScoreDocLookupComparator implements ScoreDocComparator {
private float[] distances; // 保存每個(gè)餐館到指定點(diǎn)的距離
// 構(gòu)造函數(shù) , 構(gòu)造函數(shù)在這里幾乎完成所有的準(zhǔn)備工作.
public DistanceScoreDocLookupComparator(IndexReader reader, String fieldname, int x, int y) throws IOException {
System.out.println("fieldName2="+fieldname);
final TermEnum enumerator = reader.terms(new Term(fieldname, ""));
System.out.println("maxDoc="+reader.maxDoc());
distances = new float[reader.maxDoc()]; // 初始化distances
if (distances.length > 0) {
TermDocs termDocs = reader.termDocs();
try {
if (enumerator.term() == null) {
throw new RuntimeException("no terms in field " + fieldname);
}
int i = 0,j = 0;
do {
System.out.println("in do-while :" + i ++);
Term term = enumerator.term(); // 取出每一個(gè)Term
if (term.field() != fieldname) // 與給定的域不符合則比較下一個(gè)
break;
//Sets this to the data for the current term in a TermEnum.
//This may be optimized in some implementations.
termDocs.seek(enumerator); //參考TermDocs Doc
while (termDocs.next()) {
System.out.println(" in while :" + j ++);
System.out.println(" in while ,Term :" + term.toString());
String[] xy = term.text().split(","); // 去處x y
int deltax = Integer.parseInt(xy[0]) - x;
int deltay = Integer.parseInt(xy[1]) - y;
// 計(jì)算距離
distances[termDocs.doc()] = (float) Math.sqrt(deltax * deltax + deltay * deltay);
}
}
while (enumerator.next());
} finally {
termDocs.close();
}
}
}
//有上面的構(gòu)造函數(shù)的準(zhǔn)備 這里就比較簡(jiǎn)單了
public int compare(ScoreDoc i, ScoreDoc j) {
if (distances[i.doc] distances[j.doc])
return -1;
if (distances[i.doc] > distances[j.doc])
return 1;
return 0;
}
// 返回距離
public Comparable sortValue(ScoreDoc i) {
return new Float(distances[i.doc]);
}
//指定SortType
public int sortType() {
return SortField.FLOAT;
}
}
public String toString() {
return "Distance from (" + x + "," + y + ")";
}
}
package com.nikee.lucene;
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.ScoreDocComparator;
import org.apache.lucene.search.SortComparatorSource;
import org.apache.lucene.search.SortField;
//實(shí)現(xiàn)了搜索距你最近的餐館的名字. 餐館坐標(biāo)用字符串"x,y"來存儲(chǔ)
//DistanceComparatorSource 實(shí)現(xiàn)了SortComparatorSource接口
public class DistanceComparatorSource implements SortComparatorSource {
    private static final long serialVersionUID = 1L;

    // x y 用來保存 坐標(biāo)位置
    private int x;
    private int y;

    public DistanceComparatorSource(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // 返回ScoreDocComparator 用來實(shí)現(xiàn)排序功能
    public ScoreDocComparator newComparator(IndexReader reader, String fieldname) throws IOException {
        return new DistanceScoreDocLookupComparator(reader, fieldname, x, y);
    }

    //DistanceScoreDocLookupComparator 實(shí)現(xiàn)了ScoreDocComparator 用來排序
    private static class DistanceScoreDocLookupComparator implements ScoreDocComparator {
        private float[] distances; // 保存每個(gè)餐館到指定點(diǎn)的距離

        // 構(gòu)造函數(shù) , 構(gòu)造函數(shù)在這里幾乎完成所有的準(zhǔn)備工作.
        public DistanceScoreDocLookupComparator(IndexReader reader, String fieldname, int x, int y) throws IOException {
            System.out.println("fieldName2="+fieldname);
            final TermEnum enumerator = reader.terms(new Term(fieldname, ""));

            System.out.println("maxDoc="+reader.maxDoc());
            distances = new float[reader.maxDoc()]; // 初始化distances
            if (distances.length > 0) {
                TermDocs termDocs = reader.termDocs();
                try {
                    if (enumerator.term() == null) {
                        throw new RuntimeException("no terms in field " + fieldname);
                    }
                    int i = 0,j = 0;
                    do {
                        System.out.println("in do-while :" + i ++);
                        Term term = enumerator.term(); // 取出每一個(gè)Term
                        if (term.field() != fieldname) // 與給定的域不符合則比較下一個(gè)
                            break;

                        //Sets this to the data for the current term in a TermEnum.
                        //This may be optimized in some implementations.
                        termDocs.seek(enumerator); //參考TermDocs Doc
                        while (termDocs.next()) {
                            System.out.println(" in while :" + j ++);
                            System.out.println(" in while ,Term :" + term.toString());

                            String[] xy = term.text().split(","); // 去處x y
                            int deltax = Integer.parseInt(xy[0]) - x;
                            int deltay = Integer.parseInt(xy[1]) - y;
                            // 計(jì)算距離
                            distances[termDocs.doc()] = (float) Math.sqrt(deltax * deltax + deltay * deltay);
                        }
                    }
                    while (enumerator.next());
                } finally {
                    termDocs.close();
                }
            }
        }
        //有上面的構(gòu)造函數(shù)的準(zhǔn)備 這里就比較簡(jiǎn)單了
        public int compare(ScoreDoc i, ScoreDoc j) {
            if (distances[i.doc] distances[j.doc])
                return -1;
            if (distances[i.doc] > distances[j.doc])
                return 1;
            return 0;
        }

        // 返回距離
        public Comparable sortValue(ScoreDoc i) {
            return new Float(distances[i.doc]);
        }

        //指定SortType
        public int sortType() {
            return SortField.FLOAT;
        }
    }

    public String toString() {
        return "Distance from (" + x + "," + y + ")";
    }
}
這是一個(gè)實(shí)現(xiàn)了上面兩個(gè)接口的兩個(gè)類, 里面帶有詳細(xì)注釋, 可以看出 自定義排序并不是很難的. 該實(shí)現(xiàn)能否正確實(shí)現(xiàn),我們來看看測(cè)試代碼能否通過吧.
Java代碼
package com.nikee.lucene.test;
import java.io.IOException;
import junit.framework.TestCase;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FieldDoc;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopFieldDocs;
import org.apache.lucene.store.RAMDirectory;
import com.nikee.lucene.DistanceComparatorSource;
public class DistanceComparatorSourceTest extends TestCase {
private RAMDirectory directory;
private IndexSearcher searcher;
private Query query;
//建立測(cè)試環(huán)境
protected void setUp() throws Exception {
directory = new RAMDirectory();
IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), true);
addPoint(writer, "El Charro", "restaurant", 1, 2);
addPoint(writer, "Cafe Poca Cosa", "restaurant", 5, 9);
addPoint(writer, "Los Betos", "restaurant", 9, 6);
addPoint(writer, "Nico's Taco Shop", "restaurant", 3, 8);
writer.close();
searcher = new IndexSearcher(directory);
query = new TermQuery(new Term("type", "restaurant"));
}
private void addPoint(IndexWriter writer, String name, String type, int x, int y) throws IOException {
Document doc = new Document();
doc.add(new Field("name", name, Field.Store.YES, Field.Index.TOKENIZED));
doc.add(new Field("type", type, Field.Store.YES, Field.Index.TOKENIZED));
doc.add(new Field("location", x + "," + y, Field.Store.YES, Field.Index.UN_TOKENIZED));
writer.addDocument(doc);
}
public void testNearestRestaurantToHome() throws Exception {
//使用DistanceComparatorSource來構(gòu)造一個(gè)SortField
Sort sort = new Sort(new SortField("location", new DistanceComparatorSource(0, 0)));
Hits hits = searcher.search(query, sort); // 搜索
//測(cè)試
assertEquals("closest", "El Charro", hits.doc(0).get("name"));
assertEquals("furthest", "Los Betos", hits.doc(3).get("name"));
}
public void testNeareastRestaurantToWork() throws Exception {
Sort sort = new Sort(new SortField("location", new DistanceComparatorSource(10, 10))); // 工作的坐標(biāo) 10,10
//上面的測(cè)試實(shí)現(xiàn)了自定義排序,但是并不能訪問自定義排序的更詳細(xì)信息,利用
//TopFieldDocs 可以進(jìn)一步訪問相關(guān)信息
TopFieldDocs docs = searcher.search(query, null, 3, sort);
assertEquals(4, docs.totalHits);
assertEquals(3, docs.scoreDocs.length);
//取得FieldDoc 利用FieldDoc可以取得關(guān)于排序的更詳細(xì)信息 請(qǐng)查看FieldDoc Doc
FieldDoc fieldDoc = (FieldDoc) docs.scoreDocs[0];
assertEquals("(10,10) -> (9,6) = sqrt(17)", new Float(Math.sqrt(17)), fieldDoc.fields[0]);
Document document = searcher.doc(fieldDoc.doc);
assertEquals("Los Betos", document.get("name"));
dumpDocs(sort, docs); // 顯示相關(guān)信息
}
// 顯示有關(guān)排序的信息
private void dumpDocs(Sort sort, TopFieldDocs docs) throws IOException {
System.out.println("Sorted by: " + sort);
ScoreDoc[] scoreDocs = docs.scoreDocs;
for (int i = 0; i scoreDocs.length; i++) {
FieldDoc fieldDoc = (FieldDoc) scoreDocs[i];
Float distance = (Float) fieldDoc.fields[0];
Document doc = searcher.doc(fieldDoc.doc);
System.out.println(" " + doc.get("name") + " @ (" + doc.get("location") + ") -> " + distance);
}
}
}
您可能感興趣的文章:
  • java的arraylist排序示例(arraylist用法)
  • java ArrayList集合中的某個(gè)對(duì)象屬性進(jìn)行排序的實(shí)現(xiàn)代碼
  • Java ArrayList的不同排序方法
  • java教程之二個(gè)arraylist排序的示例分享
  • java對(duì)ArrayList排序代碼示例
  • java實(shí)現(xiàn)ArrayList根據(jù)存儲(chǔ)對(duì)象排序功能示例
  • java中實(shí)現(xiàn)Comparable接口實(shí)現(xiàn)自定義排序的示例
  • Java Collections.sort()實(shí)現(xiàn)List排序的默認(rèn)方法和自定義方法
  • Java針對(duì)ArrayList自定義排序的2種實(shí)現(xiàn)方法

標(biāo)簽:馬鞍山 紅河 鄭州 岳陽 亳州 大興安嶺 拉薩 武威

巨人網(wǎng)絡(luò)通訊聲明:本文標(biāo)題《java Lucene 中自定義排序的實(shí)現(xiàn)》,本文關(guān)鍵詞  java,Lucene,中,自定義,排序,;如發(fā)現(xiàn)本文內(nèi)容存在版權(quán)問題,煩請(qǐng)?zhí)峁┫嚓P(guān)信息告之我們,我們將及時(shí)溝通與處理。本站內(nèi)容系統(tǒng)采集于網(wǎng)絡(luò),涉及言論、版權(quán)與本站無關(guān)。
  • 相關(guān)文章
  • 下面列出與本文章《java Lucene 中自定義排序的實(shí)現(xiàn)》相關(guān)的同類信息!
  • 本頁收集關(guān)于java Lucene 中自定義排序的實(shí)現(xiàn)的相關(guān)信息資訊供網(wǎng)民參考!
  • 推薦文章