Full-Text Search with Apache Lucene in Java

Latest version at the time of writing: Apache Lucene 4.7.0

What is Lucene?

Apache Lucene is a high-performance, full-featured text search engine library written in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Why use Lucene:

1) It supports scalable, high-performance indexing
2) It has efficient search algorithms
3) It is open source
4) It is written in 100% pure Java

How Lucene Works:

[Figure: Lucene flow diagram]

Installing Lucene:

Download the latest version of Lucene from http://lucene.apache.org/core/downloads.html

Direct Link – http://mirror.tcpdiag.net/apache/lucene/java/4.7.0/lucene-4.7.0.zip

Unzip it and browse through the folder. You will find lucene-core-4.7.0.jar under lucene-4.7.0\core; this is the main JAR file. The other folders contain supporting JARs.

Add the required JARs to your project's build path.

Core indexing classes:

  • IndexWriter
  • Directory
  • Analyzer
  • Document
  • Field

IndexWriter creates and updates the index. You add Documents to the index through an IndexWriter.

Directory is the location where the index will be stored.

A Document is the atomic unit of indexing and searching. A Document contains Fields.

A Field has a name and a value. You can search a specific field with the syntax name:term, e.g., title:java. Different documents can have different fields.
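To make the document/field model concrete, here is a minimal plain-Java sketch. It does not use Lucene's Document or Field classes; the class ToyFieldSearch, its methods, and the sample data are all hypothetical and exist only to show what a name:term lookup means when different documents carry different fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model for illustration only; these are NOT Lucene's Document and Field classes.
class ToyFieldSearch {

    // Builds a "document" as a map of field name -> field value from name/value pairs.
    static Map<String, String> doc(String... nameValuePairs) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            fields.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return fields;
    }

    // For a query written as "name:term", returns the documents whose field <name>
    // contains <term> as a whole word, ignoring case.
    static List<Map<String, String>> search(List<Map<String, String>> docs, String query) {
        int colon = query.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected name:term, got: " + query);
        }
        String name = query.substring(0, colon);
        String term = query.substring(colon + 1).toLowerCase(Locale.ROOT);
        List<Map<String, String>> hits = new ArrayList<Map<String, String>>();
        for (Map<String, String> d : docs) {
            String value = d.get(name);
            if (value == null) {
                continue; // this document simply has no such field
            }
            for (String word : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (word.equals(term)) {
                    hits.add(d);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<Map<String, String>>();
        docs.add(doc("title", "Learning Java", "isbn", "0001")); // hypothetical data
        docs.add(doc("title", "Cooking Basics"));                 // no isbn field
        docs.add(doc("author", "Java Smith"));                    // no title field
        System.out.println(search(docs, "title:java"));
    }
}
```

Running main prints only the first document: the third one also contains the word "Java", but in its author field, so title:java does not match it.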

Analyzers process the text of a document. They break the text up into indexed tokens (terms) and can optionally perform other operations on these tokens, e.g. downcasing, synonym insertion, filtering out unwanted tokens, etc.
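The kind of processing an analyzer performs can be sketched in a few lines of plain Java. This is not Lucene's Analyzer API: the class ToyAnalyzer, the analyze method, and the three-word stop list are assumptions made for this illustration. It only shows tokenizing, downcasing, and filtering out unwanted tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analyzer for illustration only; NOT Lucene's Analyzer class.
class ToyAnalyzer {

    // Assumed stop list for this sketch; a real analyzer ships its own.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the"));

    // Splits text on anything that is not a letter or digit, lowercases each
    // token, and drops stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() yields an empty first piece if text starts with a separator
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [art, of, lucene, introduction]
        System.out.println(analyze("The Art of Lucene: an Introduction"));
    }
}
```

Synonym insertion and other steps are left out to keep the sketch short.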

Core searching classes:

  • IndexSearcher
  • Query
  • QueryParser
  • TopDocs
  • ScoreDoc

QueryParser is constructed with an analyzer so that it interprets your query text in the same way the documents were interpreted: finding word boundaries, downcasing, and removing common stop words like ‘a’, ‘an’ and ‘the’.

The Query object holds the result produced by the QueryParser and is passed to the searcher. Note that it’s also possible to programmatically construct a rich Query object without using the query parser.
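Why the query text should go through the same analysis as the indexed text can be seen in a small plain-Java sketch. This is a toy inverted index, not Lucene's QueryParser or IndexSearcher; the class ToyIndex and its methods are hypothetical. The two "Spring" titles are borrowed from the sample output later in this article, and the third title is made up. Matching is a simple "any term matches" with no scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index for illustration only; NOT Lucene's IndexSearcher or QueryParser.
class ToyIndex {
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the"));

    private final List<String> titles = new ArrayList<String>();
    private final Map<String, Set<Integer>> postings = new HashMap<String, Set<Integer>>();

    // Shared analysis step: split on non-alphanumerics, lowercase, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<String>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String term = raw.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Indexing: record, for every analyzed term, which titles contain it.
    void add(String title) {
        int docId = titles.size();
        titles.add(title);
        for (String term : analyze(title)) {
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
    }

    // "Parsed" search: the query text is analyzed exactly like the indexed text.
    List<String> search(String queryText) {
        return searchTerms(analyze(queryText));
    }

    // "Programmatic" search: terms are looked up as given, with no analysis.
    // Returns every title containing at least one of the terms, in insertion order.
    List<String> searchTerms(List<String> terms) {
        Set<Integer> ids = new TreeSet<Integer>();
        for (String term : terms) {
            Set<Integer> match = postings.get(term);
            if (match != null) {
                ids.addAll(match);
            }
        }
        List<String> hits = new ArrayList<String>();
        for (int id : ids) {
            hits.add(titles.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("Spring in Action");
        index.add("Creating Spring WebServices");
        index.add("Learning Java"); // hypothetical third title
        // prints [Spring in Action, Creating Spring WebServices]
        System.out.println(index.search("The SPRING"));
        // prints [] because the index only holds the lowercase term "spring"
        System.out.println(index.searchTerms(Arrays.asList("Spring")));
    }
}
```

The second call illustrates the trade-off of building a query by hand: nothing normalizes the terms for you, so they have to match the form in which they were indexed.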

Example:

System Requirements:

1) Eclipse
2) JDK 1.6.x
3) Lucene 4.7

[Figure: project directory listing]

LuceneExample.java

Output of the above program:

Found 2 hits.
1. 20493988179    Spring in Action
2. 3852735833535    Creating Spring WebServices

Gopal Das
