Querying entities and full-text search

Overview

The search capabilities that Trails provides depend on which modules you are using. If you plug in your own persistence layer, we recommend extending it with operations that take advantage of its capabilities. With trails-hibernate, you have the full power of the Hibernate Criteria API at your disposal. Since nearly all persistence operations go through the Trails persistence service, you can also modify an operation on the fly; some of the trails-security features, for example, are built using that technique.

In contrast to Ruby on Rails or Groovy, which add search operations to your entities on the fly, Trails is a domain-driven and highly object-oriented framework and implements its default search capabilities with "example" objects: you choose the type of an entity and fill in values to create your search criteria. Take a look at the examples to see this in action.

Using the Hibernate Criteria API you can create conventional relational database queries in a more object-oriented manner. Sometimes that is not enough and you need either more expressiveness or more performance, in which case you can use HQL (Hibernate Query Language) or even traditional SQL queries, bypassing the persistence layer. However, for some types of applications, such as web search engines, traditional relational queries just don't offer enough flexibility.
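To make the "example object" idea more concrete, here is a minimal sketch in plain Java. It is not how Trails or Hibernate implement it (they translate the example into a database query); it only shows the principle that every filled-in field of the example becomes a restriction, while a null field means "anything". The names ExampleBook and ExampleMatcher are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical entity, used only for this illustration.
class ExampleBook {
    final String title;
    final String author;
    final Integer year;

    ExampleBook(String title, String author, Integer year) {
        this.title = title;
        this.author = author;
        this.year = year;
    }

    @Override
    public String toString() {
        return title + " (" + author + ", " + year + ")";
    }
}

// Hypothetical helper: filters an in-memory list by an example object.
class ExampleMatcher {

    // Returns the candidates that agree with every non-null field of the example.
    static <T> List<T> findByExample(T example, List<T> candidates) {
        List<T> result = new ArrayList<>();
        for (T candidate : candidates) {
            if (matches(example, candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    private static boolean matches(Object example, Object candidate) {
        try {
            for (Field field : example.getClass().getDeclaredFields()) {
                if (field.isSynthetic()) {
                    continue; // skip compiler- or tool-generated fields
                }
                field.setAccessible(true);
                Object wanted = field.get(example);
                // A null field in the example means "no restriction".
                if (wanted != null && !wanted.equals(field.get(candidate))) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<ExampleBook> books = Arrays.asList(
                new ExampleBook("Dune", "Herbert", 1965),
                new ExampleBook("Emma", "Austen", 1815),
                new ExampleBook("Persuasion", "Austen", 1817));
        // Only the author is filled in, so only the author restricts the search.
        ExampleBook example = new ExampleBook(null, "Austen", null);
        System.out.println(findByExample(example, books));
    }
}
```

The sketch compares fields with equals() only; anything beyond exact matches (ranges, wildcards, joins) is where the Criteria API, HQL or a search engine comes in.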

Full-text search means the ability to find partial string matches within a collection of strings. In today's applications, full-text search is often a "must have" requirement. Users want their applications to provide quick and relevant search results the same way Google does, and they expect a single interface for querying the application's whole business domain model.

Full-text search using your database alone has flexibility and performance limits. When dealing with a small number of elements, the database can scan the contents of the rows directly for each query. Once the number of elements to search grows too large, or you constantly have to handle a high number of search queries, it is time to employ an Information Retrieval (IR) tool and harness the power of a dedicated solution for full-text searching. A dedicated search library maintains indexes that allow text queries to be performed faster than a relational database can execute them. In the Java world, Lucene is by far the most widely used implementation.
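The reason an index beats scanning rows can be sketched with a toy inverted index: each term points directly at the documents that contain it, so a query only touches the entries for its own terms. This is a simplified illustration in plain Java, not Lucene's actual data structure; the class name TinyInvertedIndex and the tokenizing rule (lower-case, split on anything that is not a letter or digit) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index, for illustration only.
class TinyInvertedIndex {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Indexes one document: every term of the text is linked to the document id.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Returns the ids of the documents that contain every term of the query.
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> ids = postings.getOrDefault(term, Collections.<Integer>emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids); // intersect with the previous terms
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    // Lower-cases the text and splits it on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty()) {
                terms.add(raw);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Lucene in Action");
        index.add(2, "Hibernate in Action");
        index.add(3, "Full-text search with Lucene");
        System.out.println(index.search("lucene"));
        System.out.println(index.search("in action"));
    }
}
```

A real engine adds analysis (stemming, stop words), relevance scoring and on-disk storage on top of this basic idea.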

Full-text search in Trails

Compass is a Java Search Engine Framework built on top of Lucene. Its stated goal is: "Simplify the integration of Search Engine into any application".

Compass is a powerful, transactional Java Search Engine framework. Compass allows you to declaratively map your Object domain model to the underlying Search Engine, synchronizing data changes between search engine index and different datasources. Compass provides a high level abstraction on top of the Lucene low level API. Compass also implements fast index operations and optimization and introduces transaction capabilities to the Search Engine.

Compass provides a simplified, ORM-like API for working with a Search Engine and extends Lucene with support for transactional index operations. It can map Java Objects to the underlying Search Engine using Java 5 Annotations (OSEM, Object/Search Engine Mapping). It also integrates nicely with Hibernate, indexing database content based on the ORM mapping and mirroring ORM operations to the Search Engine seamlessly. Finally, Compass integrates with Spring, providing Spring transaction management integration, simplified configuration and AOP base classes.
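As a rough intuition for what an annotation-driven object/search engine mapping does, the following plain-Java sketch turns the annotated properties of an object into the name/value pairs of an index document. It does not use Compass and is not how Compass works internally; the annotation DemoIndexed and the classes DemoArticle and DemoObjectMapper are hypothetical stand-ins invented for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a "searchable property" annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DemoIndexed {
    String name();
}

// Hypothetical domain object: two indexed properties, one that stays out of the index.
class DemoArticle {
    private final String title;
    private final String body;
    private final String internalNote;

    DemoArticle(String title, String body, String internalNote) {
        this.title = title;
        this.body = body;
        this.internalNote = internalNote;
    }

    @DemoIndexed(name = "title")
    public String getTitle() {
        return title;
    }

    @DemoIndexed(name = "body")
    public String getBody() {
        return body;
    }

    // Not annotated, so the mapper below ignores it.
    public String getInternalNote() {
        return internalNote;
    }
}

class DemoObjectMapper {

    // Collects the values of all annotated getters under their index field names.
    static Map<String, String> toDocument(Object object) {
        Map<String, String> document = new TreeMap<>();
        try {
            for (Method method : object.getClass().getDeclaredMethods()) {
                DemoIndexed indexed = method.getAnnotation(DemoIndexed.class);
                if (indexed == null) {
                    continue;
                }
                method.setAccessible(true);
                Object value = method.invoke(object);
                if (value != null) {
                    document.put(indexed.name(), String.valueOf(value));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return document;
    }

    public static void main(String[] args) {
        DemoArticle article = new DemoArticle("Compass", "Search made simple", "draft");
        System.out.println(toDocument(article));
    }
}
```

The point of the declarative style is that the domain class itself states what is searchable, so the indexing code never needs to know about individual classes.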

In Trails 1.1 you will find a first draft of the Trails-Compass integration, the "compass-library" example. It is a Trails implementation of the Compass Library Sample. The example contains a small library domain model with Author, Article and Book objects, plus three more classes that handle the interaction between Trails and the Compass API.
Integrating Compass with Trails to add full-text search functionality is simple: Compass already integrates nicely with Spring and Hibernate, and Trails is built on top of these two frameworks.

OSEM, Annotating Domain Objects

Compass makes the object-to-search-engine mapping really easy.
Here are the annotations you will use most often:

Annotation Types Summary (most relevant)
Annotation | Description | Notes
@Searchable | Marks a class as searchable. | Required.
@SearchableId | Specifies a searchable id on a property or field of the Searchable class. | Required.
@SearchableProperty | Specifies a searchable property on a property or field of the Searchable class. | For simple types.
@SearchableComponent | Specifies a searchable component on a property or field of the Searchable class. | For complex types for which search result matches should return this object; Compass effectively adds that object's searchable data to this object's searchable data.
@SearchableReference | Specifies a searchable reference on a property or field of the Searchable class. | Indicates that a property should be returned along with the class instance itself. Also works for collection types.

Here is a usage example:

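import java.util.Date;
import java.util.HashSet;
import java.util.Set;

import javax.persistence.*;

import org.compass.annotations.*;
// Trails annotation; the package name is assumed here
import org.trails.descriptor.annotation.PropertyDescriptor;

@Searchable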
@Entity
public class Author
{

	private Long id; // identifier
	private Name name = new Name();
	private Date birthday;
	private Set<Book> books = new HashSet<Book>();


	@SearchableId
	@Id
	@GeneratedValue(strategy = GenerationType.AUTO)
	@PropertyDescriptor(index = 0)
	public Long getId()
	{
		return id;
	}

	public void setId(Long id)
	{
		this.id = id;
	}

	@Embedded
	@SearchableComponent
	@PropertyDescriptor(index = 1)
	public Name getName()
	{
		return name;
	}

	public void setName(Name name)
	{
		this.name = name;
	}

	@SearchableProperty(name = "birthdayOrig")
	@SearchableMetaData(name = "birthday", format = "yyyy-MM-dd")
	@PropertyDescriptor(index = 2, format = "yyyy-MM-dd")
	public Date getBirthday()
	{
		return this.birthday;
	}

	public void setBirthday(Date birthday)
	{
		this.birthday = birthday;
	}


	@SearchableReference
	@ManyToMany(targetEntity = Book.class, cascade = CascadeType.ALL)
	public Set<Book> getBooks()
	{
		return books;
	}

	public void setBooks(Set<Book> books)
	{
		this.books = books;
	}


	public String toString()
	{
		return name.toString();
	}
}

For more information about Compass OSEM and its annotations, check the Compass documentation:
http://www.opensymphony.com/compass/versions/1.2RC1/html/core-osem.html
http://www.opensymphony.com/compass/versions/1.2RC1/api/

Here are the rest of the Compass annotations:

Annotation Types Summary
Annotation | Description
@SearchableAllMetaData | For Searchable classes, allows control of the "all" meta-data definitions per searchable class.
@SearchableAnalyzerProperty | Specifies a Searchable class field/property that dynamically controls the analyzer that will be used to analyze the class content.
@SearchableBoostProperty | Specifies a Searchable class field/property that dynamically controls the boost value of the mapped class based on its value.
@SearchableClassConverter | Specifies a class as being "convertible" by Compass.
@SearchableConstant | A constant meta-data that can be defined on a Searchable class.
@SearchableConstants | Defines a collection of SearchableConstant associated with a Searchable class.
@SearchableDynamicMetaData | A dynamic meta-data evaluation of the given expression using an expression language library.
@SearchableDynamicMetaDatas | Defines a collection of SearchableDynamicMetaData associated with a Searchable class.
@SearchableMetaData | Specifies additional meta-data on a SearchableProperty or SearchableId.
@SearchableMetaDatas | Defines a collection of SearchableMetaData associated with a Searchable class field/property.
@SearchableParent | Specifies a parent reference for SearchableComponent.
@SearchableSubIndexHash | Configures a SubIndexHash associated with the given Searchable class.
@SearchAnalyzer | Configures an Analyzer to be used within Compass.
@SearchAnalyzerFilter | Configures a LuceneAnalyzerTokenFilterProvider to be used within Compass.
@SearchAnalyzerFilters | Defines a collection of SearchAnalyzerFilters.
@SearchAnalyzers | Defines a collection of SearchAnalyzers.
@SearchConverter | Configures a Converter to be used within Compass.
@SearchConverters | Defines a collection of SearchConverters.
@SearchSetting | A general search setting applied to different search annotations.

Configurations and Integration

All Compass-related configuration lives in the "compassContext.xml" file. In this configuration file only one bean needs your attention and care: LocalCompassBean.

<bean id="compass" class="org.compass.spring.LocalCompassBean">
	<property name="classMappings">
		<list>
			<value>org.trails.compass.sample.library.Article</value>
			<value>org.trails.compass.sample.library.Author</value>
			<value>org.trails.compass.sample.library.Book</value>
			<value>org.trails.compass.sample.library.Name</value>
		</list>
	</property>
	<property name="compassSettings">
		<props>
			<prop key="compass.engine.connection">ram://trails-compass</prop>
			<prop key="compass.transaction.factory">
				org.compass.spring.transaction.SpringSyncTransactionFactory
			</prop>
		</props>
	</property>
	<property name="transactionManager" ref="transactionManager"/>
</bean>

The classMappings property lists all of your classes that have an object-to-search-engine mapping. Be careful: this may be only a subset of your domain classes. A class that is not marked as @Searchable should not be listed here, which is why you can't use "persistenceService.allTypes" for this property.
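The difference between "all types" and "searchable types" can be sketched as a simple annotation filter. The code below is only an illustration in plain Java: DemoSearchable is a hypothetical stand-in for the Compass @Searchable annotation, and the three Demo classes are made up. Whether such a filtered list can be wired into LocalCompassBean automatically is not something this page covers, so treat it as a way to build or check the list, not as a Trails feature.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Compass @Searchable annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface DemoSearchable {
}

// Hypothetical domain classes: two are searchable, one is not.
@DemoSearchable
class DemoAuthor {
}

@DemoSearchable
class DemoBook {
}

class DemoAuditLog {
}

class SearchableTypeFilter {

    // Keeps only the classes that carry the searchable marker annotation.
    static List<String> searchableClassNames(List<Class<?>> allTypes) {
        List<String> names = new ArrayList<>();
        for (Class<?> type : allTypes) {
            if (type.isAnnotationPresent(DemoSearchable.class)) {
                names.add(type.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Class<?>> allTypes =
                Arrays.<Class<?>>asList(DemoAuthor.class, DemoBook.class, DemoAuditLog.class);
        // DemoAuditLog is a domain class but has no search mapping, so it is left out.
        System.out.println(searchableClassNames(allTypes));
    }
}
```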

The "compass.engine.connection" key sets the location of the index. It takes an optional prefix such as "file://" or "ram://":

Connection | Description
file:// prefix or no prefix | The path to the file system based index, using default file handling. This is a JVM level setting for all the file based prefixes.
mmap:// prefix | Uses the Java 1.4 nio MMap class. Considered slower than the general file system one, but might have memory benefits (according to the Lucene documentation). This is a JVM level setting for all the file based prefixes.
ram:// prefix | Creates a memory based index that follows the Compass life-cycle: created when Compass is created, and disposed when Compass is closed.
jdbc:// prefix | Holds the Jdbc url or Jndi name (based on the DataSourceProvider configured). Allows storing the index within a database. This setting requires additional mandatory settings; please refer to the Search Engine Jdbc section. It is very IMPORTANT to read the Search Engine Jdbc section, especially in terms of performance considerations.
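The prefix rules in the table can be summarized in a few lines of plain Java. This is only a reading aid with hypothetical names (IndexConnection, Kind); Compass parses the setting itself and this sketch does not reproduce its code. The one rule worth noticing is the fallback: a value without any known prefix is treated as a file system path.

```java
// Hypothetical helper that classifies a "compass.engine.connection" value.
class IndexConnection {

    enum Kind { FILE, MMAP, RAM, JDBC }

    // Maps a connection string to the kind of index storage it selects.
    static Kind kindOf(String connection) {
        if (connection.startsWith("ram://")) {
            return Kind.RAM;
        }
        if (connection.startsWith("mmap://")) {
            return Kind.MMAP;
        }
        if (connection.startsWith("jdbc://")) {
            return Kind.JDBC;
        }
        // "file://" or no prefix at all: a file system based index
        return Kind.FILE;
    }

    public static void main(String[] args) {
        System.out.println(kindOf("ram://trails-compass"));
        System.out.println(kindOf("target/index"));
    }
}
```

The sample configuration uses ram://trails-compass, so its index lives in memory and disappears when Compass is closed.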

Check the Compass documentation for more information about these settings: http://www.opensymphony.com/compass/versions/1.2RC1/html/core-settings.html
Once the application is running, you can browse the index using Luke: http://www.getopt.org/luke/

Searching

Compass and Lucene share the same query syntax, but that syntax differs from the one most of us are used to: the Google query syntax.
Please take a look at the query syntax in the Lucene documentation: http://lucene.apache.org/java/docs/queryparsersyntax.html
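One practical consequence of the Lucene syntax is that characters such as +, -, :, * or quotes have a meaning in a query, so text typed by a user usually needs escaping before it is placed into a query string. The sketch below shows the idea in plain Java with hypothetical names (QueryStrings, escape, fieldPhrase) and a hypothetical field called "title". Lucene ships its own escaping in QueryParser.escape, which is preferable in real code; this version only illustrates what such escaping does.

```java
// Hypothetical helper for building Lucene-style query strings.
class QueryStrings {

    // Characters with a special meaning in the Lucene query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Puts a backslash in front of every special character.
    static String escape(String userInput) {
        StringBuilder escaped = new StringBuilder();
        for (char c : userInput.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                escaped.append('\\');
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    // Builds a fielded phrase query such as title:"Moby Dick".
    static String fieldPhrase(String field, String phrase) {
        return field + ":\"" + escape(phrase) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("C++ (2nd edition)"));
        System.out.println(fieldPhrase("title", "Moby Dick"));
    }
}
```

Other parts of the syntax, such as AND/OR/NOT, wildcards (te?t, test*) and fuzzy terms (roam~), are described in the Lucene document linked above.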

Links

http://lucene.apache.org/
http://lucene.apache.org/java/docs/queryparsersyntax.html
http://www.compassframework.org
http://www.opensymphony.com/compass/versions/1.2RC1/html/index.html
http://docs.codehaus.org/display/GRAILS/Compass+and+Grails+how+to
http://www.getopt.org/luke/
