Tuesday, May 19, 2009

Compass - Lucene - Spring - Hibernate (1)

Compass is a framework built on top of Apache Lucene that adds useful services such as object/search-engine mapping and transaction support. It also provides integration points for frameworks such as Spring and Hibernate. Although it has a reference manual, putting all the pieces together is complicated and cumbersome, so this will probably be a multi-part how-to. Consider this scenario: we are developing a web application, the details of which I will not delve into. The focus here is an entity called "Story". So, we have an HBM mapping:
<hibernate-mapping package="ir.asta.wise.core.datamanagement.textsearch.sample.story">
  <class name="StoryEntity" table="LUC_STORY">
    <id name="id" column="STORY_ID" type="java.lang.String">
      <generator class="uuid"></generator>
    </id>
    <property name="name" type="java.lang.String" column="NAME" not-null="false"></property>
    <property name="content" type="java.lang.String" column="CONTENT" not-null="false" length="4096"></property>
    <property name="blobContent" type="java.sql.Blob" column="BLOB_CONTENT"></property>
  </class>
</hibernate-mapping>
The first step is to tell Compass to index Story. We create a file named Story.cpm.xml, the Compass core mapping for Story:
<compass-core-mapping package="ir.asta.wise.core.datamanagement.textsearch.sample.story">
    <class name="StoryEntity" alias="StoryEntity">
        <id name="id" />
        <property name="content">
            <meta-data>${wiseCompass.story}</meta-data>
        </property>
        <property name="blobContent" converter="blobConverter">
            <meta-data>${wiseCompass.story}</meta-data>
        </property>
    </class>
</compass-core-mapping>
Alongside this mapping, we need to introduce Story to Compass through a meta-data descriptor, call it compass.cmd.xml, which gives the big picture of everything that can be indexed:
<compass-core-meta-data>
    <meta-data-group id="wiseCompass" displayName="WiSE Core Lucene/Compass">
        <description>WiSE Core Lucene/Compass Core Meta Data</description>
        <uri>http://wise/compass</uri>
        <meta-data id="story" displayName="Story Metadata">
            <description>Story Entity Compass Metadata</description>
            <uri>http://wise/compass/story</uri>
            <name>story</name>
        </meta-data>
        <meta-data id="file" displayName="File Entity Content Metadata">
            <description>File Entity Content Metadata</description>
            <uri>http://wise/compass/file</uri>
            <name>file</name>
        </meta-data>
    </meta-data-group>
</compass-core-meta-data>
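The link between the two files is the ${wiseCompass.story} placeholder: as I understand it, the part before the dot is the meta-data-group id and the part after it is the meta-data id, and the placeholder stands for that entry's <name> value. Since both content and blobContent point at the same entry, they end up searchable under the same meta-data name. The following is only a toy model of that lookup in plain Java, not Compass code; the class MetaDataPlaceholderSketch and its resolve method are made up for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: this is NOT how Compass is implemented, just a model of the lookup.
class MetaDataPlaceholderSketch {

    // Matches ${groupId.metaDataId}, e.g. ${wiseCompass.story}.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^.}]+)\\.([^}]+)\\}");

    // groupId -> (metaDataId -> <name> value), mirroring compass.cmd.xml above.
    static final Map<String, Map<String, String>> GROUPS = Map.of(
            "wiseCompass", Map.of("story", "story", "file", "file"));

    // Replaces every placeholder in the text with the meta-data name it refers to.
    static String resolve(String text) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Map<String, String> group = GROUPS.get(m.group(1));
            String name = (group == null) ? null : group.get(m.group(2));
            if (name == null) {
                throw new IllegalArgumentException("Unknown meta-data reference: " + m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(name));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Both mapped properties in Story.cpm.xml use this same placeholder.
        System.out.println(resolve("${wiseCompass.story}")); // prints "story"
    }
}
```

A placeholder that names an unknown group or id fails fast here, which is the behaviour I would want from a typo in a mapping file.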
Now, let's get to Spring configuration. First, we need to create a bean to be the Compass object:
    <bean id="compass" class="org.compass.spring.LocalCompassBean">
        <property name="dataSource" ref="dataSource" />
        <property name="transactionManager" ref="transactionManager" />
        <property name="resourceLocations">
            <list>
                <value>classpath:config/lucene/compass/*.xml</value>
            </list>
        </property>
        <property name="compassSettings">
            <props>
                <prop key="compass.name">compass</prop>
                <prop key="compass.engine.connection">jdbc://</prop>
                <prop key="compass.converter.blobConverter.registerClass">java.sql.Blob</prop>
                <prop key="compass.converter.blobConverter.type">BlobConverter</prop>
            </props>
        </property>
    </bean>
Some comments on the 'compass' bean configuration:
  • transactionManager references your Spring transaction manager bean, the same one that is used for Hibernate.
  • resourceLocations tells Compass where all the *.cpm.xml and *.cmd.xml files are.
  • Plain Lucene does not implement the JDBC Directory concept. With jdbc:// we tell Compass to use its own JdbcDirectory implementation, which keeps the index in the database; that is why dataSource is injected.
  • In this example, we aim to index and search BLOB types (such as PDF or Document), so we need to register our BLOB converter with Compass. I will discuss this in more detail in the second part of the tutorial.
The next steps fall into two parts: saving the index and searching it. For both, we need a Compass session obtained from the Compass bean, with all the Hibernate bindings in place. To get there, we define two more beans that connect Hibernate to Compass:
    <bean id="hibernateGpsDevice" class="org.compass.gps.device.hibernate.HibernateGpsDevice">
        <property name="name">
            <value>Hibernate-GPS-Device</value>
        </property>
        <property name="sessionFactory" ref="sessionFactory" />
        <property name="nativeExtractor">
            <bean class="org.compass.spring.device.hibernate.SpringNativeHibernateExtractor" />
        </property>
    </bean>
And:
    <bean id="hibernateGps" class="org.compass.gps.impl.SingleCompassGps" init-method="start" destroy-method="stop">
        <property name="compass" ref="compass" />
        <property name="gpsDevices">
            <list>
                <ref bean="hibernateGpsDevice" />
            </list>
        </property>
    </bean>
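Now we can use the hibernateGps bean to get at the Compass session API. To save an index entry, we assume that a StoryEntity has already been saved and we now want to index it:
    // Note: Spring's proxy-based @Transactional only takes effect on public
    // methods called through the proxy, so it is not applied to a private method.
    @Transactional(readOnly = false)
    private void saveStoryIndex(StoryEntity s) {
        // Open an index session on the Compass instance managed by the GPS.
        CompassIndexSession session = hibernateGps.getIndexCompass().openIndexSession();
        try {
            session.save(s);
            session.commit();
            logger.warn("Index saved.");
        } finally {
            // Release the session even if saving or committing fails.
            session.close();
        }
    }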
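One way to make sure the session is always closed, and that a failed save does not leave half-written index changes behind, is to commit on success, roll back on failure, and close in a finally block. The sketch below shows only that control flow. To keep it runnable without Compass on the classpath, it uses a hypothetical IndexSession interface with the same four calls (save, commit, rollback, close) and a recording test double; neither is part of Compass.

```java
import java.util.ArrayList;
import java.util.List;

class IndexSessionLifecycleSketch {

    // Hypothetical stand-in for an index session; not a Compass type.
    interface IndexSession {
        void save(Object entity);
        void commit();
        void rollback();
        void close();
    }

    // Saves one entity: commit on success, roll back on failure, always close.
    static void saveIndex(IndexSession session, Object entity) {
        try {
            session.save(entity);
            session.commit();
        } catch (RuntimeException e) {
            session.rollback();
            throw e;
        } finally {
            session.close();
        }
    }

    // Test double that records the calls it receives, in order.
    static final class RecordingSession implements IndexSession {
        final List<String> calls = new ArrayList<>();
        final boolean failOnSave;

        RecordingSession(boolean failOnSave) {
            this.failOnSave = failOnSave;
        }

        public void save(Object entity) {
            calls.add("save");
            if (failOnSave) {
                throw new IllegalStateException("simulated indexing failure");
            }
        }
        public void commit() { calls.add("commit"); }
        public void rollback() { calls.add("rollback"); }
        public void close() { calls.add("close"); }
    }

    public static void main(String[] args) {
        RecordingSession ok = new RecordingSession(false);
        saveIndex(ok, "a story");
        System.out.println(ok.calls); // [save, commit, close]

        RecordingSession failing = new RecordingSession(true);
        try {
            saveIndex(failing, "a story");
        } catch (IllegalStateException expected) {
            System.out.println(failing.calls); // [save, rollback, close]
        }
    }
}
```

In the real method, the session would be the one returned by openIndexSession(), and the exception would propagate to the caller after the rollback.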
And, to search:
    @Transactional(readOnly = false)
    private void searchSomeStory() {
        logger.warn("Searching....");
        CompassSearchSession session = hibernateGps.getIndexCompass().openSearchSession();
        try {
            CompassHits hits = session.find("sample");
            logger.warn("Hits: " + hits.getLength());
            // Guard against an empty result before reading the first hit.
            if (hits.getLength() > 0) {
                logger.warn("First result: " + hits.hit(0).data());
            }
        } finally {
            // Release the session even if the search fails.
            session.close();
        }
    }
This is a brief overview of what the integration needs. In the next part, I will discuss how indexing and converting PDFs and Documents can be handled. Hope this helps.
