Showing posts with label SOLR Search. Show all posts

Sunday, 30 December 2018

Indexing Process in Solr

This post explains how the indexing process in Solr works.

Solr offers several ways to index data, such as:

  • Indexing using the post.jar
  • Indexing using the dataImport handlers.
  • Indexing by executing the curl commands.

We will concentrate on the first two approaches.

Indexing using the post.jar

As I mentioned in previous posts, Solr ships with an exampledocs folder that we can use to get started.

Navigate to C:\Dev\solr-7.5.0\example\exampledocs

This folder contains sample XML and JSON files that we can use to index data. The same folder also contains post.jar, which processes these documents and indexes them.

C:\Dev\solr-7.5.0\bin>java -jar -Dc=example -Dauto C:\Dev\solr-7.5.0\example\exampledocs\post.jar C:\MicroservicesPOC\solr-7.5.0\solr-7.5.0\example\exampledocs\ .*

Where -Dc is the name of the core.

-Dauto tells post.jar to work out each file's content type from its file extension.

post.jar reads the documents given to it and indexes them into the named core. The condition here is that the files must follow a format post.jar expects; otherwise they will not be indexed.
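To give an idea of what that format looks like for XML files, here is a minimal sketch in Java that renders one document in Solr's XML update format (an add element wrapping a doc with named field elements), the same shape the sample XML files in exampledocs use. The class name SolrAddXmlSketch and the field values are made up for illustration, and the field names are only an assumption about what your core's schema accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SolrAddXmlSketch {

    // Escape the characters that are special inside XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Render a single document as <add><doc><field name="...">...</field></doc></add>.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add>\n  <doc>\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("    <field name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue()))
               .append("</field>\n");
        }
        return xml.append("  </doc>\n</add>\n").toString();
    }

    public static void main(String[] args) {
        // Hypothetical document; "id" and "name" are illustrative field names.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "DEMO-1");
        doc.put("name", "Tom & Jerry");
        System.out.print(toAddXml(doc));
    }
}
```

Saving the printed output as an .xml file and passing that file to post.jar should index it the same way as the bundled samples, provided the core's schema accepts those fields.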

Indexing using the dataImport handlers

For the second way of indexing, check out my detailed post here on using the DataImport handler.

Happy Indexing!!!

Monday, 24 December 2018

Core Creation in Solr

Before doing anything in Solr, we have to create a core. A core is a running instance of a Lucene index together with all the Solr configuration files it needs. We need a core to perform operations like indexing and analyzing. A Solr application may contain one or more cores and can communicate with multiple cores.

It's similar to creating the Endeca app. A core can be created in two ways.


Creating through Command

Navigate to C:\Dev\solr-7.5.0\bin and run: solr.cmd create -c example

This creates the core and prints output like the following.

WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use.
         To turn off: bin\solr config -c example -p 8983 -action set-user-property -property update.autoCreateFields -value false
INFO  - 2018-12-25 12:03:30.613; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop

Created a new core 'example'

Creating through Solr Admin UI

Navigate to Admin UI > Core Admin > Add Core

Fill in the following popup:

name: <name of the core>

instanceDir: <directory whose conf folder holds solrconfig.xml> If solrconfig.xml has not been created yet, we can use the one from the default configset that ships with Solr.

C:\Dev\solr-7.5.0\server\solr\example

Copy the conf directory from

C:\Dev\solr-7.5.0\server\solr\configsets\_default

to

C:\Dev\solr-7.5.0\server\solr\example\conf

Give the instance directory as C:\Dev\solr-7.5.0\server\solr\example

dataDir: <directory where the index files are stored>

C:\Dev\solr-7.5.0\server\solr\example\data

Leave the remaining config and schema fields as they are.

Understanding these two folders is important.

conf> Where the Solr configuration files are stored.

data> Where the indexed data files are stored, in a format that is not human readable.
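The same fields can also be sent to Solr's CoreAdmin HTTP API instead of filling in the popup. The sketch below only builds the CREATE URL and does not call Solr. The class and method names are my own, the base URL assumes a local Solr on the default port, and it needs Java 10 or later for the URLEncoder overload used. As with the UI, the instance directory must already contain the conf folder.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class CoreCreateUrlSketch {

    // URL-encode a single query parameter value.
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Build the CoreAdmin CREATE URL from the same fields the Add Core popup asks for.
    static String createCoreUrl(String baseUrl, String name, String instanceDir, String dataDir) {
        return baseUrl + "/admin/cores?action=CREATE"
                + "&name=" + enc(name)
                + "&instanceDir=" + enc(instanceDir)
                + "&dataDir=" + enc(dataDir);
    }

    public static void main(String[] args) {
        // Paths taken from the walkthrough above; the base URL is an assumption.
        System.out.println(createCoreUrl(
                "http://localhost:8983/solr",
                "example",
                "C:\\Dev\\solr-7.5.0\\server\\solr\\example",
                "C:\\Dev\\solr-7.5.0\\server\\solr\\example\\data"));
    }
}
```

Opening the printed URL in a browser (or with curl) while Solr is running should have the same effect as pressing Add Core.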

Happy Coring !!!!

Understanding Solr and Admin Console

We have seen how to download and install Solr in our previous posts. Now it's time to understand it further.
After unzipping it into a folder, observe the folder structure.

Folder Structure

C:\Dev\solr-7.5.0\




bin> Contains the command files used to start and stop Solr.

contrib> Contains add-on plugins for specialized features of Solr.

dist> Contains the main Solr .jar files.

docs> Contains a link to the online documentation.

example> Contains the example docs, which can be used for learning and getting started.

licenses> Contains the licenses for all third-party libraries used by Solr.

server> This is the core of Solr; the official documentation describes it as the heart. It contains the following:

server>solr-webapp> -->Solr’s Admin UI

server>lib> -->Jetty libraries

server>logs> -->Log files

server>solr>configsets> --> Sample configsets


Solr Admin UI.

Solr has a built-in admin UI that, by default, can be accessed on port 8983:

http://127.0.0.1:8983/solr/ or http://localhost:8983/solr/ 





The admin UI has the core selector, logging, schema selection, query execution, memory stats and more. For developers coming from Endeca, it is similar to the jspref orange application in Endeca, with more features.

Happy Structuring !!!

Installing Solr

Installing Solr is very simple. Compared with the other search platforms I have worked with, this is the simplest one to install.

Download the zip file from the official Solr site. It's always good practice to move it to a development folder before proceeding, instead of leaving it in the Downloads folder.

Prerequisites:

Make sure your Java version is compatible with the version of Solr you download.

Make sure your JAVA_HOME and PATH variables are set.

Follow the steps below for installation.

1. Unzip the zip file we downloaded. Usually the file name has the format solr-7.X.X.zip

Once you unzip it, congratulations, you are done with the installation.

The folder structure is walked through in a separate post.

Starting the Solr.

Normal Mode

Assume Solr is in the directory C:\Dev\solr-7.5.0.

Navigate to C:\Dev\solr-7.5.0\bin, open the command prompt in this location

and execute the following command: C:\Dev\solr-7.5.0\bin>solr.cmd start

Solr starts with the following logs on the prompt.

INFO  - 2018-12-25 10:39:32.458; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop
Waiting up to 30 to see Solr running on port 8983
Started Solr server on port 8983. Happy searching!

Debug Mode

If you want to run Solr in debug mode, execute the command below in the same location.

C:\Dev\solr-7.5.0\bin>solr.cmd start -a "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=18983"

This will start Solr in debug mode, listening for a debugger on port 18983. If you need a detailed explanation, check my post here.

By default Solr runs on port 8983. If that port is already occupied, either stop the process on that port or start Solr on a different port.

Stopping the Process running on the port

1. Identifying the process running on the port (Windows)

netstat -ano | findstr :8983 lists the sockets that use port 8983, with the ID of the owning process (PID) in the last column.

2. Killing the process running on the port (Windows)

taskkill /PID <PID_NO> /F

C:\Dev\solr-7.5.0\bin>netstat -ano | findstr :8983
  TCP    0.0.0.0:8983           0.0.0.0:0              LISTENING       18596
  TCP    [::]:8983              [::]:0                 LISTENING       18596

C:\Dev\solr-7.5.0\bin>taskkill /PID 18596 /F
SUCCESS: The process with PID 18596 has been terminated.
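If you would rather check the port from code before starting Solr, the same idea can be sketched in Java: try to bind to the port and treat a failure as "something is already listening". The class and method names are hypothetical, and this only tells you whether the port is taken, not which process holds it.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class PortCheckSketch {

    // Returns true if we can bind to the port, meaning nothing else is listening on it.
    static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Binding failed, most likely because the port is already in use.
            return false;
        }
    }

    public static void main(String[] args) {
        // 8983 is Solr's default port.
        System.out.println("Port 8983 free: " + isPortFree(8983));
    }
}
```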

Starting Solr on a different port.

C:\Dev\solr-7.5.0\bin>solr.cmd start -p 8990

This will start Solr on port 8990 instead of the default.


Stopping the solr.


C:\Dev\solr-7.5.0\bin>solr.cmd stop -all

This stops every running Solr instance, whichever port it is running on.

Happy Installation !!!!!

Monday, 7 August 2017

Customizing “q” Parameter in SOLR

Hi All,
         Today we are going to see how to customize the search parameter in Solr. Almost all of our projects have a requirement to customize this parameter. Follow the instructions below to change it easily.

package com.mycommercesearch.solr;

import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.handler.component.SearchHandler;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;

// Search handler that accepts the query text in "search" instead of "q".
public class SearchEndecaHanlder extends SearchHandler {

       // Name of the custom parameter sent by the client.
       public static final String SEARCH = "search";

       // Name of the standard Solr query parameter.
       public static final String QUERY_PARAM = "q";

       @Override
       public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
              // Take a modifiable copy of the request parameters.
              ModifiableSolrParams params = new ModifiableSolrParams(req.getParams());
              String queryParam = params.get(SEARCH);
              if (queryParam != null) {
                     // Move the value of "search" into "q" before the normal search runs.
                     params.remove(SEARCH);
                     params.set(QUERY_PARAM, queryParam);
              }
              req.setParams(params);
              super.handleRequestBody(req, rsp);
       }
}

In this scenario I used "search" instead of "q", and it worked. Create a JAR file from the above class and place it in <SOLR_INSTALLED_DIR>solr-6.6.0\server\solr-webapp\webapp\WEB-INF\lib\
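What the handler does to the request parameters can be sketched without Solr on the classpath, using a plain map as a stand-in for Solr's parameter object. This is only an illustration of the rename, not the handler itself; the class and method names are hypothetical, and real Solr parameters can hold several values per name, which a plain map does not model.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParamRenameSketch {

    // Return a copy of the parameters with "search" renamed to "q".
    static Map<String, String> renameSearchToQ(Map<String, String> params) {
        Map<String, String> out = new LinkedHashMap<>(params);
        String value = out.remove("search");
        if (value != null) {
            // Only set "q" when the client actually sent "search".
            out.put("q", value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> request = new LinkedHashMap<>();
        request.put("search", "ipod");
        request.put("rows", "10");
        System.out.println(renameSearchToQ(request));
    }
}
```

A request that arrives with search=ipod therefore reaches the standard search components as q=ipod, with every other parameter left untouched.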

Registering the Custom Handler in Solr

Open solrconfig.xml and add the entry below.

<requestHandler name="/mysearch" class="com.mycommercesearch.solr.SearchEndecaHanlder">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <!-- <str name="df">text</str> -->
  </lst>
</requestHandler>

Restart Solr, then query the /mysearch handler, passing the search parameter in place of q, to get the results.



Happy Searching !!!!

Monday, 31 July 2017

Defining multiple entity In Solr

When implementing search for a site, the data we process usually does not come from a single table with the same fields. For information on how to index data from a database, see my previous blog here. This post deals only with data from multiple data sources or from different tables.

Navigate to db-data-config.xml and edit it. I am going to set up the customer data for search here.

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource name="ds1" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/classicmodels" user="root" password="root"/>
<dataSource name="ds2" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/customerdata" user="root" password="root"/>

This is where I define the different data sources. Here I have configured two data sources, one named ds1 and the other named ds2. You can also define other kinds of sources, such as HSQL and XML, for processing.

<document>
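   <entity name="products" dataSource="ds1" pk="productCode" query="select * from products"
   deltaImportQuery="select * from products where productCode = '${dataimporter.delta.productCode}'"
   deltaQuery="select productCode from products where last_modified > '${dataimporter.last_index_time}'">
     <field column="productCode" name="id"/>
     <field column="productName" name="name"/>
     <field column="productDescription" name="description"/>
     <field column="productLine" name="category"/>
   </entity>

In the products entity, pk names the primary-key column (productCode, which is mapped to the Solr unique key id), and the data source is also named here. deltaQuery returns the keys of the rows changed since the last import, and deltaImportQuery fetches just those rows.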

   <entity name="customers" dataSource="ds2" pk="customerNumber" query="select * from customers"
   deltaImportQuery="select * from customers where customerNumber = '${dataimporter.delta.customerNumber}'"
   deltaQuery="select customerNumber from customers where last_modified > '${dataimporter.last_index_time}'">
     <field column="customerNumber" name="id"/>
     <field column="customerName" name="customerName"/>
     <field column="contactLastName" name="contactLastName"/>
     <field column="contactFirstName" name="contactFirstName"/>
     <field column="phone" name="phone"/>
     <field column="addressLine1" name="addressLine1"/>
     <field column="addressLine2" name="addressLine2"/>
     <field column="city" name="city"/>
     <field column="state" name="state"/>
     <field column="postalCode" name="postalCode"/>
     <field column="country" name="country"/>
     <field column="salesRepEmployeeNumber" name="salesRepEmployeeNumber"/>
     <field column="creditLimit" name="creditLimit"/>
   </entity>
</document>
</dataConfig>


If you introduce a new entity, it must have a field called id, which is used to keep the records unique.
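Once the entities are defined, an import can be triggered for one entity at a time through the DataImport handler's HTTP interface. Here is a minimal sketch, assuming the handler is registered at /dataimport in solrconfig.xml and the core is called example (both are assumptions, adjust them to your setup). It only builds the URL, with clean=false so that a full import of one entity does not first wipe the documents of the other.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DataImportUrlSketch {

    // Build a DataImport handler URL for one command (e.g. full-import) and one entity.
    static String importUrl(String baseUrl, String core, String command, String entity) {
        return baseUrl + "/" + core + "/dataimport"
                + "?command=" + URLEncoder.encode(command, StandardCharsets.UTF_8)
                + "&entity=" + URLEncoder.encode(entity, StandardCharsets.UTF_8)
                + "&clean=false"; // keep documents that came from other entities
    }

    public static void main(String[] args) {
        // The base URL and core name are assumptions; the entity names come from the config above.
        System.out.println(importUrl("http://localhost:8983/solr", "example", "full-import", "products"));
        System.out.println(importUrl("http://localhost:8983/solr", "example", "full-import", "customers"));
    }
}
```

Passing delta-import as the command instead should pick up only the rows changed since the last import, which relies on the deltaQuery and deltaImportQuery attributes of the entity.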

Querying for products


Querying for customers


In these queries, df names the default field, the field Solr searches when the query itself does not name one. Refrence is the default field for products and customer is the default field for customers.


Errors:

2017-04-22 10:24:17.256 WARN  (Thread-14) [   x:refrence] o.a.s.h.d.SolrWriter Error creating document : SolrInputDocument(fields: [category=Ships, id=S72_3212, name=Pont Yacht, description=Measures 38 inches Long x 33 3/4 inches High. Includes a stand.
Many extras including rigging, long boats, pilot house, anchors, etc. Comes with 2 masts, all square-rigged, _version_=1565373662235721728])
org.apache.solr.common.SolrException: [doc=S72_3212] missing required field: city


When you face this error, either set required="false" on the field or remove the required="true" attribute from it in managed-schema, like below:
                <field name="city" type="string" indexed="true" stored="true" required="false" multiValued="false" />

<field name="city" type="string" indexed="true" stored="true" multiValued="false" />


If you face below error

Solr Error Document is missing mandatory uniqueKey field id


It means your document does not have the id field, which is declared as the unique key by <uniqueKey>id</uniqueKey> in managed-schema.


Happy learning !!!!