Showing posts with label Solr Interview. Show all posts

Friday, 22 May 2020

Apache Solr Interview Questions

Q1. What do you understand by the term Apache Lucene?
Supported by the Apache Software Foundation, Apache Lucene is a free, open-source, high-performance text search engine library written in Java, originally by Doug Cutting. Lucene provides full-featured searching, highlighting, indexing and spellchecking for text taken from documents in various formats such as MS Office docs, HTML, PDF and plain text.
Apache Solr is a standalone full-text search platform built on top of Lucene. It is used to search across multiple websites and to index documents over HTTP using formats such as XML. Solr supports a rich schema specification that offers flexibility in dealing with a wide range of document fields, and it includes an extensive search plugin API for developing custom search behavior.
Advantages:
·       Has a powerful query language
·       Lets clients run precise searches for any query, whether simple or complex
Disadvantages:
·       Learning the syntax takes a lot of time
·       Requires expert programmers who can write the code
SolrJ (the subject of Q18) is an API that makes it easy for Java applications to talk to Solr. SolrJ hides a lot of the details of connecting to Solr and allows your application to interact with Solr through simple high-level methods.
Solr and Elasticsearch (compared in Q19) are both popular open source search engines built on top of Lucene. Both have vibrant communities and are well documented. The difference is in the way each builds a wrapper and implements features on top of Lucene.



Q2. Describe the term Request Handler.
A Request Handler is a plugin that handles incoming requests in a specific way. When a client runs a search in Solr, a request handler processes the query. SolrRequestHandler is the Solr plugin that defines the logic to be executed for a request.

Q3. List the different types of information that can be retrieved from a field type.
The different types of information that can be retrieved from a field type include the following:
·       Name of the field type
·       Field type properties
·       An implementation class name
·       Description of the field analysis for the field type, in case the field type is TextField

Q4. What do you understand by the term Field Analyzer?
When working with textual data in Solr, a Field Analyzer examines the field text and produces a token stream. This analysis of the input text is performed both at index time and at query time. Many Solr applications use custom analyzers defined by users. However, it is essential to keep in mind that every analyzer has exactly one tokenizer.
Field analyzers are used both during ingestion, when a document is indexed, and at query time. An analyzer examines the text of fields and generates a token stream. Analyzers may be a single class or they may be composed of a series of tokenizer and filter classes.
Tokenizers break field data into lexical units, or tokens.
Filters examine a stream of tokens and keep them, transform or discard them, or create new ones. Tokenizers and filters may be combined to form pipelines, or chains, where the output of one is input to the next. Such a sequence of tokenizers and filters is called an analyzer and the resulting output of an analyzer is used to match query results or build indices.
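The tokenizer-plus-filters chain described above can be sketched in a few lines of plain Python. This is only a toy model of the idea, not Solr's analysis API: the function names (`whitespace_tokenizer`, `punctuation_filter`, `lowercase_filter`, `stop_filter`, `analyze`) and the tiny stop-word list are assumptions made up for illustration.

```python
import re

# Tiny illustrative stop-word list (an assumption, not Solr's default list).
STOP_WORDS = {"a", "an", "the", "of", "and"}


def whitespace_tokenizer(text):
    """Tokenizer: break raw field text into tokens on whitespace."""
    return text.split()


def punctuation_filter(tokens):
    """Filter: strip leading/trailing punctuation and drop empty tokens."""
    stripped = (re.sub(r"^\W+|\W+$", "", t) for t in tokens)
    return [t for t in stripped if t]


def lowercase_filter(tokens):
    """Filter: transform every token to lower case."""
    return [t.lower() for t in tokens]


def stop_filter(tokens):
    """Filter: discard tokens that are stop words."""
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(text, tokenizer, filters):
    """Analyzer: exactly one tokenizer followed by a chain of filters."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)  # output of one is input to the next
    return tokens


tokens = analyze(
    "The Quick, brown Fox!",
    whitespace_tokenizer,
    [punctuation_filter, lowercase_filter, stop_filter],
)
print(tokens)
```

Running the same `analyze` call on both the indexed text and the query text is what lets a query for "fox" match a document that contains "Fox!".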

Q5. List the various categories of highlighters.
Different categories of highlighters available in Apache Solr include the following:
Standard Highlighter: gives precise highlighting for most query types, including those produced by advanced query parsers.
FastVector Highlighter: Though its query representation is less advanced than that of the Standard Highlighter, it is optimized for more languages and supports Unicode break iterators.
Postings Highlighter: One of the most precise, compact and efficient highlighters, since it uses a more compact structure than term vectors. However, it is not appropriate for a very large number of query terms.

Q6. What does the term Highlighting refer to?
Highlighting means including, in the query response, fragments of documents that match the user's query. These fragments are returned in a separate section of the response, which clients use to display the matching snippets to users. Solr contains a collection of highlighting utilities that give a great deal of control over the fields from which fragments are taken. The highlighting utilities can be called by request handlers and can be used with the standard query parsers.
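To make the idea of highlighted fragments concrete, here is a minimal sketch in plain Python that wraps matching terms in marker tags. It is not how Solr's highlighters are implemented; the `highlight` function and the `<em>`/`</em>` markers are assumptions chosen for illustration.

```python
import re


def highlight(text, query_terms, pre="<em>", post="</em>"):
    """Wrap each whole-word, case-insensitive match of a query term in markers."""
    if not query_terms:
        return text
    alternatives = "|".join(re.escape(term) for term in query_terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda match: pre + match.group(0) + post, text)


snippet = highlight("Solr is built on top of Lucene.", ["lucene", "solr"])
print(snippet)  # <em>Solr</em> is built on top of <em>Lucene</em>.
```

A client would typically display such a snippet next to each search result so the user can see why the document matched.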

Q7. How can one utilize Apache Solr for achieving maximum potential for performance?
Solr can achieve fast search responses because, instead of searching the text directly, it searches an index. This is like finding the pages of a book related to a keyword by scanning the index at the back of the book, rather than reading every word on every page.
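The book-index analogy corresponds to an inverted index: a map from each term to the documents that contain it. The following is a toy sketch in plain Python under simplifying assumptions (whitespace tokenization, lower-casing only); `build_index`, `search` and the sample documents are hypothetical names and data, not part of Solr or Lucene.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lower-cased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Look the term up in the index instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))


docs = {
    1: "Solr is built on Lucene",
    2: "Lucene is a Java library",
    3: "ZooKeeper coordinates SolrCloud",
}
index = build_index(docs)
print(search(index, "Lucene"))  # [1, 2]
```

The cost of a lookup depends on the number of distinct terms rather than on the total amount of text, which is why the index is built once at indexing time and reused for every query.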

Q8. List and describe the various building blocks of Apache Solr.
The chief building blocks associated with Apache Solr include the following:
Request Handler: A request handler processes the requests sent to Apache Solr, such as queries or index updates. Based on the requirement of the user, the most appropriate one can be picked from a variety of request handlers to do the job.
Search Component: A search component provides a specific search feature within Apache Solr. These features include spell checking, faceting, highlighting, etc., depending on what the user requires.
Query Parser: This building block of Apache Solr checks queries for syntax errors. Once a query is valid, it is translated into a format that Lucene understands.
Response Writer: A response writer in Apache Solr generates the output for a user's query in the requested format. Formats supported by Apache Solr include JSON, XML, CSV, and so on. Each type of response has a different response writer assigned to it.
Analyzer/Tokenizer: Lucene recognizes data in the form of tokens. Apache Solr analyzes the content, divides it into tokens, and passes these tokens on to Lucene. The analyzer examines the text of fields and generates a token stream, and the tokenizer is the part of the analyzer that breaks the text into tokens.
Update Request Processor: When an update request is sent to Apache Solr, the request is run through a set of plugins that are collectively known as the update request processor.

Q9. List the different types of Fields that are used in Apache Solr.
The different types of fields used in Apache Solr include the following:
·       date
·       double
·       float
·       long
·       text

Q10. What do you infer by the term Dynamic Fields with respect to Apache Solr?
When a user has forgotten to define some important field, dynamic fields are the ideal option to consider. Several dynamic fields can be defined together, and they are highly flexible for indexing fields that are not explicitly defined in the schema.
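Dynamic fields are declared with wildcard name patterns such as `*_i` or `*_s`, so a field that was never defined explicitly can still be matched to a type by its name. The sketch below models that lookup in plain Python. The field names, the type names and the rule "first matching pattern in the list wins" are simplifying assumptions for illustration, not Solr's actual resolution logic.

```python
from fnmatch import fnmatchcase

# Hypothetical explicit fields and dynamic-field patterns for illustration.
EXPLICIT_FIELDS = {"id": "string", "title": "text_general"}
DYNAMIC_FIELDS = [
    ("*_i", "int"),
    ("*_s", "string"),
    ("*_txt", "text_general"),
]


def resolve_field_type(name):
    """Prefer an explicit field; otherwise fall back to a dynamic pattern."""
    if name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    return None  # no explicit field and no pattern matched


print(resolve_field_type("price_i"))   # int
print(resolve_field_type("author_s"))  # string
```

With patterns like these, a document carrying a new `price_i` field can be indexed without first editing the schema.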

Q11. Explain the term SolrCloud.
Apache Solr includes the ability to set up a cluster of Solr servers that combines fault tolerance and high availability. This is called SolrCloud. These capabilities provide distributed indexing and search, with the following features:
·       Central configuration for the whole cluster
·       Automatic load balancing and failover for queries
·       ZooKeeper integration for cluster coordination and configuration.
In other words, SolrCloud is flexible distributed search and indexing, without a master node to allocate nodes, shards, and replicas. Instead, Solr uses ZooKeeper to manage these locations, depending on configuration files and schemas. Documents can be sent to any server and ZooKeeper will figure it out.

Q12. List the various categories of query parameters used in Apache Solr.
The various categories of query parameters used in Apache Solr include the following:
fl: specifies the list of fields to be returned for each document in the result
fq: specifies one or more filter queries that restrict the set of documents that can be returned
rows: specifies the number of documents to be returned per page; the default is 10
start: specifies the offset of the first result for a page; the default is 0
sort: specifies a comma-separated list of fields by which the results of the query are sorted
q: the main query parameter of Apache Solr; documents are scored by their similarity to the terms in this parameter
wt: specifies the response format in which the user wants to see the result
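The parameters above are passed as ordinary URL query parameters. The following sketch uses only Python's standard library to assemble such a URL. The host, the collection name `articles`, the field names and the `build_select_url` helper are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, **params):
    """Build a search URL from Solr-style query parameters (no request is sent)."""
    query = urlencode(params, doseq=True)  # percent-encodes values such as "title:solr"
    return f"{base_url}/{collection}/select?{query}"


# Hypothetical host, collection and field names, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr",
    "articles",
    q="title:solr",    # main query
    fq="year:2020",    # filter query
    fl="id,title",     # fields to return
    sort="year desc",  # sort order
    start=0,           # offset of the first result
    rows=10,           # results per page
    wt="json",         # response format
)
print(url)
```

To fetch the second page of results, only `start` changes (to 10 in this example) while `rows` stays the same.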

Q13. List the various configuration files used by Apache Solr.
The various configuration files used by Apache Solr include the following:
solr.xml - This file is in the $SOLR_HOME directory and contains SolrCloud-related information.
schema.xml - It contains the entire schema.
solrconfig.xml - It contains the definitions and core-specific configurations related to request handling and response formatting.
core.properties - This file contains the configurations specific to the core.

Q14. What do you understand by the term Apache Solr core?
An Apache Solr core is a running instance of a Lucene index together with all the Solr configuration files it needs. A Solr core must be created to perform operations such as indexing and analyzing. A Solr application may contain one or multiple cores. If necessary, two cores in a Solr application can communicate with each other.

Q15. Features of Apache Solr
1. Scalable, high-performance indexing and near real-time indexing
2. Standards-based open interfaces such as XML, HTTP, and JSON
3. Flexible and adaptable faceting
4. Advanced and accurate full-text search
5. Linearly scalable, with auto index replication, auto failover, and recovery
6. Allows concurrent searching and updating
7. Comprehensive HTML administration interfaces
8. Provides cross-platform solutions that are index-compatible

Q16. Pros of Apache Solr
1. Easy access to resources: Whether you are dealing with a setup issue or trying to learn some of the more advanced features, there are plenty of resources to help you out and get you going.
2. Excellent performance: Apache Solr allows for a lot of custom tuning (if necessary) and gives great out-of-the-box performance for searching large data sets.
3. Maintenance: After setting up Solr in a production environment, there are plenty of tools provided to help you maintain and update your application. Apache Solr comes with strong fault tolerance built in and has proven to be quite reliable.

Q17. Cons of Apache Solr
1. Indexing data can sometimes be slow, which means it can take a long time to get a large collection up and running if you have many fields that need to be indexed.

Q18. What is SolrJ?

Q19. Can you compare the features of Apache Solr vs Elasticsearch?