Search Endeca: CAS


Friday, 9 September 2016

Importing Endeca CAS Data

CAS Syed Ghouse Habib September 09, 2016

Sometimes, on a local or SIT machine with limited infrastructure, indexing takes a long time. If it fails midway because of a network issue, an Endeca script service failure, or an accidentally stopped server, you have to wait another couple of hours for indexing to finish. Follow the approach below to save that time.

Step 1:

Go to C:\Endeca\CAS\11.1.0\bin and execute the command below:

recordstore-cmd.bat read-baseline -a ATGen-data -f C:\Endeca\CAS\11.1.0\bin\ATGen.xml

Here -a names the Record Store instance to read from, and -f names the file that receives the exported records.

This takes a few minutes depending on the data size, far less than a full indexing run from ProductCatalogSimpleIndexingAdmin.

Once it finishes, repeat the process for the dimension values:

recordstore-cmd.bat read-baseline -a ATGen-dimvals -f C:\Endeca\CAS\11.1.0\bin\ATGen-dim.xml

This takes very little time.

Step 2:

Go to C:\Endeca\Apps\ATGen\test_data\baseline, copy both XML files there, and rename them:

ATGen.xml to rs_baseline_data.xml

ATGen-dim.xml to rs_baseline_dimvals.xml

Step 3:

Then go to C:\Endeca\PlatformServices\11.1.0\utilities and execute the commands below:

gzip.exe C:\Endeca\Apps\ATGen\test_data\baseline\rs_baseline_data.xml

gzip.exe C:\Endeca\Apps\ATGen\test_data\baseline\rs_baseline_dimvals.xml

After the commands run, the files are replaced by their compressed versions:

rs_baseline_data.xml.gz

rs_baseline_dimvals.xml.gz
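Steps 2 and 3 can also be scripted. The following is a minimal sketch, assuming Python is available on the machine; it uses Python's gzip module in place of gzip.exe, and the function name stage_baseline_files is only illustrative, not part of any Endeca tooling.

```python
import gzip
import shutil
from pathlib import Path

# Exported file name -> file name used in Step 2 of this post.
RENAMES = {
    "ATGen.xml": "rs_baseline_data.xml",
    "ATGen-dim.xml": "rs_baseline_dimvals.xml",
}


def stage_baseline_files(export_dir, baseline_dir):
    """Copy each export into baseline_dir under its new name, gzip-compressed.

    Returns the list of .gz paths that were written.
    """
    export_dir, baseline_dir = Path(export_dir), Path(baseline_dir)
    baseline_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source_name, target_name in RENAMES.items():
        target = baseline_dir / (target_name + ".gz")
        # Stream the XML straight into the compressed target file.
        with open(export_dir / source_name, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written
```

With the paths used in this post, the call would be stage_baseline_files(r"C:\Endeca\CAS\11.1.0\bin", r"C:\Endeca\Apps\ATGen\test_data\baseline").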

Step 4:

Once the files are compressed, go to C:\Endeca\Apps\ATGen\control and execute:

load_baseline_test_data.bat

It loads the data into the Record Stores and prints output like this:

Using CAS install at C:\Endeca\CAS\11.1.0

Loading C:\Endeca\Apps\ATGen\control\..\test_data\baseline\rs_baseline_data.xml.gz into ATGen-data

Wrote 27968 records.

Loading C:\Endeca\Apps\ATGen\control\..\test_data\baseline\rs_baseline_dimvals.xml.gz into ATGen-dimvals

Wrote 104 records.
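If you capture this output in a file, the record counts can be checked automatically. This is a rough sketch that assumes the output keeps the "Loading ... into <store>" and "Wrote N records." line shapes shown above; the function name records_written is hypothetical.

```python
import re

# Line shapes taken from the load_baseline_test_data.bat output shown above.
LOADING = re.compile(r"^Loading .+ into (?P<store>\S+)$")
WROTE = re.compile(r"^Wrote (?P<count>\d+) records\.$")


def records_written(log_text):
    """Map each Record Store name to the record count reported right after it."""
    counts = {}
    current_store = None
    for line in log_text.splitlines():
        line = line.strip()
        loading = LOADING.match(line)
        if loading:
            current_store = loading.group("store")
            continue
        wrote = WROTE.match(line)
        if wrote and current_store is not None:
            counts[current_store] = int(wrote.group("count"))
    return counts
```

For the output above, this should report 27968 records for ATGen-data and 104 for ATGen-dimvals. A missing store or a count of 0 is a hint that the load did not finish.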

Step 5:

Once the load is done, trigger the baseline index from the same control directory:

baseline_update.bat

Indexing is then complete. Happy time saving!

Note: You can repeat this procedure as often as you need. If you hit any CAS-related issue during the CAS import or the baseline index, restart CAS.

Thursday, 8 September 2016

About index_config_cmd.bat

CAS Syed Ghouse Habib September 08, 2016

During baseline update processing, the Content Acquisition System merges and processes index configuration from all owners into a consolidated set of MDEX-compatible output files.

If multiple import owners modify the same attribute, the configuration from the system owner always overrides other import owners during the merge process.

Setting up Index-config

Make sure the file C:\Endeca\Apps\ATGen\config\index_config\index-config.json has no parsing errors.
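One quick way to catch parsing errors before running set-config is to load the file with a JSON parser. A minimal sketch using Python's standard json module (the helper name find_json_error is just illustrative):

```python
import json


def find_json_error(text):
    """Return None if text parses as JSON, else a 'line X, column Y: message' string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

To check the real file, read it first, for example find_json_error(open(r"C:\Endeca\Apps\ATGen\config\index_config\index-config.json", encoding="utf-8").read()). A typical mistake this catches is writing True instead of true.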

Then execute the commands below (Windows).

set-config

Go to C:\Endeca\Apps\ATGen\control and execute:

index_config_cmd.bat set-config -f C:\Endeca\Apps\ATGen\config\index_config\index-config.json -o all

get-config

index_config_cmd.bat get-config

Defining Precedence Rule

"precedenceRules" : {
  "ServicesRule" : {
    "targetDimension" : "serviceTypes",
    "triggerDimension" : "Services",
    "isLeafTrigger" : false
  }
}

With this precedence rule, we state that the serviceTypes dimension is triggered only after a value from the Services dimension has been selected.
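The effect of such a rule can be pictured with a small simulation. This is only a sketch of the idea, not how the MDEX Engine implements it: the function visible_dimensions and the extra Brand dimension are made up for illustration, and isLeafTrigger is ignored.

```python
def visible_dimensions(all_dimensions, precedence_rules, selected_dimensions):
    """Return the dimensions that may be shown for the current selection.

    A dimension that is the target of a precedence rule stays hidden until
    a value from one of its trigger dimensions has been selected.
    """
    triggers_by_target = {}
    for rule in precedence_rules.values():
        triggers_by_target.setdefault(rule["targetDimension"], set()).add(
            rule["triggerDimension"]
        )

    selected = set(selected_dimensions)
    visible = []
    for dimension in all_dimensions:
        triggers = triggers_by_target.get(dimension)
        # Untargeted dimensions are always visible; targets need a selected trigger.
        if triggers is None or triggers & selected:
            visible.append(dimension)
    return visible


# Same rule as in the JSON above, written as a Python dict.
RULES = {
    "ServicesRule": {
        "targetDimension": "serviceTypes",
        "triggerDimension": "Services",
        "isLeafTrigger": False,  # not modeled in this sketch
    }
}

# "Brand" is a hypothetical dimension with no rule attached.
DIMENSIONS = ["Services", "serviceTypes", "Brand"]
```

Calling visible_dimensions(DIMENSIONS, RULES, []) leaves serviceTypes out, while passing ["Services"] as the selection brings it back.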

Enabling Record Filters and Rollup Key in ATG/Endeca 11.1

Open C:\Endeca\Apps\ATGen\config\index_config\index-config.json and find the property:

"product.repositoryId": {
  "propertyDataType": "ALPHA",
  "mergeAction": "UPDATE",
  "isRollupKey": true,
  "jcr:primaryType": "endeca:property",
  "isRecordFilterable": true
}

isRollupKey is the field that enables the rollup key, and isRecordFilterable is the field that enables the record filter. Both values must be the lowercase JSON literal true.

Once these fields are included and set, the features are enabled for the current EAC application.
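To see at a glance which properties have these flags turned on, you can scan the file. A small sketch, assuming the properties sit under an attributes node as in a single-owner index-config.json; the function name flagged_attributes and the product.displayName entry are hypothetical.

```python
import json


def flagged_attributes(index_config, flag):
    """Return the sorted names of attributes whose `flag` is set to true."""
    attributes = index_config.get("attributes", {})
    return sorted(name for name, spec in attributes.items() if spec.get(flag) is True)


# Trimmed-down sample; product.displayName is only here for contrast.
SAMPLE = json.loads("""
{
  "attributes": {
    "product.repositoryId": {
      "propertyDataType": "ALPHA",
      "mergeAction": "UPDATE",
      "isRollupKey": true,
      "jcr:primaryType": "endeca:property",
      "isRecordFilterable": true
    },
    "product.displayName": {
      "propertyDataType": "ALPHA",
      "jcr:primaryType": "endeca:property"
    }
  }
}
""")
```

flagged_attributes(SAMPLE, "isRollupKey") and flagged_attributes(SAMPLE, "isRecordFilterable") both list only product.repositoryId for this sample.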

Tuesday, 22 December 2015

Basics of EAC Application

Installation Syed Ghouse Habib December 22, 2015

About Record Store and Record Generations

A set of records that has been committed to a Record Store instance is a record generation.

For example, if you perform a full file system crawl, all the records returned from the crawl are written to the Record Store and a commit is done. After the commit is done, the Record Store has one generation of records. A subsequent crawl, either full or incremental, results in a second generation of records.

Each record that is read in contains a unique ID. CAS uses that unique ID as the value of the idPropertyName Record Store configuration property.

If a record already exists with that unique ID during later CAS crawls, then the later version replaces the earlier one. This ensures that when you run an incremental crawl, you always get the latest version of any given record.

A record generation is removed from a Record Store instance by the clean task after the generation becomes stale. A stale generation is a generation that has been in a Record Store instance for a period of time that exceeds the value of the generationRetentionTime Record Store configuration property.
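The generation idea is easier to follow with a toy model. The class below is an in-memory illustration only and not the CAS Record Store API: the name ToyRecordStore is made up, time is a plain number, deletes are not modeled, and keeping the newest generation during clean is an assumption of this sketch.

```python
class ToyRecordStore:
    """In-memory illustration of record generations; not the CAS API."""

    def __init__(self, id_property_name, generation_retention_time):
        self.id_property_name = id_property_name
        self.generation_retention_time = generation_retention_time
        self.generations = []  # list of (commit_time, {record_id: record})

    def commit(self, records, commit_time):
        """Write one crawl's records and commit them as a new generation."""
        latest = dict(self.generations[-1][1]) if self.generations else {}
        for record in records:
            # A record with an already known ID replaces the earlier version.
            latest[record[self.id_property_name]] = record
        self.generations.append((commit_time, latest))

    def latest_records(self):
        """Return the records of the most recent generation."""
        return self.generations[-1][1] if self.generations else {}

    def clean(self, now):
        """Drop stale generations (assumption: the newest one is always kept)."""
        cutoff = now - self.generation_retention_time
        newest = self.generations[-1:]
        kept = [g for g in self.generations[:-1] if g[0] >= cutoff]
        self.generations = kept + newest
```

For example, committing records a and b at time 0 and a newer version of a at time 5 gives two generations; calling clean(now=12) with a retention time of 10 then drops the first generation and keeps the latest records.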

Command for creating the Dimension Value Id Manager

C:\Endeca\CAS\11.1.0\bin\cas-cmd.bat createDimensionValueIdManager -h localhost -p 8500 -m ATGen-dimension-value-id-manager

With cas-cmd we can perform the following tasks:

C:\Endeca\CAS\<version>\bin>cas-cmd.bat --help

usage: cas-cmd <task-name> [options]

[Inspecting Installed Modules]

getAllModuleSpecs

getModuleSpec

listModules

[Managing Crawls]

createCrawls

deleteCrawl

getAllCrawls

getCrawl

getCrawlIncrementalSupport

listCrawls

startCrawl

stopCrawl

updateCrawls

[Managing Dimension Value Ids]

createDimensionValueIdManager

deleteDimensionValueIdManager

exportDimensionValueIdMappings

generateDimensionValueId

getDimensionValueId

importDimensionValueIdMappings

[Viewing Crawl Status and Results]

getAllCrawlMetrics

getCrawlMetrics

getCrawlStatus

About index_config_cmd.bat

The Index Configuration Command-line Utility writes and reads index configuration as JSON. The schema for the JSON file varies depending on whether you retrieve configuration for one owner or more than one owner and whether you restrict the types of configuration that you retrieve.

Types of configuration include:

• Endeca properties, derived properties, and dimensions. These are specified under the attributes node.

• Precedence rules. These are specified under the precedenceRules node.

• Search configuration. This is specified under the searchIndexConfig node.

CAS-based data processing

The Deployment Template supports running baseline and partial updates using CAS as a replacement for Forge. In this processing model, the update runs a CAS crawl to produce MDEX-compatible output. This is the step that removes the need for Forge. Then the update runs Dgidx and updates the Dgraphs in an application.

Dgraph baseline update script using CAS

You do not need to run Forge if you run a CAS crawl that is configured to produce MDEX-compatible output as part of your update process.

This example runs a baseline update that includes a full CAS crawl. The crawl writes MDEX-compatible output and then the update invokes Dgidx to process the records, dimensions, and index configuration produced by the crawl. To create this sequential workflow of CAS crawl and then baseline update, you add a call to runBaselineCasCrawl() to run the CAS crawl.

Initial setup.xml

Importing dimension value Id mappings

The importDimensionValueIdMappings task imports dimension value Id mappings from a CSV file into a Dimension Value Id Manager. The restore process completely replaces all dimension value Id mappings stored in the Dimension Value Id Manager.

The syntax for this task is:

cas-cmd importDimensionValueIdMappings [-h HostName] [-l true|false] -m dvalmgr [-p PortNumber] -f mappings.csv

DataIngest.xml

Specifies data processing scripts, including the baseline update script, partial update script, and the components to perform data processing such as CAS or Forge and Dgidx.

The Record_spec.xml

This file contains a RECORD_SPEC element that specifies a property to identify records during partial updates.

When implementing partial updates, the RECORD_SPEC element uses this property to preserve stable record IDs across baseline runs. That is, a record will have the same ID in the next update as in the current update. For more information, see the Endeca MDEX Engine Partial Updates Guide.
