Showing posts with label CAS. Show all posts

Friday, 9 September 2016

Importing Endeca CAS Data

Sometimes, on a local or SIT machine with limited infrastructure, indexing takes a long time. If it fails midway because of a network issue, a failure of the Endeca script service, or a server that was stopped by accident, you have to wait another couple of hours for indexing to run again. Follow the approach below to save that time.

Step 1:

Go to C:\Endeca\CAS\11.1.0\bin

and execute the command below:

recordstore-cmd.bat read-baseline -a ATGen-data -f C:\Endeca\CAS\11.1.0\bin\ATGen.xml


Here -a names the Record Store to read the data from, and -f names the file in which the data exported from CAS is stored.

This takes a few minutes depending on the data size, but far less than the time indexing takes from ProductCatalogSimpleIndexingAdmin.

Once it is done, follow the same process for the dimensions:

recordstore-cmd.bat read-baseline -a ATGen-dimvals -f C:\Endeca\CAS\11.1.0\bin\ATGen-dim.xml

This takes much less time.
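
The two exports above can also be scripted. The following is a minimal Python sketch, not part of the original procedure, that only builds the two command lines from the record store names and paths used in this post. The helper name read_baseline_command is hypothetical.

```python
from pathlib import PureWindowsPath

# CAS bin directory used in this post; adjust to your own install.
CAS_BIN = PureWindowsPath(r"C:\Endeca\CAS\11.1.0\bin")


def read_baseline_command(record_store: str, output_file: str) -> list[str]:
    """Build the recordstore-cmd call that exports one record store to a file."""
    return [
        str(CAS_BIN / "recordstore-cmd.bat"),
        "read-baseline",
        "-a", record_store,                # record store instance to read from
        "-f", str(CAS_BIN / output_file),  # file that receives the exported records
    ]


commands = [
    read_baseline_command("ATGen-data", "ATGen.xml"),
    read_baseline_command("ATGen-dimvals", "ATGen-dim.xml"),
]

for cmd in commands:
    print(" ".join(cmd))
```

On the CAS machine, each list could then be passed to subprocess.run(cmd, check=True) so that a failed export stops the script.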

Step 2:

Go to C:\Endeca\Apps\ATGen\test_data\baseline, copy both XML files there, and rename them:

ATGen.xml to rs_baseline_data.xml

ATGen-dim.xml to rs_baseline_dimvals.xml

Step 3:

Go to C:\Endeca\PlatformServices\11.1.0\utilities

and execute the commands below:

gzip.exe C:\Endeca\Apps\ATGen\test_data\baseline\rs_baseline_data.xml

gzip.exe C:\Endeca\Apps\ATGen\test_data\baseline\rs_baseline_dimvals.xml

After these commands run, the files are compressed and renamed as follows:


rs_baseline_data.xml.gz

rs_baseline_dimvals.xml.gz
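
If gzip.exe is not at hand, the same compression step can be approximated with Python's standard library. This is a sketch under the assumption that any standard gzip file is acceptable to the load script; the helper name gzip_in_place is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gzip_in_place(path: Path) -> Path:
    """Compress `path` to `<name>.gz` and delete the original, like the gzip CLI."""
    target = path.with_name(path.name + ".gz")
    with path.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the file so large exports fit in memory
    path.unlink()                     # the gzip CLI also removes the uncompressed file
    return target
```

For example, gzip_in_place(Path(r"C:\Endeca\Apps\ATGen\test_data\baseline\rs_baseline_data.xml")) would leave rs_baseline_data.xml.gz in the same folder.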

Step 4:

Once the files are compressed, go to C:\Endeca\Apps\ATGen\control and execute the command

load_baseline_test_data.bat

It loads the data into the Record Stores.

Using CAS install at C:\Endeca\CAS\11.1.0
Loading C:\Endeca\Apps\ATGen\control\..\test_data\baseline\rs_baseline_data.xml.gz into ATGen-data
Wrote 27968 records.
Loading C:\Endeca\Apps\ATGen\control\..\test_data\baseline\rs_baseline_dimvals.xml.gz into ATGen-dimvals
Wrote 104 records.
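
To check that both Record Stores were loaded, the record counts can be pulled out of this output. The sketch below assumes the output has exactly the shape shown above (a "Loading ... into <store>" line followed by a "Wrote N records." line); the function name records_written is hypothetical.

```python
import re

LOAD_LOG = r"""Using CAS install at C:\Endeca\CAS\11.1.0
Loading C:\Endeca\Apps\ATGen\control\..\test_data\baseline\rs_baseline_data.xml.gz into ATGen-data
Wrote 27968 records.
Loading C:\Endeca\Apps\ATGen\control\..\test_data\baseline\rs_baseline_dimvals.xml.gz into ATGen-dimvals
Wrote 104 records."""


def records_written(log: str) -> dict[str, int]:
    """Map each record store name to the record count reported after it."""
    counts: dict[str, int] = {}
    current = None
    for raw_line in log.splitlines():
        line = raw_line.strip()
        loading = re.match(r"Loading .* into (\S+)$", line)
        wrote = re.match(r"Wrote (\d+) records\.$", line)
        if loading:
            current = loading.group(1)  # remember which store is being loaded
        elif wrote and current is not None:
            counts[current] = int(wrote.group(1))
    return counts


print(records_written(LOAD_LOG))
```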

Step 5:

Once the load is done, trigger the baseline index:

baseline_update.bat

Indexing is then complete. Happy time saving!

Note: This procedure can be repeated as many times as you need. If you hit any CAS-related issue during the CAS import or the baseline index, restart CAS.




Thursday, 8 September 2016

About index_config_cmd.bat

During baseline update processing, the Content Acquisition System merges and processes index configuration
from all owners into a consolidated set of MDEX-compatible output files.
If multiple import owners modify the same attribute, the configuration from the system owner always overrides
other import owners during the merge process.
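
The override rule can be pictured with a toy merge. This is only an illustrative sketch of the rule stated above, not how CAS implements it: the owner names, the attribute values, and the ordering among non-system owners (sorted by name here) are assumptions.

```python
def merge_owner_configs(configs_by_owner: dict, system_owner: str = "system") -> dict:
    """Merge per-owner attribute configs so that the system owner always wins."""
    merged: dict = {}
    # Apply the other import owners first (ordering among them is an assumption).
    for owner in sorted(o for o in configs_by_owner if o != system_owner):
        merged.update(configs_by_owner[owner])
    # Apply the system owner last so its attributes override the others.
    merged.update(configs_by_owner.get(system_owner, {}))
    return merged


# Hypothetical owners and attribute settings, for illustration only.
configs = {
    "ATG": {"product.repositoryId": {"isRecordFilterable": False}},
    "system": {"product.repositoryId": {"isRecordFilterable": True}},
}
print(merge_owner_configs(configs))
```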


Setting up Index-config

Make sure the file C:\Endeca\Apps\ATGen\config\index_config\index-config.json has no parsing errors.
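
One quick way to catch parsing errors before running the utility is to load the file with any JSON parser. The sketch below uses Python's json module and checks JSON syntax only, not the index configuration schema; the function name json_parse_error is hypothetical.

```python
import json
from typing import Optional


def json_parse_error(text: str) -> Optional[str]:
    """Return None if `text` is valid JSON, else a message with line and column."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

For the file in this post, the text could be read with Path(r"C:\Endeca\Apps\ATGen\config\index_config\index-config.json").read_text(encoding="utf-8") and passed to json_parse_error.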

Then execute the commands below.

Windows

set-config

From C:\Endeca\Apps\ATGen\control, execute:

index_config_cmd.bat set-config -f C:\Endeca\Apps\ATGen\config\index_config\index-config.json -o all

get-config

index_config_cmd.bat get-config



Defining Precedence Rule


"precedenceRules" : {
  "ServicesRule" : {
    "targetDimension" : "serviceTypes",
    "triggerDimension" : "Services",
    "isLeafTrigger" : false
  }
}


When we define this precedence rule, we specify that the serviceTypes dimension is triggered only when the Services dimension has been selected.
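
The effect of the rule can be illustrated with a small model. This Python sketch is a simplification written for this post, not Endeca code: it treats a target dimension as hidden until its trigger dimension has a selection, and it ignores isLeafTrigger. The function name visible_dimensions and the Brand dimension are hypothetical.

```python
PRECEDENCE_RULES = {
    "ServicesRule": {
        "targetDimension": "serviceTypes",
        "triggerDimension": "Services",
        "isLeafTrigger": False,
    }
}


def visible_dimensions(all_dimensions, selected_dimensions, rules):
    """Return the dimensions to show, hiding targets whose trigger is not selected."""
    hidden = {
        rule["targetDimension"]
        for rule in rules.values()
        if rule["triggerDimension"] not in selected_dimensions
    }
    return [dim for dim in all_dimensions if dim not in hidden]


dims = ["Services", "serviceTypes", "Brand"]
print(visible_dimensions(dims, set(), PRECEDENCE_RULES))         # nothing selected yet
print(visible_dimensions(dims, {"Services"}, PRECEDENCE_RULES))  # trigger selected
```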


Enabling Record Filters and Rollup Key in ATG/Endeca 11.1

Go to C:\Endeca\Apps\ATGen\config\index_config\index-config.json


"product.repositoryId": {
  "propertyDataType": "ALPHA",
  "mergeAction": "UPDATE",
  "isRollupKey": true,
  "jcr:primaryType": "endeca:property",
  "isRecordFilterable": true
}

Here isRollupKey is the field that enables the rollup key, and isRecordFilterable is the field that enables the record filter. Both values must be the lowercase JSON literal true. Once these fields are included and set, both features are enabled for the current EAC application.
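
The same change can be made programmatically on a parsed copy of the file. The sketch below assumes the property sits under a top-level attributes node, as in a single-owner configuration; check the layout of your own file first. The function name enable_rollup_and_filter is hypothetical.

```python
import json


def enable_rollup_and_filter(config: dict, property_name: str) -> dict:
    """Set both flags on one property (modifies and returns the same dict)."""
    attribute = config["attributes"][property_name]
    attribute["isRollupKey"] = True          # enables the rollup key
    attribute["isRecordFilterable"] = True   # enables the record filter
    return config


# Minimal stand-in for index-config.json, for illustration only.
sample = {
    "attributes": {
        "product.repositoryId": {
            "propertyDataType": "ALPHA",
            "mergeAction": "UPDATE",
            "jcr:primaryType": "endeca:property",
        }
    }
}
print(json.dumps(enable_rollup_and_filter(sample, "product.repositoryId"), indent=2))
```

Because json.dumps writes Python's True as true, the output stays valid JSON.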


Tuesday, 22 December 2015

Basics of EAC Application


About Record Store and Record Generations

A set of records that has been committed to a Record Store instance is a record generation.
For example, if you perform a full file system crawl, all the records returned from the crawl are written to the
Record Store and a commit is done. After the commit is done, the Record Store has one generation of records.
A subsequent crawl, either full or incremental, results in a second generation of records.
Each record that is read in contains a unique ID. CAS uses that unique ID as the value of the idPropertyName
Record Store configuration property.
If a record already exists with that unique ID during later CAS crawls, then the later version replaces the earlier
one. This ensures that when you run an incremental crawl, you always get the latest version of any given
record.
A record generation is removed from a Record Store instance by the clean task after the generation becomes
stale. A stale generation is a generation that has been in a Record Store instance for a period of time that
exceeds the value of the generationRetentionTime Record Store configuration property.
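
The behaviour described above can be modelled in a few lines. This is a toy in-memory sketch for illustration, not the CAS implementation: it only covers the rules stated in this section (one generation per commit, later versions of an ID replace earlier ones, stale generations are removed by a clean step). The class name, the id property, and the use of plain numbers as hours are assumptions.

```python
class ToyRecordStore:
    """Toy model of record generations; times are plain numbers (hours)."""

    def __init__(self, id_property_name: str, generation_retention_time: float):
        self.id_property_name = id_property_name
        self.generation_retention_time = generation_retention_time
        self.generations = []  # list of (commit_time, {record_id: record})

    def commit(self, records, commit_time: float) -> None:
        """Each commit adds one new generation of records."""
        by_id = {record[self.id_property_name]: record for record in records}
        self.generations.append((commit_time, by_id))

    def latest_records(self) -> dict:
        """Later generations replace earlier records that share the same ID."""
        merged = {}
        for _, by_id in self.generations:  # oldest first, so later versions win
            merged.update(by_id)
        return merged

    def clean(self, now: float) -> None:
        """Drop generations older than the retention time (stale generations)."""
        self.generations = [
            (t, g) for t, g in self.generations
            if now - t <= self.generation_retention_time
        ]


store = ToyRecordStore(id_property_name="id", generation_retention_time=48)
store.commit([{"id": "1", "name": "a"}, {"id": "2", "name": "b"}], commit_time=0)
store.commit([{"id": "2", "name": "b-updated"}], commit_time=24)
print(store.latest_records()["2"]["name"])
store.clean(now=60)
print(len(store.generations))
```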

Command for creating the Dimension Value Id Manager
C:\Endeca\CAS\11.1.0\bin\cas-cmd.bat createDimensionValueIdManager -h localhost -p 8500 -m ATGen-dimension-value-id-manager

With cas-cmd we can perform the following tasks:
C:\Endeca\CAS\<version>\bin>cas-cmd.bat --help
usage: cas-cmd <task-name> [options]
[Inspecting Installed Modules]
 getAllModuleSpecs
 getModuleSpec
 listModules
[Managing Crawls]
 createCrawls
 deleteCrawl
 getAllCrawls
 getCrawl
 getCrawlIncrementalSupport
 listCrawls
 startCrawl
 stopCrawl
 updateCrawls
[Managing Dimension Value Ids]
 createDimensionValueIdManager
 deleteDimensionValueIdManager
 exportDimensionValueIdMappings
 generateDimensionValueId
 getDimensionValueId
 importDimensionValueIdMappings
[Viewing Crawl Status and Results]
 getAllCrawlMetrics
 getCrawlMetrics
 getCrawlStatus

About index_config_cmd.bat
During baseline update processing, the Content Acquisition System merges and processes index configuration
from all owners into a consolidated set of MDEX-compatible output files.
If multiple import owners modify the same attribute, the configuration from the system owner always overrides
other import owners during the merge process.

The Index Configuration Command-line Utility writes and reads index configuration as JSON. The schema for
the JSON file varies depending on whether you retrieve configuration for one owner or more than one owner
and whether you restrict the types of configuration that you retrieve.
Types of configuration include:
• Endeca properties, derived properties, and dimensions. These are specified under the attributes node.
• Precedence rules. These are specified under the precedenceRules node.
• Search configuration. These are specified under the searchIndexConfig node.


CAS-based data processing
The Deployment Template supports running baseline and partial updates using CAS as a replacement for
Forge. In this processing model, the update runs a CAS crawl to produce MDEX-compatible output. This is
the step that removes the need for Forge. Then the update runs Dgidx and updates the Dgraphs in an application.

Dgraph baseline update script using CAS
You do not need to run Forge if you run a CAS crawl that is configured to produce MDEX-compatible output
as part of your update process.
This example runs a baseline update that includes a full CAS crawl. The crawl writes MDEX compatible output
and then the update invokes Dgidx to process the records, dimensions, and index configuration produced by
the crawl. To create this sequential workflow of CAS crawl and then baseline update, you add a call to
runBaselineCasCrawl() to run the CAS crawl.

InitialSetup.xml
Importing dimension value Id mappings
The importDimensionValueIdMappings task imports dimension value Id mappings from a CSV file into
a Dimension Value Id Manager. The restore process completely replaces all dimension value Id mappings
stored in the Dimension Value Id Manager.
The syntax for this task is:
cas-cmd importDimensionValueIdMappings [-h HostName] [-l true|false] -m dvalmgr
[-p PortNumber] -f mappings.csv

DataIngest.xml
Specifies data processing scripts, including the baseline update script, partial update
script, and the components to perform data processing such as CAS or Forge and Dgidx.

The Record_spec.xml 
This file contains a RECORD_SPEC element that specifies a property to identify records during partial updates.
When implementing partial updates, the RECORD_SPEC element uses this property to preserve stable record IDs across baseline runs. That is, a record will have the same ID in the next update as in the current update. For more information, see the Endeca MDEX Engine Partial Updates Guide.