When you create an Explicit Feature Extraction node, an Explicit Semantic Analysis (ESA) model with the default algorithm settings is added. You can add more ESA models and edit them in the Edit Explicit Feature Extraction Node dialog box.
The Edit Explicit Feature Extraction Node dialog box comprises the following tabs:
Build: You can add additional models in the Build tab. In the Topic ID field, select an attribute for building the model. You can perform the following tasks:
Add Model: Click the add icon to add a model.
Edit Advanced Model Settings: Click the edit icon to edit model settings.
Delete Models: Select a model and click the delete icon.
Duplicate a model: Select a model and click the duplicate icon.
Partition: Partitioning is supported for ESA models. In the Partition tab, you can specify the number of partition columns. For example, a Wikipedia model build can be partitioned by a Language partition key. You can perform the following tasks:
Maximum Number of Partitions
Sampling: You can configure the sampling settings of the node.
Select Off
Select On and select either:
System Determined
User Specified and enter a value
Input: You can include or exclude attributes for model build, change mining types (numerical, categorical, text) of input attributes, and enable or disable auto data prep for input attributes. You can also apply Data Miner heuristics rules to the input attributes. By default, the option Determine inputs automatically (using heuristics) is On.
Text: The Text tab allows you to specify the settings used for text processing. The text-related settings are:
Categorical Cutoff Value: If the maximum number of characteristics contained in a column exceeds the cutoff value, then text processing is done during model build. This applies to categorical mining type columns. The default cutoff value is 200.
Default Transform type: Select any one of the following transform types:
Tokens: If you select Tokens as the default transform type, then set the following settings:
Languages
Stemming: Stemming reduces words or tokens to their root words during text processing. If Stemming is enabled, then stemmed words are returned for supported languages. Otherwise, the original words are returned. For example, if the text contains the words SHOPPING and SHOP, then the stemming option returns SHOP SHOP. So, two root words are returned as a result of text processing.
Bigram:
Stoplist: Select a stoplist from the drop-down list. To add a stoplist definition, click the add icon. To edit a selected stoplist, click the edit icon.
Max Number of tokens across all rows (documents): The maximum number of distinct features across all text attributes. The value must be greater than or equal to 1. The default value is 3000.
Min number of rows (documents) required for a token: A text processing setting that controls the minimum number of documents in which a token must appear to be used as a feature. The value must be a non-negative integer. The default value is 3.
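The following Python sketch illustrates how the two token limits above could interact. It is not Data Miner's implementation: the function name select_tokens, the whitespace tokenization, and the ranking of tokens by document count are all assumptions made for illustration.

```python
from collections import Counter


def select_tokens(documents, max_tokens=3000, min_docs=3):
    """Pick feature tokens from a list of text documents.

    Hypothetical helper: tokenization and ranking are assumptions,
    not the behavior of the actual product.
    """
    doc_freq = Counter()
    for text in documents:
        # Count each token once per document (document frequency).
        doc_freq.update(set(text.lower().split()))

    # "Min number of rows (documents) required for a token":
    # keep tokens that appear in at least `min_docs` documents.
    eligible = [(tok, n) for tok, n in doc_freq.items() if n >= min_docs]

    # "Max number of tokens across all rows": cap the feature count.
    # Assumed ranking: most documents first, ties broken alphabetically.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [tok for tok, _ in eligible[:max_tokens]]


docs = [
    "shop online today",
    "shop in store",
    "online shop deals",
    "store deals today",
]
print(select_tokens(docs, max_tokens=2, min_docs=2))  # ['shop', 'deals']
```

With the defaults of 3000 and 3, only the token "shop" survives in this small sample, because it is the only token that appears in at least three documents.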
Themes: If you select themes as the default transform type, then set the following settings:
Languages
Stoplists: Select a stoplist from the drop-down list. To add a stoplist definition, click the add icon. To edit a selected stoplist, click the edit icon.
Max number of tokens across all rows: