SAS Content Categorization is designed to develop and deploy categorization and extraction rules to classify unstructured content.
Industry-specific taxonomies can be added to jump-start taxonomy development. Improved graphical reports for precision and recall,
together with new co-reference operators for pronoun resolution, further simplify rule definition and refinement.
Initial categories and subcategories can also be generated from Wikipedia.
SAS Enterprise Content Categorization can be further extended with a variety of add-on modules to create a unique environment.
The modules enable linguistically based document summarization, duplicate document detection, search and indexing,
file or web crawling, content alerts, and an editorial workbench.
Please see the SAS Product Documentation page for additional information.