Smart Classification

Beginning in FileCloud 23.232, an updated version of the Smart Classification user interface is available. This section of the documentation covers the new user interface. If you prefer to use the classic user interface, see Smart Classification Classic.

Smart Classification is only available for Advanced licenses, or Essentials licenses with CCE+PATTERNSEARCH components. For information on the different license types, read about the key features on the Pricing page.

Smart Classification in FileCloud

FileCloud's Smart Classification system (also referred to as the content classification engine or CCE) searches for files with specific content or content patterns and tags them with metadata values. Once the files are marked with metadata, they can be identified for further actions, such as processing in FileCloud's data leak prevention (DLP) system.

To use Smart Classification, set up rules that search for content in files and apply metadata to the files depending on the search results. When rules are first enabled, they apply to files that were added before the rules were created. After that, they apply to newly added and uploaded files.

Example:

You create a rule to mark all files whose content contains 6 consecutive digits by setting their metadata field CompanyID to yes. Smart Classification tries to locate instances of 6 consecutive digits in the content of each new and modified file in FileCloud, and when it finds a match, sets the file's CompanyID metadata field to yes. FileCloud's Smart DLP can then prevent files with CompanyID=yes from being read and downloaded.
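The matching step in this example can be pictured with a short Python sketch. It is only an illustration of the idea, not FileCloud's implementation: the classify function, the plain dictionary standing in for file metadata, and the regular expression are all assumptions made for this example, and the sketch assumes the pattern means six consecutive digits.

```python
import re

# Hypothetical pattern for the example rule: any run of six consecutive digits.
SIX_DIGITS = re.compile(r"\d{6}")


def classify(content: str, metadata: dict) -> dict:
    """Return a copy of the metadata, with CompanyID set to "yes" on a match."""
    updated = dict(metadata)  # leave the caller's dictionary untouched
    if SIX_DIGITS.search(content):
        updated["CompanyID"] = "yes"
    return updated


if __name__ == "__main__":
    print(classify("Employee 482913 joined in May.", {}))
    print(classify("No identifiers here.", {}))
```

In this sketch a file without a match keeps its metadata unchanged; a real rule can also apply a value when the pattern is not matched.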

Setting up Smart Classification

To set up content classification, you create rules in the Add Content Classification Rule wizard. These rules specify the patterns to match and the metadata to apply to files if a pattern is matched or is not matched.


The saved rules appear in the Smart Classification screen.


When a rule runs, it applies metadata to files whose content matches its conditions.


Other FileCloud operations look at this metadata to perform their actions. For example, Smart DLP can prevent a file from being downloaded if CompanyID is set to yes. Or a search can return all files where CompanyID is set to no.
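How other operations might consume this metadata can be sketched in Python as follows. The file records, the can_download check, and the find_by_metadata helper are hypothetical names chosen for illustration; they do not correspond to FileCloud's Smart DLP or search APIs.

```python
# Hypothetical file records: path -> metadata applied by classification rules.
FILES = {
    "/hr/roster.txt": {"CompanyID": "yes"},
    "/public/menu.txt": {"CompanyID": "no"},
}


def can_download(metadata: dict) -> bool:
    """Mimic a DLP-style rule: deny download when CompanyID is set to yes."""
    return metadata.get("CompanyID") != "yes"


def find_by_metadata(records: dict, field: str, value: str) -> list:
    """Mimic a metadata search: return the paths whose field equals value."""
    return sorted(path for path, meta in records.items() if meta.get(field) == value)


if __name__ == "__main__":
    for path, meta in FILES.items():
        print(path, "download allowed:", can_download(meta))
    print(find_by_metadata(FILES, "CompanyID", "no"))
```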

Running content classification rules

To run content classification rules automatically on a schedule, you must set up a cron job. You can also run a rule manually from the Smart Classification screen.

Requirements

  • Smart Classification functions properly only if Solr is configured in your system and your storage has been indexed.
  • Because Solr cannot index files larger than 10 MB, files larger than 10 MB are not available for Smart Classification.
  • Administrators must create at least one metadata set before Smart Classification can operate.
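
The requirements above can be summarized as a simple eligibility check. This Python sketch is an assumption-laden illustration, not part of FileCloud: the is_classifiable function and its parameters are hypothetical, and it assumes that 10 MB means 10 × 1024 × 1024 bytes.

```python
# Assumption: the 10 MB indexing limit is interpreted as 10 * 1024 * 1024 bytes.
MAX_INDEXABLE_BYTES = 10 * 1024 * 1024


def is_classifiable(size_bytes: int, solr_indexed: bool, metadata_sets: int) -> bool:
    """Return True only when all three requirements in the list are met."""
    return (
        solr_indexed                            # Solr is configured and storage is indexed
        and metadata_sets >= 1                  # at least one metadata set exists
        and size_bytes <= MAX_INDEXABLE_BYTES   # file is small enough to be indexed
    )


if __name__ == "__main__":
    print(is_classifiable(2_000_000, solr_indexed=True, metadata_sets=1))
    print(is_classifiable(50_000_000, solr_indexed=True, metadata_sets=1))
```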