FileCloud is a leading enterprise file access, sync and share solution that can run on-premise or on any IaaS platform such as AWS or Azure. FileCloud not only integrates with cloud storage but also makes legacy storage available through its access framework. Additionally, FileCloud users can search for files and folders by name, type, date and size across any storage connected to the system. Furthermore, the v12 release (Q2 2016) extends this search capability with the ability to search files based on their content.
Under the hood, FileCloud uses Apache Solr for searching file contents. This blog post explains how FileCloud and Solr interact to make content search possible.
Apache Solr
Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.
Solr is a Java application that is packaged with all the components needed to run it as a standalone service or in a cluster.
FileCloud Integration with Apache Solr
FileCloud can be configured to use Solr as its content search service. Once FileCloud is configured with Solr, all uploaded documents in content-searchable formats (such as DOC, DOCX, XLS, XLSX, PPT, PPTX, PDF and TXT) are indexed by Solr.
Content search consists of two major functionalities:
Indexing
The following is the workflow that happens during indexing of incoming documents:
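As a rough illustration of the indexing side, the sketch below shows how a client could hand an uploaded document to Solr's extracting request handler (/update/extract), which parses rich formats such as PDF or DOCX and indexes the extracted text. This is not FileCloud's actual code: the base URL, the core name "filecloud", and the use of the file path as the document id are assumptions made for the example, and the handler has to be enabled in the Solr configuration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_index_request(solr_url, core, doc_id, data):
    """Build (but do not send) a POST request that asks Solr to extract
    and index the text of one document.

    solr_url, core and doc_id are illustrative values, not FileCloud settings.
    """
    params = urlencode({
        "literal.id": doc_id,  # stored as the document's unique id
        "commit": "true",      # make the document searchable right away
    })
    url = f"{solr_url.rstrip('/')}/{core}/update/extract?{params}"
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


# Hypothetical upload: a PDF stored under /alice/report.pdf
req = build_index_request(
    "http://localhost:8983/solr", "filecloud", "/alice/report.pdf", b"%PDF-1.4 ..."
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it to a running Solr instance. Committing on every upload keeps the example simple; a busy system would more likely batch commits.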
Querying
The following is the workflow that happens during querying of files:
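The querying side can be sketched in the same hedged way: a client sends the search text to Solr's /select handler and reads the matching document ids from the JSON response. The core name "filecloud" and the field name "content" are assumptions for the example; the real field depends on how the Solr schema maps extracted text.

```python
import json
from urllib.parse import urlencode


def build_query_url(solr_url, core, text, field="content", rows=10):
    """Build a Solr /select URL that searches `field` for an exact phrase.

    `core` and `field` are illustrative names, not FileCloud settings.
    """
    # Escape backslashes and quotes so the text stays inside one phrase query.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    params = urlencode({
        "q": f'{field}:"{escaped}"',
        "fl": "id",    # return only the document ids
        "rows": rows,  # maximum number of hits to return
        "wt": "json",
    })
    return f"{solr_url.rstrip('/')}/{core}/select?{params}"


def extract_ids(response_body):
    """Pull the document ids out of a Solr JSON response body."""
    payload = json.loads(response_body)
    return [doc["id"] for doc in payload["response"]["docs"]]


url = build_query_url("http://localhost:8983/solr", "filecloud", "quarterly report")
print(url)

# A hand-written response in the shape Solr returns, used here instead of a live server.
sample = '{"response": {"numFound": 1, "start": 0, "docs": [{"id": "/alice/report.pdf"}]}}'
print(extract_ids(sample))
```

An application would then map the returned ids back to its own file records and apply its access rules before showing results to the user.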