{"id":4746,"date":"2016-07-28T11:39:40","date_gmt":"2016-07-28T16:39:40","guid":{"rendered":"http:\/\/www.filecloud.com\/blog\/?p=4746"},"modified":"2021-09-19T01:55:05","modified_gmt":"2021-09-19T06:55:05","slug":"filecloud-content-search","status":"publish","type":"post","link":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/","title":{"rendered":"File Content Search using FileCloud &#8211; Technical Architecture"},"content":{"rendered":"<p>FileCloud is a leading enterprise file access, sync and share solution that provides the flexibility to run on-premises or on any IaaS platform such as AWS or Azure. FileCloud not only integrates with cloud storage but also makes legacy storage available through its access framework. Additionally, using FileCloud, users can search for files\/folders by name, type, date and size across any storage connected to the system. Furthermore, in the v12 release (Q2 2016), FileCloud extended its search capability with the ability to search files by their content.<\/p>\n<p>Under the hood, FileCloud uses Apache Solr for searching file contents. This post explains how FileCloud and Solr interact to make content search possible.<\/p>\n<p><u>Apache Solr<\/u><\/p>\n<p>Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world\u2019s largest internet sites.<\/p>\n<p>Solr is a Java application that is packaged with all the components needed to run it as a standalone service or in a cluster.<\/p>\n<p><u>FileCloud Integration with Apache Solr<\/u><\/p>\n<p>FileCloud can be configured to use Solr as a service for content search. 
Once FileCloud is configured with Solr, all uploaded documents in content-searchable formats (such as DOC, DOCX, XLS, XLSX, PPT, PPTX, PDF and TXT) are indexed by Solr.<\/p>\n<p>Content search consists of two major functions:\n<\/p>\n<ul>\n<li>Indexing documents and storing the indexed information.<\/li>\n<li>Querying the index to retrieve files that match the query criteria.<\/li>\n<\/ul>\n<p><u>Indexing<\/u><\/p>\n<p>The following workflow takes place when incoming documents are indexed:<\/p>\n<ol>\n<li>A user uploads a file from any of the client access points, such as a web browser, FileCloud Sync, FileCloud Drive, mobile apps or a WebDAV client.<\/li>\n<li>The FileCloud server checks whether the incoming document is parsable, that is, a Word document, spreadsheet, presentation, plain text file or PDF file.<\/li>\n<li>If the document is not parsable, it is processed and stored in FileCloud. A copy of the document metadata is sent to Apache Solr, where it is stored in the metadata index. The metadata sent to Solr includes file attributes such as name, path, size and creation date. Storing this information in Solr enables FileCloud to retrieve files based on these attributes in addition to content.<\/li>\n<li>If the document is parsable, then in addition to being processed and stored in FileCloud, a copy of the document is uploaded to the Apache Solr server. 
The document metadata is also sent to Solr.<\/li>\n<li>When a parsable document is uploaded to Solr, Solr parses the document, extracts its text content and stores the information in the content index.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-4747\" src=\"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/07\/FileCloud-Content-Search-Indexing1.png\" alt=\"FileCloud Content Search - Indexing\" width=\"1413\" height=\"915\"><\/p>\n<p><u>Querying<\/u><\/p>\n<p>The following workflow takes place when files are queried:<\/p>\n<ol>\n<li>A client, such as the web UI, sends search parameters to the FileCloud server. The parameters can include a text string to search for inside files, file attributes such as name, path and size, or both.<\/li>\n<li>The FileCloud server parses the incoming parameters and converts them to a query format that Solr can understand.<\/li>\n<li>The resulting query is then submitted to Solr.<\/li>\n<li>Solr executes the query against the metadata and content indexes. The results are collected and sent back to the FileCloud server.<\/li>\n<li>The FileCloud server converts the search results into an XML format that FileCloud clients understand and sends the results back to the client.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-4748\" src=\"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/07\/FileCloud-Content-Search-Querying1.png\" alt=\"FileCloud Content Search - Querying\" width=\"1416\" height=\"924\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>FileCloud is a leading enterprise file access, sync and share solution that provides the flexibility to run on-premises or on any IaaS platform such as AWS or Azure. FileCloud not only integrates with cloud storage but also makes legacy storage available through its access framework. 
\u00a0Additionally, using FileCloud, users can search for files\/folders by name, type, date and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v20.13 (Yoast SEO v20.13) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>File Content Search using FileCloud - Technical Architecture - FileCloud blog<\/title>\n<meta name=\"description\" content=\"FileCloud uses Apache Solr for searching file contents. This post explains interactions between FileCloud and Solr to make content search possible.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"File Content Search using FileCloud - Technical Architecture\" \/>\n<meta property=\"og:description\" content=\"FileCloud uses Apache Solr for searching file contents. 
This post explains interactions between FileCloud and Solr to make content search possible.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\" \/>\n<meta property=\"og:site_name\" content=\"FileCloud blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/tonidopage\" \/>\n<meta property=\"article:published_time\" content=\"2016-07-28T16:39:40+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-09-19T06:55:05+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/07\/FileCloud-Content-Search-Indexing1.png\" \/>\n<meta name=\"author\" content=\"Team FileCloud\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@getfilecloud\" \/>\n<meta name=\"twitter:site\" content=\"@getfilecloud\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Team FileCloud\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\"},\"author\":{\"name\":\"Team FileCloud\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/#\/schema\/person\/8a8df071f564aa2c10fa07d6ce60c935\"},\"headline\":\"File Content Search using FileCloud &#8211; Technical Architecture\",\"datePublished\":\"2016-07-28T16:39:40+00:00\",\"dateModified\":\"2021-09-19T06:55:05+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\"},\"wordCount\":555,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/#organization\"},\"articleSection\":[\"Enterprise File Sharing\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\",\"url\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\",\"name\":\"File Content Search using FileCloud - Technical Architecture - FileCloud blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/#website\"},\"datePublished\":\"2016-07-28T16:39:40+00:00\",\"dateModified\":\"2021-09-19T06:55:05+00:00\",\"description\":\"FileCloud uses Apache Solr for searching file contents. 
This post explains interactions between FileCloud and Solr to make content search possible.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.filecloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"File Content Search using FileCloud &#8211; Technical Architecture\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/#website\",\"url\":\"https:\/\/www.filecloud.com\/blog\/\",\"name\":\"FileCloud blog\",\"description\":\"Topics on Private cloud, On-Premises, Self-Hosted, Enterprise File Sync and Sharing\",\"publisher\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.filecloud.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/#organization\",\"name\":\"FileCloud\",\"url\":\"https:\/\/www.filecloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/02\/filecloud_logo_comparison.jpg\",\"contentUrl\":\"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/02\/filecloud_logo_comparison.jpg\",\"width\":155,\"height\":40,\"caption\":\"FileCloud\"},\"image\":{\"@id\":\"https:\/\/www.filecloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/tonidopage\",\"https:\/\/twitter.com\/getfilecloud\",\"https:\/\/www.linkedin.com\/company\/codelathe\",\"https:\/\/www.pinterest.com\/filecloud\/filecloud\/\",\"https:\/\/www.youtube.com\/channel\/UCbU5gTFdNCPESA5aGipFW6g\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/#\/schema\/person\/8a8df071f564aa2c10fa07d6ce60c935\",\"name\":\"Team FileCloud\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.filecloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/b5818ab931b69298f500d8a184fd2384?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/b5818ab931b69298f500d8a184fd2384?s=96&d=mm&r=g\",\"caption\":\"Team FileCloud\"},\"sameAs\":[\"http:\/\/www.filecloud.com\"],\"url\":\"https:\/\/www.filecloud.com\/blog\/author\/admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"File Content Search using FileCloud - Technical Architecture - FileCloud blog","description":"FileCloud uses Apache Solr for searching file contents. 
This post explains interactions between FileCloud and Solr to make content search possible.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/","og_locale":"en_US","og_type":"article","og_title":"File Content Search using FileCloud - Technical Architecture","og_description":"FileCloud uses Apache Solr for searching file contents. This post explains interactions between FileCloud and Solr to make content search possible.","og_url":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/","og_site_name":"FileCloud blog","article_publisher":"https:\/\/www.facebook.com\/tonidopage","article_published_time":"2016-07-28T16:39:40+00:00","article_modified_time":"2021-09-19T06:55:05+00:00","og_image":[{"url":"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/07\/FileCloud-Content-Search-Indexing1.png"}],"author":"Team FileCloud","twitter_card":"summary_large_image","twitter_creator":"@getfilecloud","twitter_site":"@getfilecloud","twitter_misc":{"Written by":"Team FileCloud","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#article","isPartOf":{"@id":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/"},"author":{"name":"Team FileCloud","@id":"https:\/\/www.filecloud.com\/blog\/#\/schema\/person\/8a8df071f564aa2c10fa07d6ce60c935"},"headline":"File Content Search using FileCloud &#8211; Technical Architecture","datePublished":"2016-07-28T16:39:40+00:00","dateModified":"2021-09-19T06:55:05+00:00","mainEntityOfPage":{"@id":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/"},"wordCount":555,"commentCount":0,"publisher":{"@id":"https:\/\/www.filecloud.com\/blog\/#organization"},"articleSection":["Enterprise File Sharing"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/","url":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/","name":"File Content Search using FileCloud - Technical Architecture - FileCloud blog","isPartOf":{"@id":"https:\/\/www.filecloud.com\/blog\/#website"},"datePublished":"2016-07-28T16:39:40+00:00","dateModified":"2021-09-19T06:55:05+00:00","description":"FileCloud uses Apache Solr for searching file contents. 
This post explains interactions between FileCloud and Solr to make content search possible.","breadcrumb":{"@id":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.filecloud.com\/blog\/filecloud-content-search\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.filecloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"File Content Search using FileCloud &#8211; Technical Architecture"}]},{"@type":"WebSite","@id":"https:\/\/www.filecloud.com\/blog\/#website","url":"https:\/\/www.filecloud.com\/blog\/","name":"FileCloud blog","description":"Topics on Private cloud, On-Premises, Self-Hosted, Enterprise File Sync and Sharing","publisher":{"@id":"https:\/\/www.filecloud.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.filecloud.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.filecloud.com\/blog\/#organization","name":"FileCloud","url":"https:\/\/www.filecloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.filecloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/02\/filecloud_logo_comparison.jpg","contentUrl":"https:\/\/www.filecloud.com\/blog\/wp-content\/uploads\/2016\/02\/filecloud_logo_comparison.jpg","width":155,"height":40,"caption":"FileCloud"},"image":{"@id":"https:\/\/www.filecloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/tonidopage","https:\/\/twitter.com\/getfilecloud","https:\/\/www.linkedin.com\/company\/codelathe","https:\/\/www.pinterest.com\/filecloud\/filecloud\/","https:\/\/www.youtube.com\/channel\/UCbU5gTFdNCPESA5aGipFW6g"]},{"@type":"Person","@id":"https:\/\/www.filecloud.com\/blog\/#\/schema\/person\/8a8df071f564aa2c10fa07d6ce60c935","name":"Team FileCloud","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.filecloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/b5818ab931b69298f500d8a184fd2384?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/b5818ab931b69298f500d8a184fd2384?s=96&d=mm&r=g","caption":"Team 
FileCloud"},"sameAs":["http:\/\/www.filecloud.com"],"url":"https:\/\/www.filecloud.com\/blog\/author\/admin\/"}]}},"_links":{"self":[{"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/posts\/4746"}],"collection":[{"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/comments?post=4746"}],"version-history":[{"count":12,"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/posts\/4746\/revisions"}],"predecessor-version":[{"id":32421,"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/posts\/4746\/revisions\/32421"}],"wp:attachment":[{"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/media?parent=4746"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/categories?post=4746"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.filecloud.com\/blog\/wp-json\/wp\/v2\/tags?post=4746"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}