The way that Episerver Find treats attachments is not ideal in every scenario. If you are a large company with lots of documents in many languages, you might notice that searching for a document does not always return the most relevant hits. This is due to how Find handles indexing of attachments.
When indexing an attachment, the file is sent to Find as a base64-encoded string. The string is parsed by Apache Tika, and the resulting text is indexed using the standard language analyzer. This approach creates several issues.
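To get a feel for what the base64 step means for payload size, here is a small Python sketch. It is only an illustration of base64 encoding in general, not Find's client code; the function name `base64_payload_size` and the 300 kB stand-in document are assumptions made up for the example.

```python
import base64


def base64_payload_size(raw_size: int) -> int:
    """Size in bytes of the standard padded base64 encoding of raw_size bytes.

    Hypothetical helper for illustration: every 3 input bytes become
    4 output characters, and the last group is padded to 4 characters.
    """
    return 4 * ((raw_size + 2) // 3)


# Stand-in for a binary attachment (300,000 zero bytes), not a real document.
document = bytes(300_000)
encoded = base64.b64encode(document)

# The encoded string is about a third larger than the file itself.
print(len(document), len(encoded))
```

Whatever the file contains, the encoded string is roughly a third larger than the original, and the whole file has to travel over the wire even when only a small part of it is searchable text.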
To solve these issues in one go, the Attachment Helper interface was created. This interface lets the developer decide how to handle attachments. Out of the box, there is an implementation created by Episerver that uses the IFilter features built into Windows. This version supports a wide range of file types and is easy to get started with.
Install the NuGet package and the IFilters that suit your needs and, suddenly, your attachment search experience is vastly improved. You might notice that network traffic is reduced when running the index job, your searches provide more relevant hits, and the Find administrator can view the document content from inside the Find admin UI.
For more details on the search attachment filter, check out the docs.