Pages moved to Trash not deleted from the search index

Hi

Episerver Find search results contain items that are already deleted (i.e. pages in the recycle bin).

My query is

var articles = _client.Search()
    .CurrentlyPublished()
    .ExcludeDeleted()
    .Skip(currentPageNo * pageSize)
    .Take(pageSize)
    .GetContentResult();

When I add a new article page, it is updated in Episerver Find, but when I delete a page, the TotalMatching count stays the same as before the delete.

Jaanna

#185583 Edited, Nov 25, 2017 17:41
  • Hi,

    Is it only freshly deleted pages that show in search results?

    GetContentResult applies some caching automatically; however, this should be cleared "whenever Episerver content is saved or deleted".

    You could try testing with GetResult() to see if that works just to narrow your investigation.
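
    A minimal sketch of that test, assuming a hypothetical page type ArticlePage and the same _client (an IClient) as in the question. It projects to the page name so that GetResult() hands back plain strings rather than full page objects (an assumption made here to keep deserialization out of the picture) and reads TotalMatching from the response:

    ```csharp
    using EPiServer.Find;
    using EPiServer.Find.Cms;

    public class SearchCacheCheck
    {
        private readonly IClient _client;

        public SearchCacheCheck(IClient client)
        {
            _client = client;
        }

        // ArticlePage is a hypothetical page type standing in for the real one.
        public int CountMatchesWithGetResult()
        {
            var result = _client.Search<ArticlePage>()
                .CurrentlyPublished()
                .ExcludeDeleted()
                .Take(1)                 // only the total is of interest here
                .Select(x => x.Name)     // project to a string instead of loading content
                .GetResult();            // GetResult() instead of GetContentResult()

            return result.TotalMatching;
        }
    }
    ```

    If this count drops after a delete while the GetContentResult() count does not, caching is the likely cause; if both stay the same, the deleted page is probably still in the index.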

    /Jake

    #185694 Nov 30, 2017 2:24
  • Member since: 2008

    (Late reply so this is probably solved by now?)

    Hi

    In my experience, Find still keeps pages that are in the trash bin in its index. You can get around this by using FilterForVisitor() in the query.
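
    A minimal sketch of the original query with FilterForVisitor() added, assuming a hypothetical page type ArticlePage and paging parameters like the ones in the question. As far as I know, FilterForVisitor() also filters on publish status and read access, so treat it as a replacement for the separate filters only after checking that against your Find version:

    ```csharp
    using EPiServer.Find;
    using EPiServer.Find.Cms;

    public class ArticleSearch
    {
        private readonly IClient _client;

        public ArticleSearch(IClient client)
        {
            _client = client;
        }

        // ArticlePage is a hypothetical page type; currentPageNo is zero-based.
        public IContentResult<ArticlePage> GetArticles(int currentPageNo, int pageSize)
        {
            return _client.Search<ArticlePage>()
                .FilterForVisitor()              // hide content a visitor should not see
                .Skip(currentPageNo * pageSize)
                .Take(pageSize)
                .GetContentResult();
        }
    }
    ```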

    The documentation around this issue varies a lot from version to version. In some cases the recommendation is to add an indexing convention that excludes pages in the trash bin, but to my knowledge that only prevents the page from ending up in the index during a full reindex into a clean index.
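
    A minimal sketch of such an indexing convention, assuming it is registered from an initialization module (the module name here is made up) and that checking IsDeleted is enough to identify pages in the trash bin:

    ```csharp
    using EPiServer.Core;
    using EPiServer.Find.Cms;
    using EPiServer.Framework;
    using EPiServer.Framework.Initialization;

    // Hypothetical module name; the dependency makes sure CMS is initialized first.
    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class TrashIndexingConventions : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            // Tell the indexer to skip pages that are flagged as deleted.
            ContentIndexer.Instance.Conventions
                .ForInstancesOf<PageData>()
                .ShouldIndex(page => !page.IsDeleted);
        }

        public void Uninitialize(InitializationEngine context)
        {
        }
    }
    ```

    Note that a convention like this only decides what gets indexed; it does not by itself remove documents that are already in the index.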

    Early on, the documentation stated that pages moved to trash were removed from the index, but I have never seen that happen in any Find project I have worked on.

    /Torbjörn

    #185903 Dec 06, 2017 10:29
  • We are actually experiencing the exact same thing and have gotten to the point where we have put both .ExcludeDeleted() and .FilterForVisitor() in our queries, but we are still getting back items that are either in the trash bin or have been removed from the trash bin.

    #185941 Dec 06, 2017 20:50
  • Member since: 2006

    This is how I solved the same issue.
    First take a look here: https://world.episerver.com/blogs/Henrik-Fransas/Dates/2015/5/adding-episerver-find-to-alloy---part-2/

    Then change the ShouldIndexPageData method to something like this:

    private bool ShouldIndexPageData(SolutionPageData page)
    {
        var wastedContent = ServiceLocator.Current.GetInstance<IContentLoader>().GetDescendents(ContentReference.WasteBasket).ToList();

        // Check if the page is published, not marked as disable indexing, etc.
        var shouldIndex = page.CheckPublishedStatus(PagePublishedStatus.Published)
                          && wastedContent.All(c => c.ID != page.PageLink.ID) // content in wastebasket should not be indexed
                          && !page.DisableIndexing;

        // The page should not be indexed, but in some scenarios it might already be indexed, so try to delete it.
        if (!shouldIndex)
        {
            ContentIndexer.Instance.TryDelete(page, out var result);
        }

        return shouldIndex;
    }

    Disclaimer: the code above is only part of the full code and might not work as is.

    #185952 Dec 07, 2017 9:23
  • Member since: 2014

    @Erik I solved it in similar fashion as you did.

    However, is there a reason you're using the ContentLoader to get all the content in the waste bin instead of checking page.IsDeleted?
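
    A minimal sketch of that simplification, reusing the names from the earlier post (SolutionPageData and its DisableIndexing property are that poster's own types, and the wrapper class name is made up). It avoids loading every descendant of the waste basket on each call:

    ```csharp
    using EPiServer.Core;
    using EPiServer.Find.Cms;

    // Hypothetical wrapper class; only the method body differs from the earlier post.
    public class PageIndexingRules
    {
        public bool ShouldIndexPageData(SolutionPageData page)
        {
            // Published, not in the trash bin, and not opted out of indexing.
            var shouldIndex = page.CheckPublishedStatus(PagePublishedStatus.Published)
                              && !page.IsDeleted
                              && !page.DisableIndexing;

            // The page may already be in the index, so try to remove it.
            if (!shouldIndex)
            {
                ContentIndexer.Instance.TryDelete(page, out var result);
            }

            return shouldIndex;
        }
    }
    ```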

    #186238 Dec 14, 2017 12:25
  • Member since: 2006

    Thanks Peter, now I learned something new. Sometimes you don't see the obvious solutions :-)

    #186239 Dec 14, 2017 12:53
  • For us, this came up while we were dealing with thousands of Find-related events being fired this past summer and our queues were filling up, which we eventually found out was caused by a completely different issue. In the process of troubleshooting we set <episerver.find.cms disableEventedIndexing="true"/> on our authoring server. So, unfortunately, we caused the issue ourselves by having this set.

    #186249 Dec 14, 2017 15:19