Rebuild the index for selected sites in Episerver Find | Admin Tool

Hi

Last Friday, I wrote a blog post, "Reindex a target site in Find", that used a scheduled job. It works well, but you need to update the site definition every time you want to rebuild the index for a different site.

I received feedback suggesting that I convert it to an Episerver admin tool, so I have done that. With the admin tool, you can rebuild the indexes for selected sites.

Here is the final structure of my solution.

Episerver provides a template for creating a new GUI plugin with Web Forms, but not with MVC, so you need to create it manually. The steps for creating a GUI plugin using MVC are below.

FYI - You can refer to this blog post to create a custom GUI plugin using MVC.

Create a Controller

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Web.Mvc;
using EPiServer.Find.Cms;
using EPiServer.Find.Helpers.Text;
using EPiServer.PlugIn;
using EPiServer.ServiceLocation;
using EPiServer.Web;
using ReindexTargetSite_AdminTool.AdminTools.FindIndexPlugin.ViewModels;

namespace ReindexTargetSite_AdminTool.AdminTools.FindIndexPlugin
{
    [GuiPlugIn(
        Area = PlugInArea.AdminMenu,
        Url = "/custom-plugins/my-plugin",
        DisplayName = "Rebuild Find Index")]
    [Authorize(Roles = "CmsAdmins")]
    public class RebuildFindIndexController : Controller
    {
        public static string Message { get; set; }
        public static string ExecutionCompleteMessage { get; set; }

        private ISiteDefinitionRepository _siteDefinitionRepository;
        public RebuildFindIndexController(ISiteDefinitionRepository siteDefinitionRepository)
        {
            _siteDefinitionRepository = siteDefinitionRepository ?? ServiceLocator.Current.GetInstance<ISiteDefinitionRepository>();

        }
        public ActionResult Index()
        {
            var siteDefinitions = _siteDefinitionRepository.List();
            var siteList = new List<SiteDefinition>();
            if (siteDefinitions.Any())
            {

                foreach (var site in siteDefinitions)
                {
                    siteList.Add(site);
                }

            }

            var model = new RebuildFindIndexViewModel
            {
                Sites = siteList
            };
            return View("~/AdminTools/FindIndexPlugin/Views/Index.cshtml", model);
        }

        [HttpPost]
        public async Task<ActionResult> InitiateRebuildIndex(Guid[] selectedObjects)
        {
            Message = null;
            ExecutionCompleteMessage = null;

            string selectedSite = Request.Form["SelectedSite"];

            _ = Task.Run(() => StartRebuild(selectedObjects));
            return View("~/AdminTools/FindIndexPlugin/Views/Index.cshtml");
        }

        private void StartRebuild(Guid[] selectedSite)
        {
            foreach (var site in selectedSite)
            {
                SiteDefinition.Current = _siteDefinitionRepository.List().FirstOrDefault(i => i.Id.Equals(site));

                if (SiteDefinition.Current != null && !string.IsNullOrEmpty(SiteDefinition.Current.Name))
                {
                    var statusReport = new StringBuilder();

                    // ReIndex the indexes for the sites
                    ContentIndexer.ReIndexResult reIndexResult = ContentIndexer.Instance.ReIndex(
                        status =>
                        {
                            if (status.IsError)
                            {
                                string errorMessage = status.Message.StripHtml();
                                if (errorMessage.Length > 0)
                                    statusReport.Append($"{errorMessage}");
                            }

                            Message =
                                $"Indexing job [{(SiteDefinition.Current.Name)}] [content]: {status.Message.StripHtml()}";
                        },
                        () => false);
                }
            }

            ExecutionCompleteMessage = Message;
        }
        [HttpGet]
        public ActionResult GetMessage()
        {
            return Json(new { RunningMessage = Message, StopExecution = ExecutionCompleteMessage }, JsonRequestBehavior.AllowGet);
        }
    }
}

Create a ViewModel

using System;
using System.Collections.Generic;
using EPiServer.Web;

namespace ReindexTargetSite_AdminTool.AdminTools.FindIndexPlugin.ViewModels
{
    public class RebuildFindIndexViewModel
    {
        public IEnumerable<Guid> SelectedSites { get; set; }
        public IEnumerable<SiteDefinition> Sites { get; set; }
    }
}

Create a View

@using System.Web.Mvc
@using System.Web.Mvc.Html
@inherits System.Web.Mvc.WebViewPage<ReindexTargetSite_AdminTool.AdminTools.FindIndexPlugin.ViewModels.RebuildFindIndexViewModel>
@{
    Layout = null;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script type="text/javascript">
    var messageInterval = setInterval(function () {
        $.get('/custom-plugins/my-plugin/get-message').done(function (result) {
            $(".job-started").show();
            if (result.StopExecution == null) {
                $('#runningStatus').html(result.RunningMessage);

            } else {
                $('#runningStatus').html(result.StopExecution);
                clearInterval(messageInterval);
                $(".job-started").html("Index rebuild successfully");
                $("#runningStatus").hide();
            }
        });
    }, 10000)
</script>
@if (Model != null && Model.Sites != null && Model.Sites.Any())
{
    <h2>Site Listing</h2>

    using (Html.BeginForm("InitiateRebuildIndex", "RebuildFindIndex", FormMethod.Post))
    {

        foreach (var site in Model.Sites)
        {
            @* Wrapping the checkbox in its label makes the site name clickable without needing a unique id per checkbox. *@
            <label><input type="checkbox" title="@site.Name" name="selectedObjects" value="@site.Id" /> @site.Name</label>
            <br />

        }
        @*@Html.DropDownList("SelectedSite", new SelectList(Model.Sites, "Value", "Text"))*@
        <input type="submit" value="Rebuild" />
    }
}
else
{
    <h2 class="job-started">Schedule job started</h2>
    <div id="runningStatus"></div>
}

Create an initialization Module 

using System.Web.Mvc;
using System.Web.Routing;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;

namespace ReindexTargetSite_AdminTool.AdminTools.FindIndexPlugin.Initialization
{
    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class PluginRouteInitialization : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            RouteTable.Routes.MapRoute(
                null,
                "custom-plugins/my-plugin",
                new { controller = "RebuildFindIndex", action = "Index" });
            RouteTable.Routes.MapRoute(
                null,
                "custom-plugins/my-plugin/initiate-rebuild-index",
                new { controller = "RebuildFindIndex", action = "InitiateRebuildIndex" });
            RouteTable.Routes.MapRoute(
                null,
                "custom-plugins/my-plugin/get-message",
                new { controller = "RebuildFindIndex", action = "GetMessage" });
        }

        public void Uninitialize(InitializationEngine context)
        {
            //Add uninitialization logic
        }
    }
}

That's it from a code point of view. Now you just need to log in to your Episerver site, go to the Admin view, and select the new plugin "Rebuild Find Index".

Now select the sites and click on the "Rebuild" button.

It will rebuild the indexes for the selected sites.

FYI - I will add the link once I upload this to GitHub.

I hope it helps

Thanks

Ravindra S. Rathore

Sep 22, 2019

Ha Bui
( By Ha Bui, 9/23/2019 5:11:58 AM)

Nice! Just a small concern about timeouts: when there is a lot of content, indexing may take a few hours.

Could we combine the admin tool and a job to resolve it?

// Ha Bui

Manoj Kumawat
( By Manoj Kumawat, 9/23/2019 5:53:29 AM)

Excellent work done @Ravindra!

@Ha Bui, would an async call to the job still time out? It should run in parallel in the background, since the job is started asynchronously.

Just curious

Son Do
( By Son Do, 9/23/2019 6:41:35 AM)

Next nice post @Ravindra :)

I agree with @Manoj: the work already runs in an async task, so it should be fine :)

Ha Bui
( By Ha Bui, 9/23/2019 8:07:41 AM)

Ah ok, sorry @Ravindra I missed your async controller.

But we already have EPiServer.Scheduler.IScheduledJobExecutor, and you already have both the admin tool and the job. A job is very suitable for this situation, right?

The admin tool is the UI for the job, where you set the full options (multiple sites).

The job is the worker, running on demand.

FYI!

@Manoj

@SonDo

( 9/23/2019 9:49:05 AM)

I also agree with Ha on this. If you're going to be handling updates on many different sites, you'd be better off moving this to a job, the same as the standard indexing job. This keeps the code off the main UI thread, and on the DXC, if your scheduler is separated out from the website (as recommended for large sites), it's far better for performance and resource usage.

Also you can make your job

  • Stoppable
  • Resumable
  • Give updates to the user on progress properly (as it's designed to do). I'm sorry, but personally I'm not keen on a polling jQuery script; if you don't move it, you should at least throw SignalR in so it's not polling.

Manoj Kumawat
( By Manoj Kumawat, 9/23/2019 11:33:38 AM)

I am a little bit concerned about the use of a scheduled job here, and curious at the same time.

How would you tell a scheduled job to index a particular site node, especially in a multi-site environment? Maybe I'm not aware of recent updates, but to my knowledge you cannot pass any arguments to a scheduled job.

@Scott, how do you tell a job to index selected websites only? Please share your thoughts.

Best regards

( 9/23/2019 11:54:20 AM)

I'm not sure why you are concerned, the whole purpose of jobs is to execute long running pieces of code or code that's running updates. This is exactly why most of Episerver is written to do just that.

We have numerous jobs that augment the Episerver Find Index on our commerce builds. 

You're right, out of the box jobs do not have their own configuration, but they can still read configuration and settings. I personally would keep the admin tool as the place where you set up which sites you want indexed, save that configuration to the database (Entity Framework or DDS), and then make the job execute the code using the saved configuration.
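A minimal sketch of that split, in plain C# so it runs without Episerver. Everything here is hypothetical and for illustration only: `ISiteSelectionStore`, `InMemorySiteSelectionStore`, `RebuildSelectedSitesJob` and the `reindexSite` callback are not Episerver APIs. In a real solution the store would be backed by DDS or Entity Framework, and `Execute` would be the body of a scheduled job.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction over wherever the selection is persisted
// (DDS, Entity Framework, ...). This is NOT an Episerver API.
public interface ISiteSelectionStore
{
    void Save(IEnumerable<Guid> siteIds);
    IReadOnlyList<Guid> Load();
}

// In-memory stand-in so the sketch runs on its own;
// a real store would write to the database instead.
public sealed class InMemorySiteSelectionStore : ISiteSelectionStore
{
    private readonly object _gate = new object();
    private List<Guid> _siteIds = new List<Guid>();

    public void Save(IEnumerable<Guid> siteIds)
    {
        // Drop duplicates but keep the order in which sites were selected.
        var copy = siteIds.Distinct().ToList();
        lock (_gate) { _siteIds = copy; }
    }

    public IReadOnlyList<Guid> Load()
    {
        // Return a copy so callers cannot mutate the stored selection.
        lock (_gate) { return _siteIds.ToList(); }
    }
}

// Stand-in for the scheduled job body: it takes no arguments
// and reads the saved selection when it runs.
public sealed class RebuildSelectedSitesJob
{
    private readonly ISiteSelectionStore _store;
    private readonly Action<Guid> _reindexSite; // hypothetical: reindexes one site by id

    public RebuildSelectedSitesJob(ISiteSelectionStore store, Action<Guid> reindexSite)
    {
        _store = store ?? throw new ArgumentNullException(nameof(store));
        _reindexSite = reindexSite ?? throw new ArgumentNullException(nameof(reindexSite));
    }

    public string Execute()
    {
        var siteIds = _store.Load();
        foreach (var id in siteIds)
        {
            _reindexSite(id);
        }
        return $"Reindexed {siteIds.Count} site(s).";
    }
}
```

With this shape, the admin tool only calls `Save` with the ticked site ids, and the job can be started manually or on a schedule without needing arguments.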

You should never have any admin jobs running long-running processes on the main UI thread; this is a general design pattern for ASP.NET. Also, Episerver has specific recommendations about how to set up the DXC for long-running processes by segmenting them into their own app services: https://world.episerver.com/blogs/Sergey-Vorushilo/Dates/2017/12/scheduled-jobs-setup-in-dxc-service/.

This also means that, for fault tolerance, if Find indexing calls have issues (as Find can have more often than ideal), you can have this set up as a regular job as well to run this code. Otherwise you either have to wait and re-use the admin tool or just fall back to the standard Find indexer.

Ravindra S. Rathore
( By Ravindra S. Rathore, 9/23/2019 3:28:25 PM)

Thanks, Manoj, Son Do, Ha Bui, and Scott for your feedback and comments.

I am running this job asynchronously, so it will not block the main thread and it will run in the background.

Initially, I wanted to go with the scheduled job approach, but for that I would have to save the selected site data somewhere, because a job does not accept arguments. So I decided not to go that way, but yes, we can do it that way as well.

Right now I am using this in a DXC-hosted environment and it is working perfectly. As you all know, it does not use the scheduler, so it will not break if you separate out your scheduler service.

Thanks again, all, for the feedback. It is always good to see new and different approaches to doing things.

Thanks

Ravindra S. Rathore

( 9/23/2019 3:43:55 PM)

What I was trying to get at was running the code on a separate thread.

Asynchronous operations are not multithreaded by nature, so I was worried about this running on the main thread. However, I see you're using Task.Run (part of threading), which runs the code on a separate thread, so that's fine.
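To illustrate the distinction, here is a small sketch in plain C# (the `ThreadingDemo` class and its method names are made up for this example). The first method shows that the synchronous part of an `async` method runs on the caller's thread. The second shows that work passed to `Task.Run` is executed on a thread-pool thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadingDemo
{
    // The synchronous part of an async method runs on whichever thread called it.
    // "async" on its own does not move work to another thread.
    public static async Task<bool> StartsOnCallerThread(int callerThreadId)
    {
        bool sameThread = Environment.CurrentManagedThreadId == callerThreadId;
        await Task.CompletedTask; // already completed, so execution just continues
        return sameThread;
    }

    // Task.Run queues the delegate to the thread pool,
    // so the delegate observes that it is on a pool thread.
    public static Task<bool> RunsOnThreadPool()
        => Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);
}
```

Calling `ThreadingDemo.StartsOnCallerThread(Environment.CurrentManagedThreadId)` and `ThreadingDemo.RunsOnThreadPool()` and reading both results should show `true` in each case. Note that in ASP.NET the thread pool is also what serves incoming requests, so the offloaded work still takes a pool thread for as long as it runs.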

I guess it's a design choice. I like to cleanly separate out any processes into jobs so you can start, stop, and restart them and have full visibility (using things like the scheduled job overview) of everything that's running. The point of the link about scheduler separation was that, if you ask Episerver to do that, you can move all of this code to run in a separate app pool, which can help with things such as Azure auto-heal policies and resource usage.

But if you're happy I'm happy.

Antti Alasvuo
( By Antti Alasvuo, 9/23/2019 7:36:11 PM)

Hi Ravindra, really nice to see people blogging and keeping the community active.

Scott and others have pointed out a few concerns, and I must point out a couple of concerns too about the current code.

  • the job "state" is kept in static strings, Message and ExecutionCompleteMessage
    • if another user who has access to this tool starts a new job, both jobs will change the static strings, which might lead to weird situations
    • also, when another user comes to the index view (HTTP GET), they will not know whether a job is already running
      • the same applies to the original user: if they leave the page that is currently polling, they can never return to check the state of the job
    • InitiateRebuildIndex sets the static strings to null, so if one job has finished and set ExecutionCompleteMessage on the server, but a new job starts before the poller reads it, the value is reset to null and the poller doesn't see that the first job completed - now it waits for the second job to complete
    • sharing state in static string members is really dangerous
  • unused local variable 'selectedSite' in method 'InitiateRebuildIndex'
  • a minor thing about the StringBuilder usage (statusReport): you use the default constructor, which initializes the capacity to 16 characters, and this leads to memory allocations every time the StringBuilder doesn't have enough space to append more characters
    • so it would be more efficient if you "pre-allocate" the assumed size
      • something like count of sites * the average length of the message (3 * 400 for example)
  • I would be very cautious when setting static properties like "SiteDefinition.Current", because you really can't know how it behaves and affects other requests/threads (without peeking into the actual implementation)
    • there is a simple fix for this
    • if you look at the IContentIndexer interface, the method you are using to reindex is marked obsolete (at least in version 11; I can't check the Find documentation, as there are no class libraries online for version 9 and above) and you should use the overload that takes a SiteDefinition
    • so use the overload that takes the SiteDefinition, and then there is no need for you to set "SiteDefinition.Current"
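One possible way to address the shared-state points above is to key the status by a job id rather than holding it in two static strings. The sketch below is plain C# and only an illustration: `JobStatus` and `JobStatusRegistry` are hypothetical names, and it assumes the controller would return the id from the POST action and accept it in the polling endpoint.

```csharp
using System;
using System.Collections.Concurrent;

// Immutable snapshot of one job's progress.
public sealed class JobStatus
{
    public JobStatus(string message, bool isComplete)
    {
        Message = message;
        IsComplete = isComplete;
    }

    public string Message { get; }
    public bool IsComplete { get; }
}

// Hypothetical registry: one entry per started job, safe to use from several threads.
public sealed class JobStatusRegistry
{
    private readonly ConcurrentDictionary<Guid, JobStatus> _jobs =
        new ConcurrentDictionary<Guid, JobStatus>();

    // Registers a new job and returns the id the client should poll with.
    public Guid Start()
    {
        var id = Guid.NewGuid();
        _jobs[id] = new JobStatus("Queued", false);
        return id;
    }

    // Records progress; a late progress message never overwrites a completed status.
    public void Report(Guid jobId, string message)
    {
        _jobs.AddOrUpdate(
            jobId,
            new JobStatus(message, false),
            (id, old) => old.IsComplete ? old : new JobStatus(message, false));
    }

    public void Complete(Guid jobId, string message)
    {
        _jobs[jobId] = new JobStatus(message, true);
    }

    public bool TryGet(Guid jobId, out JobStatus status)
    {
        return _jobs.TryGetValue(jobId, out status);
    }
}
```

Because each job has its own entry, starting a second job does not reset the first one, and a user who returns to the page later can still look up a job as long as they have its id. The registry lives in memory, so entries are lost when the application restarts.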

We could have a discussion about the "fire-and-forget" abuse with Task.Run (the code doesn't wait for or care about the returned task). Yes, it works, but an application pool thread is reserved anyway (taken away from your incoming requests). If there is a dedicated "edit" server/instance, as suggested for example in the DXC documentation for running scheduled jobs, I wouldn't see this as much of an issue, as long as everyone understands the usage. (As a side note, the underlying ReIndex implementation creates multiple tasks, uses Task.Factory.StartNew on those, and then waits for them to complete.)

Hangfire might be an interesting solution to use here instead to execute the jobs. Reliable and you could fetch the status of the "fire-and-forget" job using the id of the created job.

Ravindra S. Rathore
( By Ravindra S. Rathore, 9/24/2019 4:01:09 AM)

Thanks, @Antti and all,

I will try to implement the same functionality using Episerver scheduled jobs whenever I have time.

Thanks all for the feedback

Regards

Ravindra S. Rathore

Son Do
( By Son Do, 9/27/2019 2:42:51 AM)

As we already know, the indexing job is huge and heavy. An Episerver scheduled job is a possible way, but it still runs in the web context.

Actually, I would like to run the indexing job outside the web context: a console app, a Windows scheduled task, or Azure Functions. I have had that idea for a long time but haven't tried to execute it. I will try it when I have time :)

/Son Do
