Content Mirroring Configuration

Product version:

EPiServer CMS 4.62

Document version:

1.0

Document creation date:

07-06-2006

Document last saved:

04-10-2007

Purpose

This document contains a description of content mirroring and explains how to configure mirroring. The document also contains a test scenario and a detailed troubleshooting chapter.



Table of Contents

Prerequisites

Basic Concepts
    - Channel

Technical Description
    - Export/Mirroring at Initial Deployment
    - Mirroring Locking

Typical Mirroring Scenarios
    - Mirroring to Another EPiServer CMS 
    - Mirroring to HTML
    - Mirroring to XML

FAQ

Troubleshooting

Configuring and Testing Mirroring
    - Step 1 - Set Up Mirroring to Publish Content to a Remote EPiServer CMS Site
    - Step 2 - Update web.config on the Remote Web Site with SOAP Extension Information
    - Step 3 - Set Up Your Source Site as the Publishing Web Site
    - Test Scenario

Appendix
    - template.xsl
    - Mirroring Field Descriptions in Admin Mode



Prerequisites

The following is required to be able to configure mirroring correctly.

  • Both sites must run EPiServer CMS 4.50 or later.
  • Both sites must have page types with the same names. The page types should share at least the "MainBody" property so that there is a property to display externally.

It is also recommended that you have the following installed.

  • Service Pack 1 for Windows Server 2003. An issue described in the Microsoft Knowledge Base article at http://support.microsoft.com/kb/886461/en-us has caused problems in another mirroring installation and may affect yours as well.
  • Service Pack 1 for .NET Framework 1.1. SP1 fixes a bug that can cause OutOfMemoryException errors.

Basic Concepts

Channel

In EPiServer CMS, the content that should be mirrored from the Web site is defined in channels. You can have multiple channels defined in one EPiServer CMS site. Each channel contains properties that define the pages that should be included in the channel. A channel does not know or care where the pages are delivered; it only makes sure that the underlying publisher receives information about the changes according to the channel properties.

Note  It is important to understand that the channel mirrors what it can see, not the actual pages. This means that access rights, filters, publish dates, etc. can be used to obtain a customized view of data.

Channel Type

One of the major properties of a channel is the channel type. There are three types of channel to choose from:

  • Tree - The tree model finds all changes made to a tree (move, delete, update, create). All these changes will be sent to all destinations. The tree model always finds differences by starting at the root page and recursing the tree. This means that a sub-page will never be included if the parent is not included. If you exclude a page (by using a filter), all sub-pages will also be excluded. This type is primarily used to be able to mirror a tree structure where the target will be a replica tree structure of the source.
  • List - The list model works exactly the same way as a tree model with the difference that it won’t recurse after the first level.
  • Search - The search model will only find changed (or "marked as changed") or new pages. Delete and move operations are not intercepted. This model is specifically designed to be able to mirror pages regardless of location to a single destination/list. This is useful for scenarios when news should be exported to another system (news service or another global news list in another EPiServer CMS system).
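The difference between the three channel types can be sketched in Python. This is an illustration only, not EPiServer code: the Page class, the select_pages function and the changed flag are hypothetical names, and it is an assumption that the start page is included and that a filter is applied per page in the search model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    name: str
    changed: bool = False
    children: List["Page"] = field(default_factory=list)


def select_pages(root, channel_type, include=lambda page: True):
    """Return the names of the pages a channel of the given type would look at."""
    if channel_type == "search":
        # Search: every changed page, regardless of where it is located.
        found, stack = [], [root]
        while stack:
            page = stack.pop()
            if page.changed and include(page):
                found.append(page.name)
            stack.extend(reversed(page.children))
        return found
    # Tree and List start at the root; an excluded page hides its sub-pages.
    if not include(root):
        return []
    found = [root.name]
    for child in root.children:
        if channel_type == "tree":
            found.extend(select_pages(child, "tree", include))
        elif include(child):
            found.append(child.name)  # list: stop after the first level
    return found
```

With a start page that has two children, one of which has a child of its own, "tree" returns all four pages, "list" returns the start page and its two children, and "search" returns only the pages flagged as changed.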

Scheduled Mirroring

New or modified content can be released either manually or automatically. This is defined in the channel properties. Content is manually released by the editor from the Action window.

The Mirroring service in Admin mode can be used to set up a scheduled job, so that content is automatically mirrored at, for example, a set time every day or week.

When changes are approved, they are handed over to the destinations defined on the channel. The changes are queued in the database for each destination.

Technical Description

On the sender side, a "state" is kept that records which pages have been sent to a given destination. Whenever the mirroring job is run, the current site structure is extracted and then compared with the recorded "state" of the receiver site. This comparison leads to a number of operations being queued, such as "publish", "move" and "delete". After the comparison and operations queuing, the recorded "state" of the receiver site is updated to reflect the updated situation.
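The comparison on the sender side can be illustrated with a small Python sketch. It is not the actual EPiServer implementation: the diff_state function and the representation of a "state" as a dictionary of page ID to (parent ID, version) are assumptions made for the example.

```python
def diff_state(recorded, current):
    """Compare the recorded receiver "state" with the current site structure.

    Both arguments map page ID -> (parent ID, version). Returns the list of
    queued operations and the new "state" to record for the next run.
    """
    operations = []
    for page_id, (parent, version) in current.items():
        if page_id not in recorded:
            operations.append(("publish", page_id))  # new page
            continue
        old_parent, old_version = recorded[page_id]
        if parent != old_parent:
            operations.append(("move", page_id))
        if version != old_version:
            operations.append(("publish", page_id))  # updated page
    for page_id in recorded:
        if page_id not in current:
            operations.append(("delete", page_id))
    return operations, dict(current)
```

Calling diff_state with an empty recorded state queues a "publish" for every page, which corresponds to an initial mirroring or to a run after the state has been reset.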

The receiver side keeps a "mapping table", which records the mapping between the sender page IDs and the receiver site page IDs. So, when the receiver side receives a request from the sender to "publish", "move" or "delete" a page, it uses that sender page ID to look up the corresponding local page ID - mapping it in other words. This "mapping table" is built on-the-fly as mirroring requests are received by the Web service.
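A minimal Python sketch of such a mapping table follows. The Receiver class, its handle method and the local ID counter are hypothetical; "move" is left out for brevity, and removing the mapping entry on "delete" is an assumption of the sketch.

```python
class Receiver:
    """Receiver side: maps sender page IDs to local page IDs on the fly."""

    def __init__(self):
        self.mapping = {}     # sender page ID -> local page ID
        self.pages = {}       # local page ID -> page content
        self._next_id = 100   # hypothetical local ID counter

    def handle(self, operation, sender_id, content=None):
        local_id = self.mapping.get(sender_id)
        if operation == "publish":
            if local_id is None:
                # First time this sender page is seen: create it and remember it.
                local_id = self._next_id
                self._next_id += 1
                self.mapping[sender_id] = local_id
            # Known pages are updated in place instead of being re-created.
            self.pages[local_id] = content
        elif operation == "delete" and local_id is not None:
            self.pages.pop(local_id, None)
            del self.mapping[sender_id]
        return local_id
```

Publishing the same sender page twice returns the same local page ID both times, so the second request updates the existing page.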


Mirroring also picks up referenced files, and will begin a mirroring operation by sending these to the receiver using a Web service. Once that is completed, the actual page updates are sent. This will ensure that when pages are published, the relevant files will be there. It will also keep the size of each transfer down, since each file is sent separately.
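The ordering described above can be sketched as follows. The transfer_plan function and the tuple format are hypothetical, and sending a file that is referenced by several pages only once is an assumption of the sketch.

```python
def transfer_plan(pages):
    """Order a mirroring run: referenced files first, then the page updates.

    pages is a list of (page ID, [referenced file paths]) tuples.
    """
    files = []
    for _, referenced in pages:
        for path in referenced:
            if path not in files:  # assumption: a shared file is sent once
                files.append(path)
    # One transfer per file keeps each request small; the pages follow last.
    return [("file", path) for path in files] + [("page", page_id) for page_id, _ in pages]
```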

Export/Mirroring at Initial Deployment

The first time content is mirrored from the sender to the receiver, the mirroring will run without error as the receiver side has no current "states". When mirroring initially starts, the sender side sees that no content has been mirrored and will send all content regardless.

At the receiver end, there is no mapping table either, so the receiver will pick up the pages and effectively build an identical parallel tree. This might not be what you want, but it should not be a problem, as you can just switch the start page to the newly mirrored tree and remove the imported one at your leisure. The initial mirroring requires a full overwrite on the target.

Mirroring Locking

Mirroring is not run as a single "transaction" and sender databases are not locked for read/write when a site is mirrored. There is a slight risk of short-term inconsistencies that will be resolved at the next mirroring operation, if editing is being done at the time of the mirroring, but there is no risk of long-term inconsistencies.

Example:  Consider a "delete". Imagine that you start comparing the sender site with the recorded "state" of the receiver and find a page as unchanged there. After that an editor deletes the page. That change will not be detected by that mirroring operation. There are other more complex scenarios, but generally any inconsistencies will be fixed by the next run.

Typical Mirroring Scenarios

This chapter describes some typical scenarios for when mirroring may be used.

Mirroring to Another EPiServer CMS

Mirroring to another EPiServer CMS can be summarized as an automated export/import between two Web sites. When the content is changed, it is packaged into an export package in EPiServer CMS and sent via a Web service to the other Web site, where it is unpacked and imported into the system.

There are, however, some differences between mirroring and the standard export/import in Admin mode.

  1. The receiver remembers which pages have been received for a certain channel and will make sure that the next time the same page is received, it will be updated instead of being re-created.
  2. Files and images will not be packaged and sent inside the export package. They are sent separately before the actual export package is sent.

Microsoft Web Services Enhancements (WSE), which supports DIME (Direct Internet Message Encapsulation), makes it more efficient to send binary files.

Mirroring to another EPiServer CMS could perhaps be used if you want to have one development / test environment that is mirrored to an external environment.

Mirroring to HTML

Pages are mirrored to HTML files by sending a Web request to each page's URL. This request downloads the content and stores it as HTML files in the local file system on the server. A tree structure in EPiServer CMS will in this way be mirrored to a tree structure in the file system, where a page becomes a folder. The function also searches all downloaded HTML for references to images, links and style sheets.

Note  Links must be relative to the site for this to work; otherwise the links will be left untouched.

Mirroring to HTML takes longer than the other types of mirroring, as the Web server must be contacted for each page and the content downloaded. As a change to a single page can affect a large number of pages, e.g. through menus, there is a setting that makes each update fetch all pages again and check whether they are affected by the change.
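How a page tree maps to folders and links can be sketched in Python. The functions html_file_for and link_for are hypothetical, and the file name Default.htm and the exact folder layout are assumptions based on the field descriptions in the Appendix, not a description of the actual implementation.

```python
from pathlib import PureWindowsPath


def html_file_for(target_dir, page_names, default_name="Default.htm"):
    """Map a page, given as the page names from the start page down, to a file path."""
    return PureWindowsPath(target_dir).joinpath(*page_names, default_name)


def link_for(relative_root, page_names, include_file_name=True, default_name="Default.htm"):
    """Build the site-relative link to a mirrored page."""
    parts = [relative_root.rstrip("/")] + list(page_names)
    url = "/".join(parts) + "/"
    return url + default_name if include_file_name else url
```

For example, a page "Item1" under "News" mirrored to the target directory C:\episerverhtml would end up in the folder C:\episerverhtml\News\Item1 under these assumptions.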

Mirroring to XML

XML mirroring works in a similar way to HTML mirroring in that each page is mirrored to a file on the local file system. The only difference is that instead of downloading HTML from the page, the properties are extracted and formatted to an XML file via an XML style sheet (XSLT).

FAQ

The mirroring process takes a very long time to run. RAM is not a problem, but the CPU is very busy. Is this normal?

Inserting a page is a very heavy operation in EPiServer CMS, so it is probably quite natural that the import takes as long as it does. This should not be an issue after the initial mirroring, as it is the insertion of pages that is heavy, and a typical mirroring operation is not at all comparable to importing/initially mirroring the whole site.

However, if you are still experiencing problems after the initial mirroring, it may be due to the fact that large objects (over 64k) are not always handled by the garbage collector of ASP.NET. Refer to the FAQ Performance issues and OutOfMemoryException.

When a site is being mirrored, is a read/write lock placed on the sender database?

Mirroring is not run as a single "transaction", so the sender database is not locked during mirroring. There is a slight risk of short-term inconsistencies that will be resolved at the next mirroring operation, if editing is being done at the time of the mirroring, but there is no risk of long-term inconsistencies.

What happens when a Web site visitor requests a page that is currently being updated on the receiver side?

If a visitor requests a page just as it is being updated by the mirroring receiver, the page the visitor sees will depend on the visitor's timing, i.e. either the new or the old page.

We want to break down the EPiServer CMS mirroring into the main sections of the Web site, but are worried that links between sections will break. Is there a setting to ensure that the links do not break?

It is possible to break mirroring of EPiServer CMS sites into smaller sections. The way to do this is by configuring several channels and selecting the "Allow receiver to fetch links from other channels" check box in the Destination window.

What happens when I export a file that is part of a page folder?

When exporting a file that is part of a page folder (or subfolder) all files in that folder (or subfolder) will be exported.

Troubleshooting

Error: Server found request content type to be 'application/dime', but expected 'text/xml'

Make sure that you added the Web service extension in web.config on the destination server, as described in "Step 2 - Update web.config on the Remote Web Site with SOAP Extension Information".

Error: Found a high surrogate char without a following low surrogate. The input may not be in this encoding, or may not contain valid Unicode (UTF-16) characters

Make sure that you added the Web service extension in web.config on the destination server, as described in "Step 2 - Update web.config on the Remote Web Site with SOAP Extension Information".

Error: Object moved to here / Access denied

Make sure that the Web Service user has access to log on to the server. For more troubleshooting and configuration options, please refer to the “Web Services” technical note.


Error: System.Web.HttpException: Maximum request length exceeded

This error occurs when large amounts of information are exported.

To solve the problem, make sure that maxRequestLength is set both on the httpRuntime element and in the WSE messaging settings in web.config on the receiving mirroring site. Set maxRequestLength to 40960 KB (40 MB).

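Example:

<configuration>
  <system.web>
    <!-- ASP.NET request size limit, in kilobytes -->
    <httpRuntime maxRequestLength="40960" />
  </system.web>
  <microsoft.web.services2>
    <messaging>
      <!-- WSE request size limit, in kilobytes -->
      <maxRequestLength>40960</maxRequestLength>
    </messaging>
  </microsoft.web.services2>
</configuration>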


Error: Timeout exceeded

To solve the problem of an exceeded timeout, increase the timeout value in Internet Information Services (IIS). It may also be necessary to increase the timeout value for the receiving site. See http://msdn.microsoft.com/library/en-us/wse/html/940ecc18-25ce-45d8-b040-408d931d9fe1.asp?frame=true for further information.

Error: Communication was signed but checksum was not valid

This error message can occur for two reasons:

  1. The code in the "Shared Secret for Signature" field for the receiving site (Admin mode – Remote Web Sites) is not the same as the code in the publishing site. Make sure that you create the code in the receiving site and copy it into the publishing site and not vice versa.
  2. The local site names stated for the publishing and receiving sites in Remote Web Sites (Admin mode) do not correspond with each other. To solve this, make sure that the name in the "Local site name" field in Remote Web Sites is the same as the "Name" field in the Edit Remote Web Site window.

Configuring and Testing Mirroring

The following instructions tell you how to set up a remote site and your source site so that you can publish content from your source site to the remote site.

Step 1 - Set Up Mirroring to Publish Content to a Remote EPiServer CMS Site

  1. Install a new site and name it RemoteSite.
  2. This step only applies if your extranet site is using Forms authentication.
    Set up the new site with basic authentication. Make sure that the IIS directory security is set to Basic authentication AND that anonymous access is NOT allowed.
  3. Note By default the user needs to be a Windows administrator on the server that runs the extranet site.

    1. Open Internet Information Services Manager on the Web server and select the /WebServices folder on your remote EPiServer CMS Web site.
    2. Right-click and select Properties. Under the Directory Security tab, click Edit.
    3. The authentication options must be configured for Basic Authentication only. Otherwise automatic authentication will not occur.
    4. Edit the Web configuration file, web.config, in the EPiServer root directory. Make sure that the Web service account is allowed access in the WebServices folder.
      Replace DOMAIN with the name of the domain or local machine where the user account was created.

      <location path="WebServices">
        <system.web>
          <authorization>
            <allow users="DOMAIN\MyWebServiceUser" />
            <deny users="*" />
          </authorization>
        </system.web>
      </location>


    5. The BasicAuthentication HTTP module will translate basic authentication requests on-the-fly to forms-authenticated cookies. Make sure that web.config has the BasicAuthentication filter defined under the httpModules section.

      <httpModules>
        <add name="BasicAuthentication" type="EPiServer.Security.BasicAuthentication, EPiServer" />
      </httpModules>


    6. Test the setup by opening a Web browser and entering the URL to a Web service on your Web site, for example: http://localhost/RemoteSite/WebServices/DataFactoryService.asmx. You will receive a standard Windows login pop-up window.
    7. Enter the WebServiceUser account information. If everything is working, you should see the Web Service definition page.
  4. Go to Admin mode and click the Config tab and then Remote Web Sites.

  5. Edit the local site name, enter "RemoteSite" in the field and click Save.

  6. Click Create. Enter the name of your source site in the Name field. Enter “http://localhost/SourceSite” in the URL field. (Replace SourceSite with the name of your source site.) In the Shared secret for signature field, click Create to the right of the field and verify that a key is generated in the field. Leave the username, password and domain fields and check boxes empty. Click Save.
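The browser test of the Web service (sub-steps 6 and 7 above) can also be done from a script. The following Python sketch only builds the request; the build_request function name, the password and the DOMAIN\MyWebServiceUser account are placeholders for this example.

```python
import base64
import urllib.request


def build_request(url, domain, user, password):
    """Build a GET request that carries a Basic authentication header."""
    credentials = f"{domain}\\{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


request = build_request(
    "http://localhost/RemoteSite/WebServices/DataFactoryService.asmx",
    "DOMAIN", "MyWebServiceUser", "secret")
```

Passing the request to urllib.request.urlopen should return the Web Service definition page if the account is allowed access; an HTTP 401 error would suggest that the authentication settings need to be checked.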

Step 2 - Update web.config on the Remote Web Site with SOAP Extension Information

Add the following under the <system.web> section in web.config on the RemoteSite:

<webServices>
  <soapExtensionTypes>
    <add type="Microsoft.Web.Services2.WebServicesExtension, Microsoft.Web.Services2, Culture=neutral, PublicKeyToken=31bf3856ad364e35" priority="1" group="0" />
  </soapExtensionTypes>
</webServices>


Step 3 - Set Up Your Source Site as the Publishing Web Site

(Replace all the occurrences of SourceSite in the instructions below with the name of your source site.)

  1. In your source site, go to Admin mode, click the Config tab and then Remote Web Sites.
  2. Edit the local site name, enter "SourceSite" in the field and save.
  3. Click Create. Enter "RemoteSite" in the Name field. Enter “http://localhost/RemoteSite” in the URL field. In the Shared secret for signature field, copy and paste the key that was generated on the remote site in Step 1.
  4. Enter a valid Web service user, password, and domain. This is the same Web service user that you set up in Step 1.
  5. Click Save. Click Ping and verify that the connection between the source (publishing) and remote sites works.

Test Scenario

Make sure that you have followed the instructions in the previous chapter regarding configuration of content mirroring.

Mirroring can also be used to publish content to HTML and XML. This document initially contains information on how to do this, but this information will later be included in a separate document.

Create a Channel and Destination in Your Source Site

  1. In the source site, go to Admin mode, click the Config tab and then Mirroring Administration.
  2. Click Create. Enter SourceChannel in the Name field. Choose the page from which you wish to publish the tree structure (it should have child pages).
  3. Choose "Tree" in the Mirror Type box.
  4. Select Include the start page. Click Save.
  5. Click Create Destination. Select “EPiServer” in the Select destination type box. Click OK.
  6. Enter EPiServerDest in the name field. Choose your second EPiServer site in the Remote site box. Choose a page at the remote site and enter the page’s ID in the Root page on destination field. Click Save.

Publish a Page to the Remote EPiServer CMS Site

  1. In the source site, edit the page that you selected as the publishing start page above. Save and publish the page.
  2. Open the Action Window and click Approve mirroring updates. A list of updated channels appears and “SourceChannel” is listed with the number of updated pages in parentheses. Click SourceChannel.
  3. The currently updated pages are listed and a Publish button appears at the bottom. Click Publish.
  4. Go to Mirroring Administration on the Config tab in Admin mode. The number of queued jobs is shown under Queue length for the channel (if the scheduled service has already executed, it shows 0).
  5. If the scheduled service does not run, click Mirroring Service under the Admin tab and click Start Manually. Check that the pages were published on the remote site.

Publish a Page to HTML

  1. In the source site, go to Admin mode, click the Config tab and then Mirroring Administration.
  2. Click SourceChannel and then Create Destination.
  3. Select “HTML” as the destination type and click OK.
  4. Enter EPiServerHTML in the name field.
  5. Create a directory “C:\episerverhtml” in your file system. Enter C:\episerverhtml in the Target directory on the server field. Enter /episerverhtml/ in the Relative root path field if you are to run the remote site from your hard drive. Click Save.
  6. Create a new page in the source site, publish it, and approve the mirroring updates in the Action Window. If the scheduled service does not run, click Mirroring Service under the Admin tab and click Start Manually.
  7. Verify that the pages have been written to “C:\episerverhtml”.

Publish a Page to XML

  1. In the source site, go to Admin mode, click the Config tab and then Mirroring Administration.
  2. Click SourceChannel and then Create Destination.
  3. Select “XML” as the destination type and click OK.
  4. Enter EPiServerXML in the name field.
  5. Create a directory “C:\episerverxml” in your file system. Enter C:\episerverxml in the Target Directory on the server field.
  6. Create a file called template.xsl under C:\episerverxml and fill it with the text in the "template.xsl" chapter of the Appendix. (The demo template.xsl is only a basic XSL example.)
  7. Enter C:\episerverxml\template.xsl in the Path to XSL template field. Click Save.
  8. Create a new page in the source site, publish it, and approve the mirroring updates in the Action Window. If the scheduled service does not run, click Mirroring Service under the Admin tab and click Start Manually.
  9. Verify that the pages have been written to “C:\episerverxml”.

Appendix

template.xsl

<!--
      - XSLT is a template-based language for transforming XML documents.
        It uses XPath to select specific nodes for processing.
      - An XSLT file is a well-formed XML document.
-->
<!-- Every style sheet starts with this tag -->
<xsl:stylesheet
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">

      <!-- Indicates what our output type is going to be -->
      <xsl:output method="xml" />

      <!--
            Main template to kick off processing our Sample.xml.
            From here on we use a simple XPath selection query to
            get to our data.
      -->
      <xsl:template match="/">
            <Page>
                  <Title><xsl:value-of select="/page/properties/property[@name='PageName']/text()"/></Title>
                  <Body><xsl:value-of select="/page/properties/property[@name='MainBody']/text()"/></Body>
            </Page>
      </xsl:template>
</xsl:stylesheet>
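
To make the effect of the template easier to follow, the same extraction can be sketched in Python with the standard ElementTree module. This is an equivalent illustration, not part of the mirroring function: the transform function is hypothetical, and the shape of the input document is inferred from the XPath expressions in the template.

```python
import xml.etree.ElementTree as ET


def transform(page_xml):
    """Extract PageName and MainBody from a page document, like template.xsl."""
    page = ET.fromstring(page_xml)  # the root element is assumed to be <page>

    def prop(name):
        node = page.find(f"properties/property[@name='{name}']")
        return node.text if node is not None and node.text else ""

    out = ET.Element("Page")
    ET.SubElement(out, "Title").text = prop("PageName")
    ET.SubElement(out, "Body").text = prop("MainBody")
    return ET.tostring(out, encoding="unicode")


sample = (
    "<page><properties>"
    "<property name='PageName'>Hello</property>"
    "<property name='MainBody'>Text</property>"
    "</properties></page>"
)
result = transform(sample)
```

For the sample document, result contains a Page element with the title "Hello" and the body "Text".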

 

Mirroring Field Descriptions in Admin Mode

 

Mirroring Administration Window

| Field | Description |
|---|---|
| Channel Name | Lists all the channels that are available for the site. Click a channel name to display the Destination Overview window. |
| Create (button) | Click Create to create a new channel. This opens the Mirroring Settings window. |

 

Mirroring Settings Window

This window is displayed when you create or edit a channel.

| Field | Description |
|---|---|
| Information tab | |
| Name | Name of the channel. |
| Start page | Select a page from where you want to publish the structure. The page must contain pages or sub-folders. |
| Mirror type | Select the type of mirroring to be done. See the "Basic Concepts" chapter for further information on the different mirroring types. |
| Globalization support | Applies to globalized Web sites. Select whether you want to mirror only the original language, all languages or another language. |
| Approve changes automatically | Select this check box if you want any changes to be approved automatically instead of approving them manually in the Action Window in Edit mode. Any changed pages will therefore be updated on the receiving site the next time the mirroring service is run. |
| Include the start page | Select this check box if you also want the changes to apply to the start page. |
| Run as anonymous user | Select this check box if you want to run the mirroring job as an anonymous user. If not, enter a username, password and domain. |
| Property Filter tab | |
| Activate filter | Select this check box if you want to activate the filter settings. |
| Filter by property name | Enter a property name, e.g. WriterName, to only mirror pages that include that property. |
| Filter by property value | Enter a property value for the stated property name. For example, if you enter property name WriterName and property value Charlie, only pages that include the value Charlie in the Writer field will be included in the mirrored site. |

 

Destination Overview Window

This window displays an overview of the defined destinations for this channel.

| Field | Description |
|---|---|
| Destinations | Lists the destinations that have been created for the channel. |
| Last status | Displays the status of the most recent mirroring execution. |
| Last execution | Displays the date of the most recent mirroring execution. |
| Queue length | Displays how many jobs will be run the next time mirroring is executed. |
| Edit Queue (button) | Opens the Mirroring Queue for Destination window from where you can delete pages and packages that should not be included when mirroring the site. |
| Mappings (button) | Opens the Mirroring Mappings window from where you can delete individual mappings or all mappings. |
| Settings (button) | Opens the Mirroring Settings window. |
| Reset State (button) | Click Reset State on the sender side to clear the recorded "state". This will not affect the receiving site. You may want to reset the receiver site as well when you do this. Currently, the best way to do this is to delete the pages and empty the Recycle Bin. This will ensure that the sender and receiver are in agreement. |

 

Destination Window

This window varies depending on the destination of the mirrored site:

  • Mirror to EPiServer CMS
  • Mirror to HTML
  • Mirror to XML

Mirror to EPiServer CMS

 

| Field | Description |
|---|---|
| Select destination type | EPiServer |
| Information tab | |
| Name | Enter a name for the destination. |
| Remote site | Select a receiving site from the drop-down list. |
| Root page on destination | Enter the page ID of the root page on the receiving site. |
| Publish pages | Select this check box if you want to publish the pages automatically on the receiving site. If you leave this check box empty, the mirrored pages must be published manually on the receiving site. |
| Allow receiver to fetch links from other channels | Select this check box if you want to be able to mirror content between different channels. |
| Queue tab | This tab is only available if the changed pages to be mirrored have been approved. This tab is for information only. |
| Queue number | Lists the numbers of the queues to be included in the next mirroring execution. |
| Item created | Date when the pages to be mirrored were changed. |

Mirror to HTML

| Field | Description |
|---|---|
| Select destination type | HTML |
| Information tab | |
| Name | Enter a name for the destination. |
| Target directory on the server | Create a directory in your file system where you want the HTML pages to be published. Enter the name of the directory in this field, e.g. C:\episerverhtml. |
| Relative root path | Enter a prefix to be used for all the links. For example, enter "/episerverhtml/" in this field if you have selected C:\episerverhtml as the root directory and you are to run the remote site from your hard drive. |
| Default name for files | Change this value if you want your HTML files to have an alternative name as default. |
| Use the following property for folder names | This field defines the property name that controls the name of the folder after mirroring to HTML. |
| Verify check sum for all pages every time | If this check box is selected, all the pages are downloaded every time a change is discovered on a page. One page can change many pages in listings, site maps, etc. Mirroring is faster if you clear this check box when you do not need this function. |
| Do not include file name in links | Select this check box if you will be running the HTML site from a CD or hard drive and want to include the file name, e.g. Default.htm, in links. If you will be running the site online, it is usually preferable to only use the folder name for links and configure Default.htm as the default document in Internet Information Services (IIS). |
| Apply channel filter | It is possible to require that certain properties apply to certain channels. Select this check box if you want listings and the menu tree to be filtered in the same way when HTML is downloaded from a page. |
| Queue tab | This tab is only available if the changed pages to be mirrored have been approved. This tab is for information only. |
| Queue number | Lists the numbers of the queues to be included in the next mirroring execution. |
| Item created | Date when the pages to be mirrored were changed. |


Mirror to XML

| Field | Description |
|---|---|
| Select destination type | XML |
| Information tab | |
| Name | Enter a name for the destination. |
| Target directory on the server | Create a directory in your file system where you want the XML pages to be published. Enter the name of the directory in this field, e.g. C:\episerverxml. |
| Path to XSL template | Create a file called template.xsl under your target directory. Fill the file with relevant text. (An example template.xsl can be found in this document.) Enter the path to your XSL file in the Path to XSL template field, e.g. C:\episerverxml\template.xsl. |
| Relative root path | Enter a prefix to be used for all the links. For example, enter "/episerverxml/" in this field if you have selected C:\episerverxml as the root directory and you are to run the remote site from your hard drive. |
| Default name for files | Change this value if you want your XML files to have an alternative name as default. |
| Use the following property for folder names | This field defines the property name that controls the name of the folder after mirroring. |
| Do not include file name in links | Select this check box if you will be running the mirrored site from a CD or hard drive and want to include the file name, e.g. Default.htm, in links. If you will be running the site online, it is usually preferable to only use the folder name for links and configure Default.htm as the default document in Internet Information Services (IIS). |
| Queue tab | This tab is only available if the changed pages to be mirrored have been approved. This tab is for information only. |
| Queue number | Lists the numbers of the queues to be included in the next mirroring execution. |
| Item created | Date when the pages to be mirrored were changed. |