Area: Episerver Search & Navigation
Applies to versions: 13.2 and higher

Removing specific statistics

This topic explains how to use the APIs to remove data from statistics, and to prevent search queries that match certain patterns from being stored in statistics by Episerver Search & Navigation (formerly Episerver Find).

How it works

You can use the APIs to remove statistical data and to prevent the addition of personally identifiable information (PII) to the search index. For example, you can prevent data that contains an at sign (@) from being indexed. 

This feature involves the following APIs.

Method: StatisticsGetGDPR
Description: Searches GDPR data in statistics. Use it to get the statistics items that match a query pattern. An extension method of IStatisticsClient.
Sample: public static GDPRQueryResult StatisticGetGDPR(this IStatisticsClient client, string query)

Method: StatisticsDeleteGDPR
Description: Deletes GDPR data from statistics. Use it to delete all statistics items that match a query pattern. An extension method of IStatisticsClient.
Sample: public static GDPRDeleteResult StatisticDeleteGDPR(this IStatisticsClient client, string query)

In addition, you can use the following API to keep PII data in search queries from being saved to the statistics.

ITrackSanitizerPatternRepository

Method: Add
Description: Adds one or more patterns to storage. The stored patterns are used to keep PII data in search queries out of the statistics.
Sample:
string Add(TrackSanitizerPattern pattern)
void Add(IEnumerable<TrackSanitizerPattern> patterns)

Method: Get
Description: Gets a pattern from storage by its pattern Id.
Sample: TrackSanitizerPattern Get(string patternId)

Method: GetAll
Description: Gets all patterns from storage.
Sample: IEnumerable<TrackSanitizerPattern> GetAll()

Method: Update
Description: Updates one or more existing patterns.
Sample:
string Update(TrackSanitizerPattern pattern)
void Update(IEnumerable<TrackSanitizerPattern> patterns)

Method: Delete
Description: Deletes a pattern from storage by its Id.
Sample: void Delete(string patternId)

Method: DeleteAll
Description: Deletes all patterns from storage.
Sample: void DeleteAll()

See also Preventing indexing of PII data.

Examples

To retrieve and remove PII data from statistics, you can search for a range of data:

  • Email including domain name, for example "john.doe@example.com"
  • Full name, for example "John Doe"

You can prevent sensitive data in a custom search query from being saved to the statistics using a predefined pattern.

using System.Collections.Generic;
// Also import the Episerver Search & Navigation namespaces that declare the types used below.

public class Sample
  {
    protected IClient _client;
    protected IStatisticsClient _statisticsClient;
    protected ITrackSanitizerPatternRepository _trackSanitizerRepository;

    // The statistics client is passed in so that _statisticsClient is assigned before it is used.
    public Sample(IClient client, IStatisticsClient statisticsClient)
      {
        _client = client;
        _statisticsClient = statisticsClient;
        _trackSanitizerRepository = new DefaultTrackSanitizerRepository(_client);
      }

    public void Run()
      {
        // Define and add the sanitizer patterns.
        _trackSanitizerRepository.Add(new List<TrackSanitizerPattern>
          {
            new TrackSanitizerPattern
              {
                PatternString = "admin",
                PatternType = TrackSanitizerFilterType.PlainText
              },
            new TrackSanitizerPattern
              {
                PatternString = "email",
                PatternType = TrackSanitizerFilterType.PlainText
              },
            new TrackSanitizerPattern
              {
                PatternString = "*@mail.com",
                PatternType = TrackSanitizerFilterType.Wildcard
              },
            new TrackSanitizerPattern
              {
                PatternString = "1#1",
                PatternType = TrackSanitizerFilterType.Wildcard
              },
            new TrackSanitizerPattern
              {
                PatternString = "c[a-e]ll",
                PatternType = TrackSanitizerFilterType.Wildcard
              },
            new TrackSanitizerPattern
              {
                PatternString = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
                PatternType = TrackSanitizerFilterType.Regex
              }
          });

        // Run a search with statistics tracking enabled.
        var result = _client
          .UnifiedSearchFor(@"email admin admin@mail.com admin1@mail.com
            admin2@mail.com ball bell bill 121 131 141 call cell")
          .StatisticsTrack()
          .GetResult();

        // Try to get GDPR data using a keyword that matches a sanitizer pattern.
        var response = _statisticsClient.GetGDPR("@mail.com", x => { });
      }
  }
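
To see which terms of the sample query the Regex pattern would catch, you can test the expression on its own before adding it to the repository. The following is a minimal sketch that uses only the standard .NET Regex class. The RegexPatternCheck class and its MatchingTerms method are hypothetical helpers for illustration and are not part of the Episerver Search & Navigation API. Splitting the query on whitespace is also an assumption; the product may tokenize queries differently.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Episerver Search & Navigation API.
public static class RegexPatternCheck
{
    // The Regex pattern from the sample above, copied verbatim.
    public const string EmailPattern =
        @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

    // Returns the whitespace-separated terms of the query that the pattern matches.
    // Assumption: each term is tested individually.
    public static string[] MatchingTerms(string query, string pattern)
    {
        var regex = new Regex(pattern);
        return query
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Where(term => regex.IsMatch(term))
            .ToArray();
    }
}
```

Calling RegexPatternCheck.MatchingTerms with the sample query and EmailPattern returns the three email-like terms (admin@mail.com, admin1@mail.com, admin2@mail.com) and none of the other terms.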

When using a wildcard pattern, review these Microsoft wildcard samples.
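
The wildcard patterns in the sample (*@mail.com, 1#1, c[a-e]ll) look like Like-operator-style wildcards, where * stands for any run of characters, ? for any single character, # for any single digit, and [a-e] for one character in a range. That reading is an assumption based on the sample patterns. The sketch below translates such a wildcard into a .NET regular expression so that you can try patterns locally. WildcardSketch is a hypothetical helper, not the implementation that Episerver Search & Navigation uses, and it assumes case-sensitive matching of whole terms.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Episerver Search & Navigation API.
public static class WildcardSketch
{
    // Translates a Like-style wildcard into an anchored regular expression.
    // Assumed rules: * = any run of characters, ? = any single character,
    // # = any single digit, [set] = one character in the set, [!set] = one not in it.
    public static Regex ToRegex(string wildcard)
    {
        var sb = new StringBuilder("^");
        for (int i = 0; i < wildcard.Length; i++)
        {
            char c = wildcard[i];
            switch (c)
            {
                case '*': sb.Append(".*"); break;
                case '?': sb.Append('.'); break;
                case '#': sb.Append("[0-9]"); break;
                case '[':
                    int close = wildcard.IndexOf(']', i + 1);
                    if (close < 0)
                    {
                        // No closing bracket: treat '[' as a literal character.
                        sb.Append(@"\[");
                        break;
                    }
                    string set = wildcard.Substring(i + 1, close - i - 1);
                    if (set.StartsWith("!")) set = "^" + set.Substring(1);
                    sb.Append('[').Append(set).Append(']');
                    i = close; // Skip past the character set.
                    break;
                default:
                    sb.Append(Regex.Escape(c.ToString()));
                    break;
            }
        }
        sb.Append('$');
        return new Regex(sb.ToString());
    }

    // True if the whole term matches the wildcard.
    public static bool IsMatch(string wildcard, string term)
    {
        return ToRegex(wildcard).IsMatch(term);
    }
}
```

Under these assumptions, 1#1 matches 121, 131, and 141, and c[a-e]ll matches call and cell but not ball, bell, or bill. The sketch does not handle special characters inside a character set or an empty set such as [].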

Search, filter, and delete PII data

To search, filter, and delete PII data in statistics, see Preventing indexing of PII data, which describes how to install a sample and verify the deletion.


Last updated: Jun 17, 2019
