Get Recent Changes across Site Collections

Summary: There’s no great way to get a list of recent changes across Site Collections.

Ok, perhaps that’s a bit overstated. It can be done. I’m just wishing there was a better way. Of course, it should be mentioned that if you’re not concerned about different site collections, there are a variety of methods available to query for a list of recent changes.

So, to be clear I’d like to see a list of items that have been changed recently across all site collections in a web app. Specifically:

  • Items should be security trimmed as normal for SharePoint (Users shouldn’t see things they don’t have access to)
  • Items should be listed in descending order based on modified date
  • Only the last x changes should be displayed
  • Changes to internal pages (anything in _layouts / _catalogs) should not be displayed
  • Performance should be excellent

 

There are two major methods that I’m aware of to do this, the first is querying against the change log.

Change Log

At first glance querying against the change log seems like a perfect fit. It is by definition a log of all changes. It even knows what type of change was made, such as whether the change was an insert, update, or delete. Following is an example of the using the change log, but before you copy it, please note that it doesn’t accomplish the above goals.


SPContentDatabase db = SPContext.Current.Site.ContentDatabase;
SPSite site = SPContext.Current.Site;
SPWeb web = SPContext.Current.Web;
SPListCollection lists = web.Lists;

SPRegionalSettings regionSettings = web.RegionalSettings;
SPTimeZone timeZone = regionSettings.TimeZone;

SPChangeToken token = 
    new SPChangeToken(SPChangeCollection.CollectionScope.ContentDB, db.Id, DateTime.Now - timespan);
//if the following are set to true, all types of changes would be retrieved. 
SPChangeQuery query = new SPChangeQuery(false, false);
            
//set that we want to see changes for items. other choices include alerts, fields, groups, etc.
query.Item = true;
//what kind of changes to retrieve            
query.Update = true;
query.Add = true;
query.Delete = false;

query.ChangeTokenStart = token;
query.FetchLimit = 10;
        
SPChangeCollection changes = web.GetChanges(query);

Note: the above does not accomplish the stated goal! We only have forward only access to the log. So, we can start reading at a certain point (specified via a DateTime), and read forward from there, but we can’t read just the last 10 items. So, the above sets a DateTime in the token, and reads the next 10 items (due to the fetch limit) from there. This introduces a problem, in that on Monday morning, the last change may have been three days ago, while at mid-morning, the last 10 changes were made within the last few minutes. Certainly, the code could just process through all changes, but that introduces performance issues. (If you want to head that direction, see the help doc, as there is a default fetch limit. You’ll have to set it to read the changes in batches.)

But, beyond performance, there’s a bigger problem: the above is set to query changes for a particular content database (Other choices are list, web, or site collection). This presents an interesting problem in that content databases are maintained by the admins. A web app that has one content database today may have two next week. So, getchanges gets us close, but there is simply no mechanism that I’m aware of for getting changes for a specified web app (or farm), without looping through available content databases.

 

FullTextSQLQuery

That leaves us with Search and FullTextSQLQuery. We can write a query that returns results, sorted descending by modified date. This means our “recent changes” will really be “recent changes as of the last crawl”. Not perfect, but better than nothing.

FullTextSqlQuery qry = new FullTextSqlQuery(SPContext.Current.Site);
            
qry.QueryText = "select title, path, Description, contentclass, ModifiedByCustom, LastModifiedTime from scope() ";
qry.QueryText += "where \"Scope\"='" + ScopeName + "' ";
qry.QueryText += "and lastmodifiedtime > '" + DateTime.Now.AddDays(-3).ToString("yyyy/MM/dd") + "' ";
qry.QueryText += "AND ModifiedByCustom > '' ";
qry.QueryText += "order by LastModifiedTime desc";

qry.ResultTypes = ResultType.RelevantResults;
qry.RowLimit = ItemCount;
ResultTableCollection results;
ResultTable resulttable;
DataTable dt = new DataTable();
try
{
    results = qry.Execute();
    resulttable = results[ResultType.RelevantResults];
    dt.Load(resulttable, LoadOption.OverwriteChanges);
}
catch (Exception ex)
{
    …
}

There are quite a few other sources that are available with info on how to use the FullTextSQLQuery, so following are just some notes on using it for the specific purpose of getting recent changes:

Custom Scope

The above query references a custom scope in the where statement. The custom scope serves two purposes: 1) narrow the search results to the desired web application, and 2) Only show changes to list items, documents, etc., not changes to list definitions and the like (This last requirement could have been done as a long ‘OR’ statement, but I chose to use a custom scope). Following are a portion of the rules that have been set up for the custom scope:

 

 

Note, the above doesn’t restrict items in the announcements list, for example, but it does mean that the allitems.apx page in the announcements list will not show up, for example. No, that’s not a complete list. For the complete list of options, see here:

http://blogs.msdn.com/b/jjameson/archive/2009/03/05/excluding-various-sharepoint-items-from-search-results-on-internet-facing-moss-sites.aspx

 

ModifiedBy

 

You’ll notice that the query does not use ModifiedBy, but instead uses “ModifiedByCustom”. I wish I had a better answer than this, but at this point I don’t: The query didn’t function on my server using the ModifiedBy field. So, I created a new metadata property named “ModifiedByCustom”, set it up with the same mappings and options as ModifiedBy, and it works fine. If you have any insight on this, please post a comment.

The query also uses ModifiedByCustom in the where statement. According to the docs on FullTextSQLQuery, I should have done the following:

…where ModifiedByCustom is not null

But, the above throws an exception. But, the following works fine:

…where ModifiedByCustom > ”

(Thanks to Waldek for that! More here: http://blog.mastykarz.nl/excluding-empty-text-values-sharepoint-search/)

But why filter items so that it only displays where modified by is not empty? If a page has an rss feed web part on it, when it is crawled it will be considered to have been modified as the feed content has changed. In this case, the ModifiedBy user will be empty. For my purposes, I only want to show changes that a user has made, and not these types of system changes (Other possibilities include pages for an access services app, pages with external lists, etc.).

 

 

 

So, with the FullTextSQLQuery, combined with a custom search scope, we can display recent changes to a site (as of the last crawl time). And, the results will be security trimmed. All that’s left is formatting the results.

 

 

 

 

 

 

 

 

Advertisements

One thought on “Get Recent Changes across Site Collections

  1. Pingback: Resources: SharePoint Development,Programming And PowerShell | lionadi

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s