SharePoint Options for Globally Distributed Environments


Introduction


Ever since I started working with SharePoint environments in global companies, remote users have had one common complaint: performance.  Users in remote locations can feel disenfranchised by slow page load times and the overall latency of the platform.  In reality, who wants to consume information from a corporate intranet that takes 15 seconds or more to paint a page?  There are several solutions to this issue, each with different advantages and drawbacks, and the right approach depends on an individual company’s unique needs.


Why is SharePoint Slow?

 
This is the first question I asked myself when determining the best way to speed it up.  The framework, one of the greatest strengths of SharePoint, is also its Achilles’ heel.  SharePoint provides a robust framework that allows end users to modify pages and content through lists and web parts.  End users can create site collections and sites to respond rapidly to changing business needs.  This functionality would require expensive development cycles and time-intensive testing efforts if done in a traditional web application.  However, this same framework makes even an out-of-the-box implementation quite heavy.  For example, let’s say we have a publishing page with six web parts on it.  One can basically think of this as loading seven different web pages, because each part sends its own authentication request.  This authentication will happen in fractions of a second if the user is relatively close to the server, but over several thousand miles and across oceans each request and response can drag out to over a second.  That puts the page load time at over seven seconds before any content is sent.
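The arithmetic above can be sketched in a few lines of Python.  This is a rough model rather than a measurement: it assumes the page and each web part cost one sequential round trip, and the function name and round-trip figures are illustrative assumptions.

```python
def auth_overhead_seconds(web_parts: int, round_trip_seconds: float) -> float:
    """Estimate time spent on round trips before any content arrives.

    Assumes the page plus each web part makes one sequential
    request/response, which is the simplification used in the text.
    """
    requests = web_parts + 1  # the page itself plus one request per web part
    return requests * round_trip_seconds


# Six web parts, user near the server (assumed 50 ms round trip)
print(auth_overhead_seconds(6, 0.05))
# Six web parts, user across an ocean (assumed 1 s round trip)
print(auth_overhead_seconds(6, 1.0))
```

Real browsers issue some requests in parallel, so treat the result as an upper-bound intuition for why distance hurts, not as a prediction.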

Current State Assessment

Information needs to be gathered before settling on any solution.  The items listed below should be ascertained first.

SharePoint Farm Best Practices Evaluation

o   User activity and current architecture
o   Load balanced front ends
o   Load balanced Application Servers
o   When indexing jobs are run
o   High Availability Evaluation
o   DR Evaluation

 

Document Evaluation

o   Size
o   Quantity
o   Location
o   Frequency of Change
o   Power Users
o   Type

How do users use SharePoint?

o   Publishers
o   Viewers
o   Collaboration
o   Mixture of above
  • If a mixture, has usage been delineated by Publishing and Team sites?
  • Has the content been separated out by site collection, database, web application, or farm?

Current Performance Times – This needs to be repeated for several offices in different locations.
o   Publishing page load time, cached
o   Publishing page load time, not cached
o   Team page load time, cached
o   Team page load time, not cached
o   Document Uploads
  • 1 MB
  • 5 MB     
  • 10 MB
  • 20 MB
o   Create geographical heat map of each performance metric
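
Before a heat map can be drawn, the raw timings from each office need to be reduced to one number per office and metric.  A minimal sketch in Python, assuming measurements are recorded as (office, metric, seconds) tuples; the office names and timings below are made-up sample data, not real results.

```python
from collections import defaultdict
from statistics import median


def summarize(samples):
    """Group (office, metric, seconds) samples and return the median per pair."""
    grouped = defaultdict(list)
    for office, metric, seconds in samples:
        grouped[(office, metric)].append(seconds)
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical sample data for illustration only
samples = [
    ("London", "publishing page, not cached", 9.1),
    ("London", "publishing page, not cached", 8.7),
    ("London", "publishing page, not cached", 12.4),
    ("Singapore", "publishing page, not cached", 15.2),
    ("Singapore", "publishing page, not cached", 14.8),
]

for (office, metric), seconds in sorted(summarize(samples).items()):
    print(f"{office:10} {metric:30} {seconds:.1f} s")
```

The median is used here instead of the mean so that a single slow outlier does not distort an office's figure.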


Government data in foreign countries
o   List of all countries that house government data
o   Privacy laws of each foreign country

Planning

Compile all information into a current state document and benchmark current performance metrics.  With the gathered information, the process of selecting the method of resolution can begin.

Possible Solutions


WAN acceleration


The first thing to know about any replication solution is that it is going to be more complex to manage and maintain than a single farm.  Therefore, the first question that I ask when dealing with a geo-distributed environment is ‘how dispersed is your workforce and what are users typically doing with SharePoint?’  If the company is geo-distributed but has very few main offices, a WAN acceleration solution should be considered.  Silver Peak and Riverbed are two WAN accelerators that I have seen work wonders.  However, they tend to be expensive, and many companies have hundreds of offices, which makes this solution cost prohibitive.  Additionally, they work by caching at the bit level, meaning that they are very good at accelerating content that doesn’t change much, such as web pages and documents being edited.  However, they offer limited performance improvement when users are mainly uploading and downloading new content.


 
 
Replication at the Object Level

There are several off-the-shelf third-party solutions for object-level replication.  These generally work by harnessing the native SharePoint event receivers and replicating objects on ‘On Item Added/Modified/Deleted’ events.  They can normally be set to several different modes: immediate or differential, and one-way or two-way replication.  I would strongly advise extensive testing with two-way replication, as it tends to be somewhat shaky and can create a multi-master situation that is difficult to recover from without users losing data or documents.  One-way replication can also have issues, especially when creating larger objects such as sites and site collections.  These drawbacks can make the solution labor intensive and costly to maintain, and can frustrate users.  Additionally, the software that needs to be purchased can be costly, at around 5 to 10 thousand per front end per year.  If a typical farm has two or three front ends and there are three or four global farms, licensing costs can run from tens of thousands into the low six figures relatively quickly.  That said, these solutions are getting better and should be considered and tested in the customer’s environment.
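
The licensing arithmetic is easy to check.  The sketch below simply multiplies the figures quoted above; the function name is an assumption for illustration, and real vendor pricing will vary.

```python
def annual_license_cost(farms: int, front_ends_per_farm: int, cost_per_front_end: int) -> int:
    """Yearly licensing cost when every front end in every farm needs a license."""
    return farms * front_ends_per_farm * cost_per_front_end


# Low end: three farms, two front ends each, 5 thousand per front end
low = annual_license_cost(farms=3, front_ends_per_farm=2, cost_per_front_end=5_000)
# High end: four farms, three front ends each, 10 thousand per front end
high = annual_license_cost(farms=4, front_ends_per_farm=3, cost_per_front_end=10_000)

print(low, high)  # 30000 120000
```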

Additionally, SharePoint has a built-in method for this type of replication called content deployment.  This can be problematic because it is a one-way replication with few safeguards.  If a user updates a piece of content in a replicated site, the next time it is modified on the master side the content will overwrite the replicated file and the user will lose those changes.

 


Replication at the Database Level

Implementing log shipping, differentials or publication/subscription at the SQL level can provide a robust solution to the replication question.  These options work by having a SharePoint database as a master, then replicating to a ‘slave’ database that is in read-only mode.  The second SharePoint farm is able to connect to the read-only database, and users can consume information from a local source.  This option works very well for applications such as corporate intranets, where there are relatively few contributors and the primary objective is disseminating information and content company wide.  It also avoids the multi-master situation: if a site gets out of date for some reason, e.g. a network or power outage, the entire database can be restored so the replication can resume.  The primary disadvantage is that it is exclusively one-way.  This can be mitigated by an intelligently designed information architecture where site collections for certain regions are placed in specific databases and the replication is reversed for those specific collections.
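
The idea of reversing replication per database can be pictured as a simple lookup: each content database has exactly one master farm, and every other farm receives a read-only copy.  The sketch below is only a planning aid in Python; the database names, farm names and function names are hypothetical, and it does not configure SQL Server or SharePoint.

```python
# Hypothetical layout: every content database has exactly one master farm.
MASTER_FARM = {
    "Content_Corporate": "US",
    "Content_Europe": "EU",
    "Content_Asia": "APAC",
}


def access_mode(database, farm):
    """Return 'read-write' on the master farm and 'read-only' everywhere else."""
    return "read-write" if MASTER_FARM[database] == farm else "read-only"


def replication_targets(database, farms):
    """Farms that should receive a read-only copy from the database's master."""
    return [farm for farm in farms if farm != MASTER_FARM[database]]


farms = ["US", "EU", "APAC"]
for database in MASTER_FARM:
    print(database, "->", replication_targets(database, farms))
print(access_mode("Content_Europe", "EU"))
print(access_mode("Content_Europe", "US"))
```

Because each database has a single writer, replication stays one-way per database even though content flows in both directions across the environment as a whole.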
 
But wait, your DBA tells you that with the Enterprise edition of SQL Server, peer-to-peer (two-way) replication at the database level can be achieved.  This is true, but sadly this option does not work with SharePoint.  Peer-to-peer SQL replication works by modifying each table row with a GUID (globally unique identifier).  When this is done, SharePoint will throw an error because modifications to the content database are unsupported, even when done by another Microsoft product.  Trust me, I have tried.
 
 
 
 

Geographical Data Layout

These solutions, proposed by Microsoft, are about where the data is actually stored.  They can be implemented in conjunction with any of the WAN acceleration and replication solutions above.  Microsoft has done a pretty good job of explaining these, so I will just point them out and repost the Microsoft information.

Central Solution









Distributed Solution



Central with Regional Sites


  

Other Factors

RBS

Remote Blob Storage, or RBS, is a hot topic when dealing with content databases that have the potential to grow very large.  It keeps the size of the database down by storing documents, or BLOBs (binary large objects), on a separate file system and creating a pointer to the location in the SharePoint content database.  The important thing to remember is that database replication ceases to be a viable option with RBS, as the bulk of the data is no longer stored in the database.  Because object-level replication happens at the SharePoint level, it can still be used if RBS is selected.

The Cloud


Why don’t we just put all this in the cloud and let someone else deal with this headache?  The cloud is not magical or all-knowing; the servers still live somewhere, just not in your datacenter.  Certain types of cloud services (IaaS) could be used with replication, but the replication would still need to be managed internally.  For a more detailed explanation of this see my other post about the different flavors of the cloud.


Language Differences


Not everyone in the world speaks English, and with overseas business growing, the need for multiple languages may be a consideration.  SharePoint does provide language packs that solve these issues locally, but how does one manage this on a replicated farm?  There are no easy answers, in that the verbiage will still have to be translated by people.  However, from a SharePoint technology point of view the solution can be achieved.  Audience targeting allows users in specific groups (normally AD groups) to be placed into audiences, so that a web part with the English version will show up for people in the US group while a part containing the same message in Mandarin is hidden.  For people in the China group, the English would be hidden while the Mandarin is displayed.
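
The audience targeting behavior described above can be modeled in a few lines.  This Python sketch only illustrates the filtering logic and is not the SharePoint API; the group names and web part titles are hypothetical, and it assumes a part with no target audience is shown to everyone.

```python
# Hypothetical web parts; "audiences" is the set of groups each part targets.
WEB_PARTS = [
    {"title": "Welcome (English)", "audiences": {"US-Employees"}},
    {"title": "Welcome (Mandarin)", "audiences": {"China-Employees"}},
    {"title": "Company News", "audiences": set()},  # untargeted: shown to all
]


def visible_parts(parts, user_groups):
    """Return the titles of the parts a user should see, in page order."""
    groups = set(user_groups)
    return [
        part["title"]
        for part in parts
        if not part["audiences"] or part["audiences"] & groups
    ]


print(visible_parts(WEB_PARTS, ["US-Employees"]))
print(visible_parts(WEB_PARTS, ["China-Employees"]))
```

The same page is replicated everywhere; only the user's group membership decides which language version is rendered.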


URL and DNS


Most customers don’t want people using different URLs or links to get to the corporate intranet or collaboration site.  How can we ensure that a link sent by email or posted on a site by someone in the US does not lead a European user back to the US site and negate the effects of the replication?  This can be achieved in several ways.  If the countries exist in different sub-domains or forests, the solution could be as simple as changing the IP address that their local domain controller resolves the site name to.  If the company has a flat domain in a single forest, a BIG-IP device can send users in specific subnets to the local site.  This is a common method used for global anonymous sites like Google or CNN.  In essence, your location can be determined by your IP address; search ‘IP and my location’ to realize that you can’t hide.
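
The subnet-based decision a load balancer makes can be sketched as follows.  This Python example only illustrates the lookup logic, not the configuration of any particular device; the subnets, host names and default choice are hypothetical assumptions.

```python
import ipaddress

# Hypothetical office subnets and the farm each one should be sent to.
SUBNET_TO_FARM = [
    (ipaddress.ip_network("10.1.0.0/16"), "us-farm.example.internal"),
    (ipaddress.ip_network("10.2.0.0/16"), "eu-farm.example.internal"),
]
DEFAULT_FARM = "us-farm.example.internal"  # assumed fallback for unknown subnets


def farm_for_client(client_ip: str) -> str:
    """Pick the local farm for a client based on the subnet it connects from."""
    address = ipaddress.ip_address(client_ip)
    for network, farm in SUBNET_TO_FARM:
        if address in network:
            return farm
    return DEFAULT_FARM


print(farm_for_client("10.2.14.7"))    # falls in the hypothetical EU subnet
print(farm_for_client("192.168.5.9"))  # unknown subnet, uses the default
```

Users everywhere keep the same URL; only the address it resolves or routes to changes with their location.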

Specific Government Data Location Restrictions and Laws


If the client works with governments, not unlikely if it is a global company, data restrictions may come into play.  Certain countries will not allow government data to be hosted outside the country.  Due diligence must be performed so strategic farm locations can be found.
 

Search


Several different search solutions can be selected for geographically dispersed farms.  If WAN acceleration is selected, no special configuration is needed.  If all content is synchronized, each farm will contain its own index so that content does not have to be searched over the WAN.  It only becomes a bit tricky if not all content is replicated; then the system must be set up to preserve the identity of the user so the results remain security trimmed.
