SharePoint Options for Globally Distributed Environments
Introduction
Since I started working with SharePoint environments in
global companies, there has been one common complaint from remote users:
performance. Users in remote locations can feel disenfranchised by slow
page load times and the overall latency of the platform. In reality, who
wants to consume information from a corporate intranet that takes 15 seconds or
more to paint a page? There are several solutions to this issue, each with
different advantages and drawbacks, and different approaches should be
considered depending on an individual company’s unique needs.
Why is SharePoint Slow?
Current State Assessment
Information needs to be gathered before settling on any
solution. The items listed below should be ascertained first.
SharePoint Farm Best Practices Evaluation
o User activity and current architecture
o Load balanced front ends
o Load balanced Application Servers
o When are Indexing jobs run
o High Availability Evaluation
o DR Evaluation
Document Evaluation
o Size
o Quantity
o Location
o Frequency of Change
o Power Users
o Type
How do users use SharePoint?
o Publishers
o Viewers
o Collaboration
o Mixture of above
- If a mixture, has usage been delineated by publishing and team sites?
- Has the content been separated out:
- By Site Collection
- By Database
- By Web Application
- By Farm
o Publishing page load time, cached
o Publishing page load time, not cached
o Team page load time, cached
o Team page load time, not cached
o Document Uploads
- 1 MB
- 5 MB
- 10 MB
- 20 MB
Government data in foreign countries
o List of all countries that house government data
o Privacy laws of each foreign country
Planning
Compile all information into a current state document and
benchmark current performance metrics.
With the gathered information, the process of selecting the method of
resolution can begin.
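As a rough sketch of how the benchmark numbers could be collected, the Python below times any repeatable action (a page load, an upload) several times and reports the spread. The helper names `fetch_page` and `time_action` are my own for illustration, and the sketch assumes an unauthenticated URL; a real intranet will usually need credentials, and this measures download time rather than what a browser takes to paint the page.

```python
import statistics
import time
import urllib.request


def fetch_page(url, timeout=30):
    """Download a page and return its size in bytes (no authentication handled)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def time_action(action, runs=5):
    """Run `action` several times and summarize the elapsed seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

Run from a workstation in each remote office, something like `time_action(lambda: fetch_page("https://intranet.example.com/"))` (a placeholder URL) gives comparable numbers per location. The first run is typically the uncached figure and later runs the cached one.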
Possible Solutions
WAN acceleration
The first thing to know about any replication solution is
that it is going to be more complex to manage and maintain than a single
farm. Therefore, the first question that I ask when dealing with a
geo-distributed environment is ‘how dispersed is your workforce and what are
users typically doing with SharePoint?’ If the company is
geo-distributed but has very few main offices, a WAN acceleration solution
should be considered. Silver Peak and Riverbed are two WAN
accelerators that I have seen work wonders. However, they tend to be
expensive, and many companies have hundreds of offices, which makes this
solution cost prohibitive. Additionally, they work by caching at the bit level,
meaning that they are very good at accelerating content that doesn’t change
much, like web pages and documents being edited. However, they offer a limited
performance improvement when users are mainly uploading and downloading new
content.
There are several off-the-shelf third-party solutions for
object level replication. These generally
work by harnessing the native SharePoint event receivers and replicating
objects on ‘On Item Added/Modified/Deleted’ events. They can normally be set
to run immediately or as differentials, and as one-way or two-way
replication. I would strongly advise extensive testing
with two-way replication, as it tends to be somewhat shaky and
can create a multi-master situation that is difficult to recover from
without users losing data or documents.
One-way replication can also have issues, especially when creating larger
objects such as sites and site collections.
These drawbacks can make the solution intensive and costly to maintain,
and can frustrate users.
Additionally, the software that needs to be purchased can be costly, at
around 5 – 10 thousand per front end per year.
If a typical farm has two or three front ends and there are three or
four global farms, licensing costs can run into the high five to low six
figures relatively quickly. That said, these
solutions are getting better and should be considered and tested in the
customer’s environment.
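To make the licensing arithmetic concrete, here is a small Python sketch that multiplies the figures quoted above. The function name is mine, and the currency is whatever the vendor quotes in; only the per-front-end price range, the front end counts and the farm counts come from the text.

```python
def annual_license_cost(per_front_end, front_ends_per_farm, farms):
    """Yearly licensing cost when every front end in every farm needs a license."""
    return per_front_end * front_ends_per_farm * farms


# Cheapest case from the text: 5 thousand, two front ends, three farms.
low = annual_license_cost(5_000, 2, 3)
# Most expensive case from the text: 10 thousand, three front ends, four farms.
high = annual_license_cost(10_000, 3, 4)

print(low, high)  # 30000 120000
```

So the yearly bill lands somewhere between 30 thousand and 120 thousand before any maintenance effort is counted.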
Additionally, SharePoint has a built-in method for this type
of replication called content deployment.
This can be problematic because it is a one-way replication with few
safeguards. If a user updates a piece of
content in a replicated site, the next time it is modified on the master side
the content will overwrite the replicated file and users will lose their
changes.
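The lost-update problem is easy to see in a toy model. The Python below is not the content deployment API; it is just two dictionaries standing in for the master and replicated sites, with a hypothetical `deploy_one_way` function that copies everything from one to the other without checking for changes on the target.

```python
def deploy_one_way(source, target):
    """Copy every item from source to target, overwriting whatever is there."""
    for name, content in source.items():
        target[name] = content


master = {"policy.docx": "v1"}
replica = {}
deploy_one_way(master, replica)

replica["policy.docx"] = "v1 + local edit"  # a user edits the replicated copy
master["policy.docx"] = "v2"                # the author updates the master
deploy_one_way(master, replica)

print(replica["policy.docx"])  # v2
```

The local edit is gone with no warning, which is why the replicated side should be treated as read-only.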
Implementing log shipping, differentials or publication/subscription
at the SQL level can provide a robust solution to the replication
question. These options work by having a
SharePoint database as a master, then replicating to a ‘slave’ database that is
in read-only mode. The second SharePoint
farm is able to connect to the read-only database and users can consume
information from a local source. This
option works very well for applications such as corporate intranets where there
are relatively few contributors and the primary objective is disseminating
information and content company wide. It
also avoids the multi-master situation: if a site gets out of date for some
reason, e.g. a network or power outage, the entire database can be restored so
the replication can resume. The primary
disadvantage is that it is exclusively one-way.
This can be mitigated by an intelligently designed information
architecture where site collections for certain regions are placed in specific
databases and the replication is reversed for those specific collections.
But wait, your DBA tells you that with the Enterprise edition of
SQL Server, peer-to-peer (two-way) replication at the database level can be
achieved. This is true, but sadly this
option does not work with SharePoint. Peer-to-peer SQL replication works by
adding a GUID (globally unique identifier) to each table row. When this is
done, SharePoint will throw an
error because modifications to the content database are unsupported, even when
done by another Microsoft product. Trust
me, I have tried.
Geographical Data Layout
These solutions, proposed by Microsoft, are about where
the data is actually stored. They can be
implemented in conjunction with any of the WAN acceleration and replication
solutions above. Microsoft has done a
pretty good job of explaining these, so I will just point them out and repost
the Microsoft information.
Central Solution
Distributed Solution
Central with Regional Sites
Other Factors
RBS
Remote Blob Storage, or RBS, is a hot topic when dealing with
content databases that have the potential to grow very large. It keeps the size
of the database down by
storing documents or BLOBs (binary large objects) on a separate file system and
creating a pointer to the location in the SharePoint content database. The
important thing to remember
is that database replication ceases to be a viable option in this case, as the
bulk of the data is no longer stored in the database. Because object level
replication works at the
SharePoint level, it can still be achieved if RBS is selected.
The Cloud
Why don’t we just put all this in the cloud and let someone
else deal with this headache? The cloud
is not magical or all knowing; the servers still live somewhere, just not in
your datacenter. Certain types of cloud
services (IaaS) could be used with replication, but this does not remove
the need for the replication to be managed internally.
For a more detailed explanation, see my other post about the
different flavors of the cloud.
Language Differences
Not everyone in the world speaks English, and with overseas
business growing, the need for multiple languages may be a consideration.
SharePoint does provide language packs that
solve these issues locally, but how does one manage this on a replicated
farm? There are no easy answers,
in that the verbiage will still have to be translated by people. However, from
a SharePoint technology point
of view the solution can be achieved.
Audience targeting allows users in specific groups (normally AD groups)
to be placed into audiences so that a part with the English version is shown to
people in the US group, while a part containing the same message in Mandarin is
hidden. For people in the China group, the
English would be hidden while the Mandarin is displayed.
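The show/hide logic can be sketched in a few lines of Python. This is only a model of the idea, not SharePoint code: the `WebPart` class, the `visible_parts` function and the group names are all made up for illustration, and a part with no audience is assumed to be visible to everyone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebPart:
    text: str
    audiences: frozenset  # empty set means "show to everyone" (assumption)


def visible_parts(parts, user_groups):
    """Return the parts a user should see, given the groups they belong to."""
    groups = set(user_groups)
    return [p for p in parts if not p.audiences or p.audiences & groups]


page = [
    WebPart("Message (English)", frozenset({"US"})),
    WebPart("Message (Mandarin)", frozenset({"China"})),
    WebPart("Company logo", frozenset()),
]

print([p.text for p in visible_parts(page, ["China"])])
# ['Message (Mandarin)', 'Company logo']
```

The same page is replicated everywhere; what differs per user is only which parts are rendered.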
URL and DNS
Most customers don’t want people using different URLs or
links to get to the corporate intranet or collaboration site. How can we ensure
that a link sent by email
or posted on a site by someone in the US does not lead a European user back
to the US site and negate the effects of the replication? This can be achieved
in several ways. If the countries exist in different
sub-domains or forests, the solution could be as simple as changing the IP
address that
the domain controller for their specific sub-domain resolves to. If the company
has a flat domain in a single
forest, a BIG-IP device can send users in specific subnets to the local site.
This is a common method used for global anonymous
sites like Google or CNN. In essence,
your location can be determined by your IP address; search ‘IP and my location’
to realize that you can’t hide.
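The subnet-based decision a load balancer makes can be illustrated with Python’s standard `ipaddress` module. The subnets and host names below are invented for the example, and a real device would be configured through its own rules rather than code like this.

```python
import ipaddress

# Hypothetical mapping of office subnets to the nearest farm.
FARM_BY_SUBNET = {
    ipaddress.ip_network("10.1.0.0/16"): "us-farm.example.com",
    ipaddress.ip_network("10.2.0.0/16"): "eu-farm.example.com",
}
DEFAULT_FARM = "us-farm.example.com"


def nearest_farm(client_ip):
    """Pick the farm whose subnet contains the client address."""
    addr = ipaddress.ip_address(client_ip)
    for subnet, farm in FARM_BY_SUBNET.items():
        if addr in subnet:
            return farm
    return DEFAULT_FARM


print(nearest_farm("10.2.3.4"))     # eu-farm.example.com
print(nearest_farm("192.168.1.1"))  # us-farm.example.com
```

Everyone keeps using the same URL; only the farm that answers changes based on where the request comes from.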
Specific Government Data Location Restrictions and Laws
If the client works with governments, not unlikely if it is
a global company, data restrictions may come into play. Certain countries will not allow government
data to be hosted outside the country.
Due diligence must be performed so strategic farm locations can be
found.
Search
Several different search solutions can be selected for geographically
dispersed farms. If WAN acceleration is
selected, no special configuration is needed. If all content is synchronized,
each farm will
contain its own index so that content does not have to be searched over the
WAN. It only becomes a bit tricky if not all content
is replicated; then the system must be set up to preserve the identity of the
user so the results remain security trimmed.