Presence updates and multiple application pools

In Lync development, as in many other areas of life, things get much more complicated when you have more than one application pool. You need to start thinking about scenarios that previously weren't possible. One important case to account for, if your application may be deployed across more than one pool of servers, is receiving presence updates from different pools of Front End servers. You can often get away with ignoring this effect, but you risk running into strange and seemingly inexplicable behaviour once in a while. This post explains the issue and what you can do about it.

The first concept that's important to this discussion is that each pool of application servers is associated with a single pool of Front End servers as its next hop, as shown in the diagram below. When an application server sends a SIP message, that message always goes to that particular Front End pool as its first step.

Front Ends and app servers

In addition, every single Lync user in a Lync environment is "homed" on exactly one pool of Front End servers. This is the pool that the user's Lync client communicates with when it signs in (registers) to Lync Server, and the pool where all SIP messages are sent first on their way to other endpoints. (There are some more details about how the client picks which specific server in the pool to sign in to, but those aren't really relevant here.) The diagram below illustrates this.

Front Ends with homed users

Where things get interesting is when a user changes his or her presence, and an application running in both app pools is subscribed to that user's presence. Here's an illustration of what happens:

Front Ends and presence update

To sum up: the user sends a presence update to his or her home server. The application pool associated with that same Front End pool receives the update directly. To get the update to the other application pool, which is associated with a different Front End pool, the home server first needs to send it to that other Front End pool. This extra hop generally introduces a slight delay between when the first (nearer) application pool receives the update and when the second one does. The delay grows if there is a lot of latency on the network link between the two pools.

If there are more than two Front End and application pools, then this process is repeated for the other pools as well.

This delay in when the application pools pick up presence updates can be problematic if the different instances of the application share state information, or if they need to coordinate in some way among themselves when acting on the updates. The servers may receive the same update at different times, leaving them with mismatched in-memory data about users, or causing presence updates to be stored out of order in a common data store.

Thankfully, each presence update has a time stamp built into the message, generated when the update is published, so every server receiving the update sees the same time stamp and can use it to keep messages in the right order across multiple servers. To see this time stamp, you need to dig a bit into the category data in the presence notification objects, rather than using the strongly-typed category classes like AggregatedPresenceState. But once you've found the time stamp, you can still use the strongly-typed classes to get at the actual presence data.

When handling presence updates from a RemotePresenceView object, you get a collection of RemotePresentityNotification objects in the event arguments. To get the time stamp, look at one of the objects in the Categories collection on the RemotePresentityNotification, and check the PublishTime property.

[csharp]
if (notification.Categories.Count > 0)
{
    DateTime publishTime = notification.Categories[0].PublishTime;
}
[/csharp]

All the category objects in a single notification should have the same publish time, so it doesn't make much difference which one you look at within a single notification.
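To put the pieces together, here's a rough sketch of a notification handler that pulls out the publish time alongside the strongly-typed aggregated state. The event and property names follow the UCMA API, but the handler itself and HandleStateChange are my own placeholders, so treat this as an illustration rather than production code.

[csharp]
// Hypothetical handler, wired up with something like:
// remotePresenceView.PresenceNotificationReceived += OnPresenceNotificationReceived;
private void OnPresenceNotificationReceived(object sender,
    RemotePresentitiesNotificationEventArgs e)
{
    foreach (RemotePresentityNotification notification in e.Notifications)
    {
        if (notification.Categories.Count == 0)
        {
            continue;
        }

        // Every category in a single notification carries the same
        // publish time, so the first one is as good as any.
        DateTime publishTime = notification.Categories[0].PublishTime;

        // The strongly-typed classes still give easy access to the data.
        if (notification.AggregatedPresenceState != null)
        {
            // HandleStateChange is a placeholder for your own logic.
            HandleStateChange(notification.PresentityUri,
                notification.AggregatedPresenceState,
                publishTime);
        }
    }
}
[/csharp]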

If you store data from the presence updates somewhere, or pass it between instances of the application, it's a good idea to use this publish time rather than a time stamp generated by your own code. This will help you avoid the time synchronization and ordering issues described above.
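As an example of what that might look like, here's one way to guard against out-of-order updates, using the publish time as the authority on ordering. The dictionary stands in for whatever shared or in-memory store you use, and all the names here are hypothetical.

[csharp]
// Hypothetical store: last known publish time per user URI.
private readonly Dictionary<string, DateTime> _lastPublishTimes =
    new Dictionary<string, DateTime>();

private bool TryApplyUpdate(string presentityUri, DateTime publishTime)
{
    DateTime lastSeen;
    if (_lastPublishTimes.TryGetValue(presentityUri, out lastSeen)
        && publishTime <= lastSeen)
    {
        // This update was published no later than one we've already
        // processed, so it's stale - drop it.
        return false;
    }

    _lastPublishTimes[presentityUri] = publishTime;
    // ... store or act on the update here ...
    return true;
}
[/csharp]

Each server running this check will converge on the same "latest" update for a given user, regardless of the order in which the updates arrived.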