Things I learned integrating with Apple Push Notification Service

Update:

Since I wrote this post, Portland, OR-based Urban Airship has built a thriving business around APN and added a ton of value. At this point I recommend checking them out rather than building your own integration.

Apple’s push notification service (APN) is a way to notify your app’s users of something time sensitive when your application is not actively running on their device. Because we’re still waiting on Apple for the ability to run apps in the background, it’s sometimes the best way to engage with users based on time-sensitive or important server-side events.

To get this integration going, you need to provision your application with the proper certificates, generate an SSL key, request and store an APN device token from the user’s device, and finally connect to Apple’s servers via socket to send your alerts. There are a ton of great tutorials out there about how to set this up, so I’m not going to rehash the process in too much detail. I will, though, talk about one of the more elusive hiccups I ran into while building a server-side APN system, one I was not anticipating and could not find any reference to in the developer documentation.
Feel free to skip to the pointers at the end if you don’t want the long-winded description of how the problem came to be.

When our dev team here at FanFeedr was planning the architecture of our APN integration, we decided early on to support multiple devices per user. This was months before the iPad was announced, but we figured that Apple might one day release some sort of companion device to the iPhone, and being able to push to all of a user’s devices seemed to make sense. So we created a user_devices database table on the back end that associates a user account with one or more device tokens. This seemingly innocuous decision would come back to haunt us later.

Now, when you hear a term like device token, you might assume that it’s concretely tied to the device. It’s an easy assumption to make, but you would be incorrect. There is another string, the device identifier, which is in fact unique to every device. I had assumed, incorrectly, that the device token was also unique to the device and would not change over time. Lesson number 1: be careful when making assumptions, especially when integrating with a third-party service.

APN device token requests are disabled in the Simulator, so there’s no problem there. When you’re debugging on an actual device, on the other hand, everything works as you would expect. The catch is that the device token changes depending on whether you’ve compiled the app in debug mode or are running a signed, released version of it. Whenever I compiled a debug version of the app against our production servers, usually in preparation for a release, I was unwittingly creating an additional “debug” token in our user_devices table under my account. Since I test on both an iPhone and an iPod Touch, I expected to see multiple device tokens, and did not realize that our backend servers were actually storing multiple tokens for my main device.

So what’s the big deal? The problem sprang up several days after we pushed the feature live. Our system, which had worked flawlessly during testing, started becoming unreliable. We noticed that certain devices were not getting notifications while others were. On occasion the working devices would break and the “broken” devices would suddenly start receiving notifications; the behavior was completely unpredictable… initially.

Apple’s documentation states that you should attempt to send notifications in batches to minimize the overhead of creating and destroying socket connections for each notification. During the first week things ramped up slowly and we were sending an average of about 5 notifications each cycle. Since the user_devices table was filling up with real customer tokens which outnumbered our development tokens, the system worked fine most of the time, but most of the time isn’t good enough.

After many hours spent debugging the system by writing test cases and running notifications by hand, I realized what the problem was. As soon as Apple received one of our development tokens on the production socket, they would kill the connection. On the server we never noticed the closed socket because our code would exit normally without an exception. Everything looked like it was working properly from our side, mainly because sending notifications to Apple’s APN service is a bit like doing this. What hid the problem for so long was that any message would go through as long as it came before the “bad” token in the batch. Thankfully, that same behavior is what eventually helped me find the issue and correlate it with the offending token(s).
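To make that failure mode concrete, here is a small simulation of it. This is only a sketch of the behavior described above, not our actual server code: `FakeGateway` and `send_batch` are hypothetical names, and the fake gateway stands in for Apple’s socket so the example runs without a network connection.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGateway:
    """Hypothetical stand-in for the push socket (not Apple's real API)."""
    bad_tokens: set
    delivered: list = field(default_factory=list)
    closed: bool = False

    def send(self, token, message):
        # Once the connection has been dropped, writes vanish without an error,
        # which is what made the problem invisible from the sender's side.
        if self.closed:
            return
        # A development token on the production connection kills the connection.
        if token in self.bad_tokens:
            self.closed = True
            return
        self.delivered.append((token, message))


def send_batch(gateway, batch):
    """Write every notification and return how many the sender believes it sent."""
    for token, message in batch:
        gateway.send(token, message)
    return len(batch)


gateway = FakeGateway(bad_tokens={"debug-token"})
batch = [
    ("prod-token-1", "Game starting"),
    ("debug-token", "Game starting"),
    ("prod-token-2", "Game starting"),
]
believed_sent = send_batch(gateway, batch)
print(believed_sent, len(gateway.delivered))
```

The sender believes three notifications went out, but only the one queued ahead of the debug token is delivered. Which devices get skipped depends entirely on where the bad token happens to land in each batch, which is why the failures looked random.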

So here’s the first thing to keep in mind when you build a system of your own:

If you send Apple an invalid device token, they may drop your connection and you might not even realize it. You can avoid this particular problem by keeping your production and development device tokens separate and never mixing them in the same database table! Treat the tokens as if they’re perishable and could expire daily, and check Apple’s feedback service often.
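Here is one way that advice could look in code. Treat it as a minimal sketch under some assumptions: `TokenStore` and `parse_feedback` are names made up for this example, the in-memory dictionaries stand in for two separate database tables, and the record layout (a 4-byte timestamp, a 2-byte token length, then the token, all in network byte order) is my understanding of the feedback service’s binary format, so check it against Apple’s documentation before relying on it.

```python
import struct
from collections import defaultdict

# Assumed feedback record header: 4-byte timestamp, 2-byte token length.
FEEDBACK_HEADER = struct.Struct("!IH")


def parse_feedback(data):
    """Return [(unix_timestamp, token_hex), ...] from raw feedback bytes."""
    records = []
    offset = 0
    while offset + FEEDBACK_HEADER.size <= len(data):
        timestamp, length = FEEDBACK_HEADER.unpack_from(data, offset)
        offset += FEEDBACK_HEADER.size
        token = data[offset:offset + length]
        if len(token) < length:
            break  # truncated record; wait for more data
        offset += length
        records.append((timestamp, token.hex()))
    return records


class TokenStore:
    """Hypothetical store that keeps each environment's tokens apart."""

    def __init__(self):
        self._tokens = {
            "development": defaultdict(set),
            "production": defaultdict(set),
        }

    def register(self, environment, user_id, token_hex):
        # An unknown environment raises KeyError instead of being guessed at.
        self._tokens[environment][user_id].add(token_hex)

    def tokens_for(self, environment, user_id):
        return set(self._tokens[environment].get(user_id, set()))

    def prune(self, environment, feedback_records):
        """Drop every token the feedback service reported as stale."""
        stale = {token for _, token in feedback_records}
        for tokens in self._tokens[environment].values():
            tokens.difference_update(stale)


store = TokenStore()
store.register("production", 1, "aa" * 32)
store.register("development", 1, "bb" * 32)

raw = struct.pack("!IH", 1262304000, 32) + bytes.fromhex("aa" * 32)
store.prune("production", parse_feedback(raw))
```

Because the environment is part of every lookup, a debug build can register as many tokens as it likes without any of them ending up in a production batch.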

For the sake of completeness here’s a list of other things I think it’s helpful to check if you’re having trouble integrating with APNS.

  • Notifications may not work reliably on a jailbroken device.
  • Double check that you concatenated your certificate and key properly into your .pem file.
  • Make sure the user that connects to Apple’s servers has permission to access the SSL key file.
  • If the SSL key file has a password, be sure to set it in the code.
  • Double check you can connect to Apple’s server from your machine with Telnet: `telnet gateway.sandbox.push.apple.com 2195`
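A few of the checks in this list are easy to script. The sketch below is one way to do it, with made-up helper names (`check_pem`, `can_connect`). The .pem check only looks for the usual PEM markers, so it will catch a missing certificate or key but cannot tell you whether the two actually belong together.

```python
import os
import socket


def check_pem(path):
    """Return a list of problems found with a combined certificate + key file."""
    if not os.path.isfile(path):
        return [f"{path} does not exist"]
    if not os.access(path, os.R_OK):
        return [f"{path} is not readable by the current user"]

    with open(path, "r", encoding="ascii", errors="replace") as handle:
        text = handle.read()

    problems = []
    if "-----BEGIN CERTIFICATE-----" not in text:
        problems.append("no certificate block found")
    if "PRIVATE KEY-----" not in text:
        problems.append("no private key block found")
    if "ENCRYPTED" in text:
        problems.append("key is password protected; set the passphrase in code")
    return problems


def can_connect(host, port, timeout=5.0):
    """Plain TCP reachability check, the same thing the telnet test tells you."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run `check_pem` as the same user your push process runs as, since the permission check is only meaningful for that account. `can_connect("gateway.sandbox.push.apple.com", 2195)` is the scripted equivalent of the telnet command above.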
