I’m going to talk you through the challenges we faced migrating from a third-party chat to a custom XMPP-based messaging solution for our client, Forward Health, a UK-based developer of a healthcare messaging application. This article will cover the reasons for migrating, our expectations versus the realities of implementation, and the challenge of building additional functionality.
Where we started
Forward Health, our client, wanted to build a mobile communications application for healthcare workers in the UK, including chat functionality. As a startup, they wanted to show their working product quickly. At the same time, the messaging had to be reliable, robust and able to send sensitive patient data securely. To achieve this, we decided to use one of the available third-party solutions for chat functionality.
Chat functionality is not a trivial thing, especially when it is meant to support the healthcare industry. As the app grew, we encountered more edge cases and some bugs on the library side that the third party was unwilling to work on. Additionally, Forward Health wanted to add new features that weren’t supported by the third-party library. Switching to a custom solution was the next step.
That’s when we started working with MongooseIM (MIM). MIM is an open source solution based upon the well-established XMPP protocol. We worked with an external company, Erlang Solutions Limited, to set up our backend and provide support with implementing custom solutions.
At first, everything about messaging seemed different. Previously, we had all of our needs met by the SDK and its REST API. Now, using MongooseIM, we had to take some time to understand the nature of XMPP and implement our own SDK. It turned out that the “bare bones” XMPP server only passed stanzas (XML messages) between clients in real-time. Stanzas can be of different types, for example normal chat messages, presence, requests and responses. A wide variety of modules can be added to the server to, for example, store messages and let the clients query them.
On the client side (Android, iOS) there were some low-level SDKs. Unfortunately, they only acted as a layer that enabled communication with MongooseIM and some of its pluggable modules, which implement XEPs (XMPP Extension Protocols) and are responsible, among other things, for sending push notifications for every message. The whole architecture for handling, storing and querying messages had to be implemented by our team.
What came to our rescue was the third-party library that we had used previously. It had a very well-thought-through API, so we made our solution work in a similar way. We separated XMPP-specific code into our internal SDK, with an interface corresponding to the one from the previous solution. This resulted in only a few changes in our application code after migration.
During the implementation of MongooseIM, we were surprised several times by elements we thought would be standard, but weren’t available to us, even by XEP.
Implementing key features of XMPP-based chat
You may think, as we did, that timestamps would be as simple as “I get a message, I display this on the UI with a timestamp.” Nope, not that easy. By default, the message stanzas don’t have a timestamp field. Fortunately for our team, XMPP is an easily extensible protocol. On the backend, we implemented a custom feature, adding a timestamp to every message that passed through the MongooseIM server. Then the recipient would have the timestamp attached to the message.
Why couldn’t a sender add a timestamp themselves? Well, we don’t know if they have the correct time set on their phone.
Why isn’t there any XEP for that? Maybe because XMPP is a real-time protocol, so theoretically every message sent is received right away.
EDIT: As Florian Schmaus pointed out: “There actually is one, although it can easily be missed because of its confusing name: XEP-0203: Delayed Delivery.” It adds a timestamp to a message only if its delivery is delayed. Otherwise, the message was sent just now.
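To make the Delayed Delivery behaviour concrete, here is a minimal client-side sketch in Python. It assumes the stanza arrives as an XML string and that the client falls back to its own receive time when no `<delay/>` element is present. The function name `message_timestamp` is made up for illustration, and this is not the custom server-side timestamp feature described above, whose format is not shown in this article.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

DELAY_NS = "urn:xmpp:delay"  # namespace defined by XEP-0203


def message_timestamp(stanza_xml: str, received_at: datetime) -> datetime:
    """Return the XEP-0203 stamp if the stanza carries one, else received_at."""
    stanza = ET.fromstring(stanza_xml)
    delay = stanza.find(f"{{{DELAY_NS}}}delay")
    if delay is None or "stamp" not in delay.attrib:
        # No delay element: the message was delivered in real time.
        return received_at
    # Stamps are UTC, e.g. 2002-09-10T23:08:25Z; normalise the "Z" suffix
    # so that older Python versions can parse it.
    stamp = delay.attrib["stamp"].replace("Z", "+00:00")
    return datetime.fromisoformat(stamp)


if __name__ == "__main__":
    delayed = (
        "<message xmlns='jabber:client'><body>hi</body>"
        "<delay xmlns='urn:xmpp:delay' stamp='2002-09-10T23:08:25Z'/></message>"
    )
    print(message_timestamp(delayed, datetime.now(timezone.utc)))
```

Note that the fallback still relies on the recipient's clock, which is exactly the weakness that a server-side timestamp on every message avoids.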
When both users are logged into the application, they can send messages to each other in real-time. But what if one of them is offline? The quick answer is: messages have to be buffered on the backend. The offline messages feature handles this work and sends all buffered stanzas to the user once they log back in.
But then several questions arise:
- How long should these messages be buffered for?
- How many of them?
- Should they be re-sent just after logging back in? But it will flood the client with the messages, won’t it?
- What if a user only logs in, but doesn’t enter the chat with the new messages? Will they all be gone?
- What if a user is logged in on multiple devices?
It became apparent that the Offline Message feature was only able to send messages to the first device to come back online, and those messages would then be lost for all other devices. We decided to discard this feature, and store the messages on the XMPP backend in a different, persistent way.
Message Archive Management (MAM)
MAM is on-server storage for messages. When a client is logged in, they can query the server for messages. You can query by pages, you can query by dates. It’s flexible — you can even query for a page before or after a message with a specific ID, adding filters for messages from the exact conversation.
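As an illustration of such a query, the sketch below builds a MAM request stanza in Python. It assumes the `urn:xmpp:mam:2` namespace from XEP-0313 together with Result Set Management paging; the namespace version our deployment actually used is not stated here, and the function name, JID and IDs are invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MAM_NS = "urn:xmpp:mam:2"                   # XEP-0313 (version is an assumption)
DATA_NS = "jabber:x:data"                   # XEP-0004 data forms
RSM_NS = "http://jabber.org/protocol/rsm"   # XEP-0059 result set management


def build_mam_query(query_id: str, with_jid: str, page_size: int,
                    after_id: Optional[str] = None,
                    before_id: Optional[str] = None) -> str:
    """Build an IQ asking for one page of archived messages with `with_jid`."""
    iq = ET.Element("iq", {"type": "set", "id": query_id})
    query = ET.SubElement(iq, f"{{{MAM_NS}}}query", {"queryid": query_id})

    # Filter: only messages exchanged with one conversation partner.
    form = ET.SubElement(query, f"{{{DATA_NS}}}x", {"type": "submit"})
    form_type = ET.SubElement(
        form, f"{{{DATA_NS}}}field", {"var": "FORM_TYPE", "type": "hidden"})
    ET.SubElement(form_type, f"{{{DATA_NS}}}value").text = MAM_NS
    with_field = ET.SubElement(form, f"{{{DATA_NS}}}field", {"var": "with"})
    ET.SubElement(with_field, f"{{{DATA_NS}}}value").text = with_jid

    # Paging: page size plus an optional anchor message ID.
    rsm = ET.SubElement(query, f"{{{RSM_NS}}}set")
    ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = str(page_size)
    if after_id is not None:
        ET.SubElement(rsm, f"{{{RSM_NS}}}after").text = after_id
    if before_id is not None:
        ET.SubElement(rsm, f"{{{RSM_NS}}}before").text = before_id
    return ET.tostring(iq, encoding="unicode")


if __name__ == "__main__":
    print(build_mam_query("q1", "doctor@example.com", 20, after_id="abc123"))
```

Date filters would go into the same data form as additional fields; they are left out to keep the sketch short.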
But here is the catch. Normal chat messages are stored wrapped inside MAM messages, which have their own unique IDs. When a user receives a chat message in a stream, it doesn’t contain the MAM ID. They have to query the MAM to get it.
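The wrapping can be seen in a small Python sketch that unwraps a MAM result. It assumes the XEP-0313 layout (`result`, then `forwarded`, then the original `message`) with the `urn:xmpp:mam:2` namespace; the `ArchivedMessage` type and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional

MAM_NS = "urn:xmpp:mam:2"           # version is an assumption
FORWARD_NS = "urn:xmpp:forward:0"   # XEP-0297 stanza forwarding
CLIENT_NS = "jabber:client"


class ArchivedMessage(NamedTuple):
    mam_id: str             # ID of the MAM wrapper, used for paging
    sender: Optional[str]   # "from" attribute of the wrapped chat message
    body: Optional[str]


def unwrap_mam_result(stanza_xml: str) -> Optional[ArchivedMessage]:
    """Return the wrapped chat message, or None for a plain live message."""
    outer = ET.fromstring(stanza_xml)
    result = outer.find(f"{{{MAM_NS}}}result")
    if result is None:
        return None  # a live message has no MAM wrapper and no MAM ID
    forwarded = result.find(f"{{{FORWARD_NS}}}forwarded")
    inner = forwarded.find(f"{{{CLIENT_NS}}}message") if forwarded is not None else None
    if inner is None:
        return None
    return ArchivedMessage(
        mam_id=result.get("id", ""),
        sender=inner.get("from"),
        body=inner.findtext(f"{{{CLIENT_NS}}}body"),
    )
```

A message received live in the stream returns `None` here, which mirrors the catch described above: the MAM ID only exists on the archived copy.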
Retrieving from MAM is a network request, which means it can take a relatively long time. When a user enters a chat, they want to see messages immediately. So we also need a local database.
When a user gets a message in a stream (an online message), we save it to the local database and show it to the user. That way, we display messages that arrive in real-time rapidly to the user.
Additionally, every time they enter the chat screen, we download every message between the newest MAM message stored in the local DB for that conversation and the present, and put them into the database, ignoring duplicates.
This is how we handle storing old messages. Also, we are sure that in the database there is a complete set of messages for a specific conversation between the first and the last message from MAM.
To keep track of the messages downloaded from MAM, we’ve added two properties to conversation entities:
- MAM id of the newest MAM message in the database
- MAM id of the oldest MAM message in the database
Keeping that range contiguous matters, because handling fragmented sets of MAM messages in a local database would be very problematic.
Additionally, having these two properties for every conversation allows us to store normal chat messages in the database while ignoring the wrapper — MAM message. And when the user enters the chat, we can show the latest messages from the database and in the background fetch the missing messages from MAM.
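The bookkeeping described above can be sketched with SQLite in Python. This is a simplified model, not our production schema: the table layout and function names are invented, and it assumes every chat message carries a stable `stanza_id` that is identical on the live stanza and on its archived copy, so that duplicates can be ignored.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# One MAM page as (mam_id, stanza_id, body) tuples, oldest first (a stand-in).
MamPage = List[Tuple[str, str, str]]


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE conversations (
            id TEXT PRIMARY KEY,
            newest_mam_id TEXT,   -- MAM id of the newest archived message stored
            oldest_mam_id TEXT    -- MAM id of the oldest archived message stored
        );
        CREATE TABLE messages (
            conversation_id TEXT NOT NULL,
            stanza_id TEXT NOT NULL,
            body TEXT,
            PRIMARY KEY (conversation_id, stanza_id)
        );
    """)
    return conn


def save_message(conn: sqlite3.Connection, conversation_id: str,
                 stanza_id: str, body: str) -> None:
    """Store a chat message without its MAM wrapper; duplicates are ignored."""
    conn.execute("INSERT OR IGNORE INTO conversations (id) VALUES (?)",
                 (conversation_id,))
    conn.execute("INSERT OR IGNORE INTO messages VALUES (?, ?, ?)",
                 (conversation_id, stanza_id, body))


def sync_from_mam(conn: sqlite3.Connection, conversation_id: str,
                  fetch_after: Callable[[Optional[str]], MamPage]) -> None:
    """Fetch everything newer than the stored newest MAM id and save it."""
    conn.execute("INSERT OR IGNORE INTO conversations (id) VALUES (?)",
                 (conversation_id,))
    row = conn.execute("SELECT newest_mam_id FROM conversations WHERE id = ?",
                       (conversation_id,)).fetchone()
    page = fetch_after(row[0])  # hypothetical network call to MAM
    if not page:
        return
    for _mam_id, stanza_id, body in page:
        save_message(conn, conversation_id, stanza_id, body)
    conn.execute(
        "UPDATE conversations SET newest_mam_id = ?, "
        "oldest_mam_id = COALESCE(oldest_mam_id, ?) WHERE id = ?",
        (page[-1][0], page[0][0], conversation_id))
```

In this model `save_message` is used both for live messages and for archived ones, so the chat screen can read from one table while `sync_from_mam` fills gaps in the background.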
Every chat-based app needs a screen with a list of chats—a place where you can see names, last messages and an unread message count. There must be a solution to that!
Actually, there’s not… There is something called Roster — it can hold a list of users tagged as “friends.” Unfortunately, there’s no last message, nor unread message count attached to them. Sure, you can get the needed information from the backend in pieces. At first, we wanted to do it that way, but it would work slowly and be complicated to do. That’s when we began working with Erlang Solutions on the Inbox feature, which is also making its way to open source.
When a user connects to the XMPP backend, the app fetches their inbox, which contains all conversations of that user — both one-to-one and team chats. Each of them has the last message attached to it and a count of unread messages. The application saves the whole inbox to the local database. When a user is in the app, and a new message arrives, we update the inbox state locally. That way the app doesn’t need to fetch the inbox for every new message.
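The local update step might look like the following Python sketch. It models only the client-side state after the inbox has been fetched; the class and field names are hypothetical, and the rule that an open conversation does not accumulate unread messages is an assumption rather than something taken from our implementation.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class InboxEntry:
    last_message: str
    unread_count: int = 0


class LocalInbox:
    """Client-side copy of the inbox, kept up to date without refetching."""

    def __init__(self, fetched: Dict[str, InboxEntry]) -> None:
        # Copy the entries fetched from the backend at connection time.
        self._entries = {cid: replace(entry) for cid, entry in fetched.items()}
        self._open_conversation: Optional[str] = None

    def on_message(self, conversation_id: str, body: str) -> None:
        """Apply a newly arrived message to the local inbox state."""
        entry = self._entries.setdefault(conversation_id, InboxEntry(body))
        entry.last_message = body
        if conversation_id != self._open_conversation:
            entry.unread_count += 1

    def open_conversation(self, conversation_id: str) -> None:
        """Mark a conversation as currently on screen and read."""
        self._open_conversation = conversation_id
        if conversation_id in self._entries:
            self._entries[conversation_id].unread_count = 0

    def entry(self, conversation_id: str) -> InboxEntry:
        return self._entries[conversation_id]
```

For example, starting from `LocalInbox({"team-a": InboxEntry("Morning", 2)})`, a call to `on_message("team-a", "Bed 4 is free")` updates the last message and raises the unread count without any request to the backend.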
Some third-party chat solutions provide a high level of abstraction. This is ok if you want to create a simple chat application. By implementing our own XMPP-based solution in the Forward app, we were able to get far better low-level access, which made solving issues much easier. Sure, it took some time, but now we know that we can provide any custom feature to help doctors in the UK communicate in a secure and easy manner approved by the NHS.
Messaging is all about high-performance, real-time communication. By switching to MIM we were able to optimise every part of the solution to improve speed, reliability, and ultimately trust. We now have the entire codebase under our control, so bugs are easy to track down. We are also past the stabilisation phase, and the number of reports connected to messaging has decreased drastically. Users are happy with being able to trust the platform.
Designing and writing our own SDK was a challenging task and we liked it. It was something different from simple applications where you need to fetch data from a server and show it on the screen. During implementation, we understood many design choices of the third-party library API that we used previously. Why? Because we’ve encountered the same issues.