Authors: Fernando Rodríguez Sela, Guillermo Lopez Leal, Ignacio Eliseo Barandalla Torregrosa
Today, mobile applications retrieve information asynchronously from multiple sites. Developers have two ways to retrieve this information:
- Poll: Periodically query for information from the server.
- Push: The server sends the information to the client when the required information is available.
The first method is strongly discouraged because it makes a large number of needless connections to the server: most of the time no new information is available, so time and resources are wasted.
This is why Push methods are widely used for information retrieval. However, the way Push platforms are currently built misuses mobile radio resources and consumes too much battery.
This article explains how this kind of messaging works, the problems with existing solutions, and finally how Telefónica Digital, as part of the development of the Firefox OS operating system, devised a new solution designed to be friendly to the network and to use little battery on mobile terminals.
State of the art
Historically, mobile operators have offered (and still offer) a true Push notification mechanism, known as WAP PUSH. This technology can “wake up” applications when the server side requires some action from them, without any interaction from the user. WAP PUSH messages are sent in the circuit-switched domain, the same one used for voice and SMS, which is why the user does not need to establish a data connection. This kind of message works properly out of the box.
WAP PUSH solutions work well while the user is registered in the mobile network, but if the device is out of coverage or connected to a WiFi hotspot instead of a cellular network, it cannot receive the messages.
In addition, sending these messages has an economic cost (each one is basically a normal SMS). As a result, the major smartphone operating systems (Apple iOS and Google Android) have implemented a parallel solution that works regardless of the mobile network the user belongs to and that also runs smoothly over WiFi networks.
Issues with current solutions
These alternative solutions are based on a public server, accessible from the Internet, through which all Push messages for the mobile devices are channeled. This server acts as an intermediary between third-party app servers and the devices.
These new Push messaging platforms had a big problem to solve: mobile devices usually sit inside private networks (not accessible from the Internet), and their communications are managed by NAT servers and firewalls that block access to the phone from outside the private network, that is, from the Internet. To solve this, the different operating systems decided to open a connection from the device to their proprietary servers and to keep it open, using this channel to receive Push notifications.
Both Apple and Google have used this approach. It forces the phone to keep a permanently open connection with their server. To prevent that connection from being closed, either by TCP's own timers or by the timers of the NATs it has to pass through to reach the platform server, the phone sends small data packets known as keep-alives. These packets carry no valuable data; they are used only to signal that the connection is still in use.
This solution has some serious issues:
- Keeping connections open on intermediate routers notably reduces the performance of these devices, causing big scalability problems on mobile networks. For example, in Spain there are more than 15 million smartphones connected to the Internet.
- Signaling storms: mobile networks use a lot of signaling messages to manage the location of a device, its status, the establishment of new connections… Each time a device sends a small data packet, a large number of signaling events are generated. Multiplying the signaling produced by each device by the number of smartphones, networks become saturated by signaling messages alone, which lowers the quality of service they provide.
- These solutions, which at first glance are valid on traditional networks (such as wired or WiFi home networks), are not valid on a cellular network because of how the radio modem in a mobile phone operates. The modem is designed to lower battery usage when it is not transmitting any data. However, the Push solutions used by the main operating systems need to send keep-alive messages, which prevent the phone from entering that idle, low-energy state.
A side effect of these solutions is that device batteries drain quickly performing redundant operations (such as sending small messages to keep connections from being closed).
These problems get worse when we see that many applications (most of them in the top 10 of each market) create their own connections instead of using the mechanism provided by the operating system, so they literally multiply the problems with each new application.
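To get a feeling for the scale of the problem, the following minimal Python sketch multiplies the number of devices by the number of keep-alive packets each one sends per day. The 15 million devices come from the figure quoted above; the 10-minute interval and the number of connections per device are illustrative assumptions, and the function name `keepalives_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def keepalives_per_day(devices: int, interval_s: int,
                       connections_per_device: int = 1) -> int:
    """Total keep-alive packets sent in one day by a population of devices.

    Every open connection sends one keep-alive each `interval_s` seconds.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    per_connection = SECONDS_PER_DAY // interval_s
    return devices * connections_per_device * per_connection


if __name__ == "__main__":
    # 15 million smartphones (figure from the text), one connection each,
    # one keep-alive every 10 minutes (assumed interval).
    print(keepalives_per_day(devices=15_000_000, interval_s=600))
```

Each of those packets also triggers its own signaling in the network, and every extra application that opens its own connection raises `connections_per_device`, which multiplies the total again.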
So, what is happening on the mobile network?
As we said before, mobile networks do not work the same way as a WiFi network, and designing the previous solutions without first considering how this kind of network works leads to the problems described in the previous section.
So, to really understand the problem, we need to know the different states the radio modem can be in.
The 3GPP TS 23.060 specification covers everything related to the states of the GMM layer (GPRS Mobility Management), which corresponds to the packet-switched domain on mobile devices. The 3GPP TS 25.331 specification covers the RRC layer (Radio Resource Control), where the radio states are defined.
Combining the radio states and the GPRS states tells us the actual state of the terminal.
NOTE: To simplify, the next table assumes there is no activity in the circuit-switched domain, so there are no voice calls:
|State|Description|
|---|---|
|Cell_DCH|The phone is transmitting or receiving data using a dedicated channel or an HSPA shared channel. Cell_DCH timers are really short: if no data has been transmitted or received during the past few seconds, the timer moves the phone to Cell_FACH. This timer is known as T1 and can vary between 5 and 20 seconds.|
|Cell_FACH|The phone was transmitting or receiving data a few seconds ago and, due to inactivity (>T1), has been moved from Cell_DCH to Cell_FACH. If inactivity continues for more than T2 seconds, RRM orders the phone to move to Cell_PCH, URA_PCH or radio Idle. The phone may also be in this state while transmitting or receiving small data packets, like pings, keep-alives or cell updates… The T2 timer usually lasts around 30 seconds.|
|Cell_PCH or URA_PCH (signaling connection available)|The phone was in Cell_FACH a few seconds ago and, due to inactivity (>T2), RRM has moved it from Cell_FACH to Cell_PCH or URA_PCH. The signaling connection is still available, even though no data is being sent right now. If new data must be sent, the connection does not need to be re-created; the old one is reused.|
|Cell_PCH or URA_PCH (signaling connection released, PDP context active)|The phone is not transmitting data and the signaling connection has been released; however, the PDP context is still active and the phone has a valid IP address. This makes it one of the most interesting states to keep the phone in: battery usage is really low and the phone keeps its IP address, so it can receive data from the network. In this state no resources are wasted: neither network, nor battery, nor traffic… Even so, the phone can send and receive data at any moment. As soon as the data link needs to be established, by the phone or by the network, the radio state can easily be changed from Cell_PCH or URA_PCH to Cell_FACH or Cell_DCH. This change takes less than half a second and requires little signaling.|
|Idle (PDP context active)|This state is the same as the one above, but the radio is in Idle mode. When the phone stays in Cell_PCH or URA_PCH without any activity for longer than the time set by T3, the RRM moves the radio from *_PCH to Idle. Re-establishing the radio link from this state takes more than 2 seconds and a lot of signaling.|
|Idle (no PDP context)|The phone is not transmitting any data and no connection is established, not even for signaling. The phone has no PDP context either, so it has no IP address. If a phone has a PDP context, it will probably be closed automatically after 24 hours of inactivity (without sending or receiving anything).|
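The inactivity timers in the table can be read as a simple chain of transitions. The sketch below, in Python, assumes the phone was in Cell_DCH when it last moved data and that it always passes through every state in order (the table notes that the network may also move it from Cell_FACH straight to Idle). The T1 and T2 defaults are picked from the ranges given in the table; the T3 default is a hypothetical value, since the text does not give one. The names `RadioTimers` and `radio_state` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RadioTimers:
    t1: float = 10.0   # Cell_DCH -> Cell_FACH; the table gives 5 to 20 s
    t2: float = 30.0   # Cell_FACH -> *_PCH; the table gives about 30 s
    t3: float = 600.0  # *_PCH -> Idle; hypothetical, not stated in the text


def radio_state(idle_seconds: float, timers: RadioTimers = RadioTimers()) -> str:
    """Radio state after `idle_seconds` without sending or receiving data."""
    if idle_seconds < 0:
        raise ValueError("idle_seconds must not be negative")
    if idle_seconds <= timers.t1:
        return "Cell_DCH"
    if idle_seconds <= timers.t1 + timers.t2:
        return "Cell_FACH"
    if idle_seconds <= timers.t1 + timers.t2 + timers.t3:
        return "Cell_PCH/URA_PCH"
    return "RRC Idle"


if __name__ == "__main__":
    for seconds in (0, 25, 100, 1000):
        print(seconds, radio_state(seconds))
```

Any packet sent or received resets `idle_seconds` to zero, which is why a periodic keep-alive can stop the phone from ever reaching the lower states.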
Battery consumption in each of these states is:
- RRC Idle – 1 relative unit of battery usage.
- Cell_PCH – less than 2 relative units.
- URA_PCH – less than Cell_PCH in mobility scenarios, and the same in scenarios without mobility.
- Cell_FACH – 40 times Idle consumption.
- Cell_DCH – 100 times Idle consumption.
It is easy to see how the previous solutions, with their keep-alive packets, prevent the device from staying for long in the low-consumption Idle state.
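A rough way to quantify this is to average the relative consumption over one keep-alive interval. The Python sketch below is a simplified model, not a measurement: it assumes that every keep-alive puts the radio in Cell_DCH (small packets may in practice be handled in Cell_FACH), that the phone then steps down through the states as the timers expire, and that Cell_PCH costs its upper bound of 2 units. The timer values are the same assumptions as before (T3 is hypothetical), and `average_cost` is an illustrative name.

```python
# Relative consumption per second, taken from the list above
# (Cell_PCH uses its stated upper bound of 2 units).
COST = {"Cell_DCH": 100.0, "Cell_FACH": 40.0, "Cell_PCH": 2.0, "RRC Idle": 1.0}


def average_cost(interval_s: float, t1: float = 10.0, t2: float = 30.0,
                 t3: float = 600.0) -> float:
    """Average relative consumption when a keep-alive is sent every `interval_s` s."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    remaining = interval_s
    energy = 0.0
    # Walk down the states, spending at most each timer's duration in each one.
    for state, duration in (("Cell_DCH", t1), ("Cell_FACH", t2), ("Cell_PCH", t3)):
        spent = min(remaining, duration)
        energy += spent * COST[state]
        remaining -= spent
    # Whatever is left of the interval is spent in Idle.
    energy += remaining * COST["RRC Idle"]
    return energy / interval_s


if __name__ == "__main__":
    for interval in (30, 600, 3600):
        print(interval, round(average_cost(interval), 2))
```

Under these assumptions, the shorter the keep-alive interval, the closer the average gets to the Cell_DCH figure, while a phone that is only woken up when a notification arrives stays near 1 unit.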
Solution proposed by Telefónica
With all those problems on the table, and with the intention of building an operating system with new and innovative solutions, Telefónica Digital, in collaboration with Mozilla, has designed a new notification system that avoids keeping connections open inside a mobile network.
This new solution, implemented and distributed entirely as open source, defines not only how the notification server must communicate with the devices, but also the different APIs used to communicate with the server itself.
To solve the previous problems, this solution can use two different communication channels with the mobile devices. When the device is on a network not managed by the carrier, for example the Wi-Fi at home, the connection is kept open, as other solutions do, using the HTML5 standard WebSockets. When the device is on a mobile network managed by the carrier, the server closes the WebSocket connection and wakes the device up when there are messages for it.
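The choice between the two channels can be sketched as a small decision function. This Python sketch is only an illustration of the rule described above, not the actual server code: the `Channel` enum, the `choose_channel` function and its two boolean parameters are hypothetical names, and how the server learns which network a device is on is outside its scope.

```python
from enum import Enum


class Channel(Enum):
    WEBSOCKET = "keep the WebSocket connection open"
    UDP_WAKEUP = "close the WebSocket and wake the device up on demand"


def choose_channel(on_mobile_network: bool,
                   network_managed_by_carrier: bool) -> Channel:
    """Pick the notification channel for a device.

    The wake-up channel is only usable on a mobile network managed by
    the carrier; everywhere else (for example home Wi-Fi) the connection
    has to stay open.
    """
    if on_mobile_network and network_managed_by_carrier:
        return Channel.UDP_WAKEUP
    return Channel.WEBSOCKET


if __name__ == "__main__":
    print(choose_channel(on_mobile_network=True, network_managed_by_carrier=True))
    print(choose_channel(on_mobile_network=False, network_managed_by_carrier=False))
```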
To wake up the phone, the notification platform relies on a simple but elegant solution:
- When the device establishes a data connection by setting up a PDP context, its IP address (public or private) is held by the carrier's servers (the GGSN) and not by the terminal itself, so the IP address is not lost even if the device enters a low-consumption mode.
- When the device is in Idle mode but has a data connection enabled, and the network needs to send it some data (the GGSN has received a TCP or UDP packet for the terminal’s IP address), the network sends a signaling message known as PAGING, which “wakes up” the phone and takes it out of Idle mode. This PAGING message is similar to the one the cellular network uses to notify the phone that it needs to attend to a circuit-switched event (voice, SMS…).
- Given this behavior of mobile networks, the only piece left to finish the puzzle is the ability to send a message directly to the phone. Because the phone is inside the mobile network (with a private IP address), it is necessary to place a server inside each OB, that is, inside each mobile network.
This is the role of the WakeUp server inside a mobile network: it sends a small UDP packet to the phone's IP address to “wake it up”. Once the network has woken the phone up to deliver this packet, the phone connects to the notification server over WebSockets to retrieve its pending messages.
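The wake-up step itself needs nothing more than a UDP datagram. The following Python sketch uses the standard `socket` module to show both sides on the loopback interface: a sender that plays the role of the WakeUp server and a listener that plays the role of the phone. The function names, the one-byte payload and the use of an arbitrary port are assumptions made for illustration; the real packet format and port are not described in this article, and the WebSocket reconnection is only indicated by a comment.

```python
import socket


def send_wakeup(device_ip: str, port: int, payload: bytes = b"\x00") -> None:
    """WakeUp server side: send one small UDP datagram to the device."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (device_ip, port))


def wait_for_wakeup(sock: socket.socket, timeout_s: float) -> bool:
    """Device side: return True if a datagram arrives before the timeout."""
    sock.settimeout(timeout_s)
    try:
        sock.recvfrom(64)  # the content is irrelevant, arrival is the signal
        return True
    except socket.timeout:
        return False


if __name__ == "__main__":
    # Loopback demonstration: the "phone" listens on a port chosen by the OS.
    phone = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    phone.bind(("127.0.0.1", 0))
    port = phone.getsockname()[1]

    send_wakeup("127.0.0.1", port)
    if wait_for_wakeup(phone, timeout_s=1.0):
        # Here the phone would open its WebSocket connection to the
        # notification server and fetch the pending messages.
        print("woken up")
    phone.close()
```

On a real mobile network the interesting part happens below this code: the datagram addressed to the phone's IP is what makes the network page the device and bring its radio out of Idle.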
Some examples from real apps
We cannot disclose the names of the apps analyzed here, nor the names of the terminals on which they were tested. But we can say that they are very popular apps that you probably have installed on your phone, tested on the most popular phones on the market, running the most widely used operating systems.
The following graphs show the data consumption of these applications when they are idle, that is, doing nothing. Colors indicate different terminals.
We can observe that some applications send only occasional small bursts, while others transmit constantly.
Study made by: Telefonica – NSN Smartphones lab
As we can see, the apps try to keep their connections open by sending “pings” at short, regular intervals. While the first graph shows an app that sends a message every 10 minutes, in the second and the last one messages are sent continuously, keeping the phone in the highest radio state all the time (remember the table and the relative consumption) and wasting resources just to say: “I’m alive”.
With our solution, these regular “pings” are completely removed: a connection is made only when the phone receives a notification. This improves the use of the network, lowers signaling and makes the phone's battery last longer, because the radio spends its time in less excited states.