A few weeks ago I released version 85 of AnySoftKeyboard, and soon after, crash reports started to hit my inbox. That usually happens, mostly because I forgot to test a feature, or just because a specific device decided it was time to crash (it happens, really! The floor is crooked). But this time all the crashes were of the same nature, and there were many of them: about 5-10 a day.
The crashes were due to OutOfMemoryError (a.k.a. OOM), which on Android usually means a memory leak. Why? Unlike a desktop, an Android device has very little heap to play with, and a leak fills that little space quickly. So I started to look for leaks.

TL;DR: If I had been able to reproduce this crash on my devices, I would have found the cause immediately and just fixed the 6th cause in this list. But I wasn't able to reproduce it (more on why later), and it took me some time to understand that the reporting users wanted me to fix it and were willing to help!

Cause 1: Context Leak

The most common leak in Android is the Context leak (here is a great explanation). Although all the examples talk about leaking an Activity, a Service (which AnySoftKeyboard is) has the same problem as an Activity in that regard.

Solution

It's a no-brainer, but tedious: wherever I used a Context object, I made sure that the instance was the Application's Context - getApplicationContext().
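As a rough illustration of that rule, here is a minimal sketch in plain Java. The `Context` interface below is a stand-in that mirrors only Android's `getApplicationContext()`, and `WordsLoader` is an invented name; neither is taken from the AnySoftKeyboard code.

```java
// Plain-Java sketch; Context is a stand-in for android.content.Context.
class AppContextSketch {
    /** Stand-in mirroring only the one method this example needs. */
    interface Context {
        Context getApplicationContext();
    }

    /** Hypothetical long-lived helper that needs a Context. */
    static class WordsLoader {
        private final Context context;

        WordsLoader(Context anyContext) {
            // Keep the application-wide Context, never the short-lived
            // Activity or Service that happened to be passed in.
            this.context = anyContext.getApplicationContext();
        }

        Context heldContext() {
            return context;
        }
    }
}
```

Whatever Context the caller passes in, the helper only ever holds the application's one, so it cannot keep a destroyed Service or Activity alive.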

Cause 2: The Handler Leak OR the Inner Class

An inner class (that is, a non-static nested class) in Java implicitly keeps a reference to its outer class instance; this is why you can call the outer class's methods from inside the inner class. My Handlers were inner classes, so as long as a Handler is alive, so is the outer class, and in my case that is the whole Input Method Service, which in turn holds the entire tree!
One thing to note: once there are no messages for the Handler in the queue, the main Looper no longer references it, and if there are no other references to it, well, it will be collected together with its implicit outer class. So what is the problem, you may ask, and you are right to ask. The problem is that some of the messages are delayed (or just still in the queue), and every queued message keeps a reference to its target Handler, so as long as there is a message in the queue the Handler will not be collected. So even if the OS decides to destroy AnySoftKeyboard (say, on an orientation change) and create a new one, the old one is still in memory, and we have two full trees at the same time. Maybe not for long, but long enough to cause an OOM on some devices.

Solution

Very simple: I went over all the inner classes I had (including the Handlers) and made them static nested classes, and where they required a reference to the outer class, I used a WeakReference.
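A minimal sketch of the static-class-plus-WeakReference pattern could look like the following. It is not the AnySoftKeyboard code: `KeyboardService` and `UiHandler` are hypothetical names, and the handler is a plain class rather than a subclass of `android.os.Handler`, so that the example runs on any JVM.

```java
import java.lang.ref.WeakReference;

class WeakHandlerSketch {
    /** Hypothetical stand-in for the outer service (AnySoftKeyboard in the post). */
    static class KeyboardService {
        int redrawCount = 0;

        void redraw() {
            redrawCount++;
        }
    }

    /**
     * Static nested class: unlike an inner class, it has no implicit
     * reference to an enclosing instance. On Android this would extend
     * android.os.Handler; here it is a plain class.
     */
    static class UiHandler {
        private final WeakReference<KeyboardService> serviceRef;

        UiHandler(KeyboardService service) {
            this.serviceRef = new WeakReference<>(service);
        }

        /** Returns true if the message was delivered to a live service. */
        boolean handleMessage(int what) {
            KeyboardService service = serviceRef.get();
            if (service == null) {
                // The service is gone; drop the message.
                return false;
            }
            service.redraw();
            return true;
        }
    }
}
```

The handler can still reach the service while it is alive, but a pending message no longer pins the service in memory.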

With this pattern, a WeakReference field keeps a pointer to the main AnySoftKeyboard instance. I was able to call methods of AnySoftKeyboard, but also made sure I was not keeping AnySoftKeyboard from being garbage-collected if needed.
Do remember that with this pattern you must check that the weak reference still points to something (i.e., that get() did not return null) before using it.

Cause 3: The Drawable Callback

Most of the Drawables I use are not attached to Views, but some are, and since I keep references to the drawable objects, this was still a possible leak. Why? A Drawable requires its client (usually a View) to implement the Drawable.Callback interface, which the Drawable uses to perform animation-related tasks, and the Drawable keeps a reference to that callback. So, if the View is removed from the window/activity but the Drawable is still referenced, then the View is still referenced too and will not be garbage-collected.

Solution

I explicitly unbind the drawables when the keyboard's View is no longer needed. This was actually not required in my case, since the only reference to the Drawable was held by the View, which is also the Callback for the Drawable.
Commit (look for changes in AnyKeyboardBaseView.java).
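To make the unbinding concrete, here is a small sketch that assumes "unbind" means clearing the callback, as Android's `Drawable.setCallback(null)` does. `Drawable`, `Callback` and `KeyboardView` below are plain-Java stand-ins with invented method names, not the classes from the commit.

```java
import java.util.ArrayList;
import java.util.List;

class DrawableUnbindSketch {
    /** Stand-in for android.graphics.drawable.Drawable.Callback. */
    interface Callback {
    }

    /** Stand-in for Drawable: it keeps a reference to its callback. */
    static class Drawable {
        private Callback callback;

        void setCallback(Callback cb) {
            this.callback = cb;
        }

        Callback getCallback() {
            return callback;
        }
    }

    /** Hypothetical keyboard view that registers itself as the callback. */
    static class KeyboardView implements Callback {
        private final List<Drawable> boundDrawables = new ArrayList<>();

        void bind(Drawable drawable) {
            drawable.setCallback(this);
            boundDrawables.add(drawable);
        }

        /** Call when the view is no longer needed. */
        void unbindDrawables() {
            for (Drawable drawable : boundDrawables) {
                // Break the Drawable -> View link so the view can be collected
                // even if something else still holds the Drawable.
                drawable.setCallback(null);
            }
            boundDrawables.clear();
        }
    }
}
```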

Cause 4: The External Context

This one is specific to AnySoftKeyboard. Most apps do not hold a reference to an external package's Context, but since AnySoftKeyboard supports external packages (i.e., language packs, themes, etc.), I was always keeping a reference to each external Context. Let's say you use three languages (not very rare for the users of AnySoftKeyboard) and want to use the Ice Cream Sandwich theme: you'll end up with four external Context objects sitting around in the AnySoftKeyboard heap. This is quite a waste, since most of the time I use the external Context only once or twice!

Solution

I changed the external Context from a strong reference to a WeakReference, and added the package's name to the class's fields so that I can re-create the Context if needed.
Commit.
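The idea can be sketched roughly as below. This is an assumption-laden illustration rather than the committed code: `AddOn`, `PackageContext` and the `contextFactory` parameter are hypothetical, and on Android the factory would presumably wrap `Context.createPackageContext(String, int)`.

```java
import java.lang.ref.WeakReference;
import java.util.function.Function;

class ExternalContextSketch {
    /** Stand-in for the Context of an external package. */
    static class PackageContext {
        final String packageName;

        PackageContext(String packageName) {
            this.packageName = packageName;
        }
    }

    /** Hypothetical add-on (language pack, theme) living in another package. */
    static class AddOn {
        private final String packageName;
        private final Function<String, PackageContext> contextFactory;
        private WeakReference<PackageContext> contextRef = new WeakReference<>(null);

        AddOn(String packageName, Function<String, PackageContext> contextFactory) {
            this.packageName = packageName;
            this.contextFactory = contextFactory;
        }

        /** Returns the cached Context, re-creating it if it was collected. */
        synchronized PackageContext getPackageContext() {
            PackageContext context = contextRef.get();
            if (context == null) {
                // Only the package name is held strongly; the Context is rebuilt on demand.
                context = contextFactory.apply(packageName);
                contextRef = new WeakReference<>(context);
            }
            return context;
        }
    }
}
```

The trade-off is an occasional re-creation of the Context in exchange for not keeping several of them on the heap all the time.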

Cause 5: The Too Large Objects

Key background, Shift, Control, Enter, Space, Tab, Cancel, Globe, Microphone, Settings, Arrows and more. All these are Drawables which were loaded into the keyboard View and kept in memory. There are about 15 of them, and some have various states (like the Enter key, which has Normal, Search, Done, Go, etc. states, each a full drawable). That's not a leak, but it is still a lot of memory for some devices: the HTC Desire, for example, will not allow more than 32MB of heap.

Solution

I created a DrawableBuilder class which has all the information required to get the drawable, and the drawable itself is loaded only when it is actually requested. This way, if the layout does not use a specific icon, it will not be loaded into memory.
Commit.
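A lazy builder of this kind might look like the sketch below. Only the class name DrawableBuilder comes from the post; the fields, the `loader` parameter, the `buildDrawable()` method and the decision to cache the result after the first load are assumptions made for illustration.

```java
import java.util.function.IntFunction;

class LazyDrawableSketch {
    /** Stand-in for a decoded Drawable, which is the expensive part. */
    static class Drawable {
        final int resourceId;

        Drawable(int resourceId) {
            this.resourceId = resourceId;
        }
    }

    /** Holds only what is needed to load the drawable, not the drawable itself. */
    static class DrawableBuilder {
        private final int resourceId;
        private final IntFunction<Drawable> loader;
        private Drawable drawable; // stays null until first requested

        DrawableBuilder(int resourceId, IntFunction<Drawable> loader) {
            this.resourceId = resourceId;
            this.loader = loader;
        }

        /** Loads the drawable on first use and reuses it afterwards. */
        Drawable buildDrawable() {
            if (drawable == null) {
                drawable = loader.apply(resourceId);
            }
            return drawable;
        }
    }
}
```

A builder that is never asked for its drawable costs a couple of fields instead of a decoded bitmap.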

Cause 6: The Database Connection

After I fixed all the issues above, the app was still crashing on some devices. I had no idea where to look anymore. I was sure there was a leak, and that it was very elusive, maybe even vendor-specific (although the crashes came from many vendors and various OS versions). So I added code to the UncaughtExceptionHandler that checks the type of the thrown exception, and if it is an OutOfMemoryError, asks the framework to create a memory dump.
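Such a handler could be sketched as follows. `Thread.UncaughtExceptionHandler` is the standard JDK interface; `HeapDumper`, `OomDumpingHandler` and the dump path are hypothetical names, and on Android the dumper would presumably call `android.os.Debug.dumpHprofData(path)`.

```java
import java.io.IOException;

class OomDumpSketch {
    /** Hypothetical hook that writes a heap dump to the given path. */
    interface HeapDumper {
        void dump(String path) throws IOException;
    }

    static class OomDumpingHandler implements Thread.UncaughtExceptionHandler {
        private final Thread.UncaughtExceptionHandler previous;
        private final HeapDumper dumper;
        private final String dumpPath;

        OomDumpingHandler(Thread.UncaughtExceptionHandler previous,
                          HeapDumper dumper, String dumpPath) {
            this.previous = previous;
            this.dumper = dumper;
            this.dumpPath = dumpPath;
        }

        @Override
        public void uncaughtException(Thread thread, Throwable error) {
            if (error instanceof OutOfMemoryError) {
                try {
                    // On Android the dumper could call
                    // android.os.Debug.dumpHprofData(path).
                    dumper.dump(dumpPath);
                } catch (IOException | RuntimeException e) {
                    // Never let a failed dump mask the original crash.
                }
            }
            if (previous != null) {
                // Let the normal crash handling continue.
                previous.uncaughtException(thread, error);
            }
        }
    }
}
```

It would be installed with `Thread.setDefaultUncaughtExceptionHandler(...)`, passing the result of `Thread.getDefaultUncaughtExceptionHandler()` as `previous` so that the usual crash report is still produced.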
I found the leak easily using Eclipse Memory Analyzer (a.k.a. MAT): the leak was a Database Connection Transport, and a huge one. Each such transport used 0.5MB (for that user, since he had a large user dictionary), and one leaked every time the user switched language! It was leaking due to a race condition.

The Race Condition

When a dictionary is created, it loads its word list using an AsyncTask (so it won't hold up the UI thread). But when the list is long and the device is slow, the dictionary's close method (which is called when the language changes) may be called before the loading is done. At that point there is nothing to close yet, so the connection used by the loading task is never closed!

So, if it always happens, why couldn't I reproduce it on my devices? Some flavors (Samsung devices and CyanogenMod) could handle that release automatically (there is a finalizer in Java, you know), and they did. Some vendors did not. I had a Samsung, a Motorola and a device running CyanogenMod. Moreover, my devices are just fast enough, and used a small word list. Bummer, huh? Yeah..

Solution

I just made sure the database connection gets closed, by ensuring the related functions use monitors (i.e., they are synchronized).
Commit and commit.
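One way to picture the monitor-based fix is the sketch below. The post only says that the related functions use monitors; the shared `lock` object, the `closed` flag and the names `UserDictionary`, `loadWords()` and `Connection` are assumptions made for the example.

```java
class DictionaryCloseSketch {
    /** Stand-in for the database connection the dictionary reads from. */
    static class Connection {
        private boolean open = true;

        void close() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    /** Hypothetical dictionary whose load and close share one monitor. */
    static class UserDictionary {
        private final Object lock = new Object();
        private Connection connection; // guarded by lock
        private boolean closed;        // guarded by lock

        /** Runs on a background thread; returns the connection used, or null if skipped. */
        Connection loadWords() {
            synchronized (lock) {
                if (closed) {
                    // close() won the race, so do not open anything.
                    return null;
                }
                connection = new Connection();
                // ... read the word list here ...
                return connection;
            }
        }

        /** Runs when the language changes. */
        void close() {
            synchronized (lock) {
                closed = true;
                if (connection != null) {
                    connection.close();
                    connection = null;
                }
            }
        }
    }
}
```

Because both methods take the same lock, close() either runs before loading starts (and loading then does nothing) or waits until loading has finished and closes what it opened. The cost is that close() may block for the duration of a long load.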

The End

v95
