PORTING CHALLENGES & TEST ENVIRONMENT
Not all malloc implementations discussed in this paper are available on Android. Some of them are meant to work only on Windows and/or x86 platforms. We therefore ported tlsf, musl, tcmalloc, and nedmalloc to the Android platform. As part of the porting exercise, we wrote atomic routines, xchg, and bit scan forward routines for ARM, fixed C99 warnings, corrected name mangling, and implemented the malloc_disable, malloc_usable_size, and mallinfo calls, which Android requires.
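The ARM helper routines themselves are not shown in this paper. As a minimal sketch, assuming a GCC or Clang toolchain, a bit scan forward and an atomic exchange can be expressed with compiler builtins instead of hand-written assembly. The names `bsf32` and `atomic_xchg32` are hypothetical and are not taken from any of the ported allocators.

```c
#include <stdint.h>

/* Hypothetical helper: index of the lowest set bit, similar to x86 BSF.
 * __builtin_ctz is undefined for 0, so 0 is mapped to -1 explicitly. */
static inline int bsf32(uint32_t x)
{
    return x ? __builtin_ctz(x) : -1;
}

/* Hypothetical helper: atomically store new_val and return the previous
 * value, similar to x86 XCHG. The compiler emits the ARM instructions. */
static inline uint32_t atomic_xchg32(volatile uint32_t *p, uint32_t new_val)
{
    return __atomic_exchange_n(p, new_val, __ATOMIC_SEQ_CST);
}
```

For example, `bsf32(12)` yields 2 because the lowest set bit of binary 1100 is bit 2. Using builtins keeps the code portable across ARM variants, at the cost of depending on a GCC-compatible compiler.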
We used the HiKey community board, which is supported by the Android Open Source Project, as the test target.
A. Test Benchmarks
● Size measurement for static libraries/binaries (libc.so, init, adbd) that include malloc. This measures and compares the space cost of the different malloc implementations.
● gperftools malloc test. The test suite allocates memory in different chunk sizes, stores the allocated pointers in different ways, and then accesses and frees the memory in linear and random patterns. Lower numbers indicate better performance for this benchmark.
● T-test1. The benchmark creates single or multiple threads and allocates memory in random sizes. Lower numbers indicate better performance for this benchmark.
● tlsf-test (malloc, free, and realloc). The benchmark keeps allocating memory for a defined duration and then measures the average allocation and free time for the operations. Lower numbers indicate better performance for this benchmark.
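The benchmarks above share a common pattern: allocate blocks of varying size, then free them in some order, and report a time where lower is better. The following is a minimal single-threaded sketch of that pattern, not the code of any of the listed suites. The function name `bench_random_alloc` and its parameters are hypothetical, and `clock()` (CPU time) is used only for portability; a real benchmark would likely prefer a monotonic wall clock.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical sketch: allocates n blocks of random size in [1, max_size],
 * frees them in shuffled (random) order, and returns the average time in
 * nanoseconds per block spent in the malloc and free loops.
 * Returns -1.0 on invalid arguments or allocation failure. */
static double bench_random_alloc(size_t n, size_t max_size, unsigned seed)
{
    if (n == 0 || max_size == 0)
        return -1.0;
    void **ptrs = malloc(n * sizeof *ptrs);
    if (!ptrs)
        return -1.0;

    srand(seed);

    clock_t t0 = clock();
    size_t got = 0;
    for (; got < n; got++) {
        size_t sz = (size_t)rand() % max_size + 1;
        ptrs[got] = malloc(sz);
        if (!ptrs[got])
            break;
        ((volatile char *)ptrs[got])[0] = 1; /* touch the block */
    }
    clock_t t1 = clock();

    /* Fisher-Yates shuffle (not timed) so frees follow a random pattern. */
    for (size_t i = got; i > 1; i--) {
        size_t j = (size_t)rand() % i;
        void *tmp = ptrs[i - 1];
        ptrs[i - 1] = ptrs[j];
        ptrs[j] = tmp;
    }

    clock_t t2 = clock();
    for (size_t i = 0; i < got; i++)
        free(ptrs[i]);
    clock_t t3 = clock();

    free(ptrs);
    if (got != n)
        return -1.0;

    double ticks = (double)(t1 - t0) + (double)(t3 - t2);
    return ticks * 1e9 / (double)CLOCKS_PER_SEC / (double)n;
}
```

Calling `bench_random_alloc(100000, 4096, 42u)` against binaries linked with different allocators would give one comparable number per allocator. Skipping the shuffle would approximate the linear free pattern instead.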
B. Fragmentation Impact
To measure the fragmentation effect of the memory allocators, the experiment also collected and analyzed performance data while running real use cases. The experiment was carried out for the three most widely used memory allocators. Data was collected for the following use case and idle conditions:
● Launch the browser and load three different websites at regular intervals
● Download 5 sample images from a remote location
● Load the images into the gallery at regular intervals
● Start video playback for 15 seconds
● Capture the memory data
○ Collect buddy info, memory consumed, and memory free
● Stop the use case (idle case)
○ Stop and kill all processes launched by the use case
○ Capture the memory data
● Run the iteration 50 times and take the average and standard deviation
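The final step, averaging over the 50 iterations, can be sketched as follows. The paper does not say whether the sample or the population standard deviation was used, so the sample form (dividing by n - 1) is an assumption here. The names `run_stats`, `summarize_runs`, and `sqrt_newton` are hypothetical. A small Newton iteration stands in for `sqrt()` from `<math.h>` only so that the block builds without linking libm.

```c
#include <stddef.h>

struct run_stats {
    double mean;
    double stddev; /* sample standard deviation (n - 1), an assumption */
};

/* Square root by Newton iteration; sqrt() from <math.h> is the usual choice. */
static double sqrt_newton(double x)
{
    if (x <= 0.0)
        return 0.0;
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 100; i++) {
        double next = 0.5 * (r + x / r);
        if (next == r)
            break;
        r = next;
    }
    return r;
}

/* Welford's one-pass algorithm: numerically stable mean and variance
 * over the per-iteration measurements (for example, free memory in KB). */
static struct run_stats summarize_runs(const double *v, size_t n)
{
    struct run_stats s = {0.0, 0.0};
    double m2 = 0.0; /* running sum of squared deviations from the mean */
    for (size_t i = 0; i < n; i++) {
        double delta = v[i] - s.mean;
        s.mean += delta / (double)(i + 1);
        m2 += delta * (v[i] - s.mean);
    }
    if (n > 1)
        s.stddev = sqrt_newton(m2 / (double)(n - 1));
    return s;
}
```

Feeding the 50 free-memory readings of one allocator into `summarize_runs` gives the mean and the run-to-run deviation for that allocator.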
C. The Analysis – Benchmarks
● tlsf wins on space improvement for libc (Figure 1).
● tcmalloc is much faster than nedmalloc and the native Android malloc (jemalloc) when the number of allocations is large and the sizes are random, mostly across pages (Figure 2). When allocations are few and linear, dlmalloc and nedmalloc have faster response times for malloc operations (Figure 2).
● The initial memory allocation is larger with tcmalloc, but the application's runtime requirement is then served from the initial pool. For example, with tcmalloc, launching a browser with the default page (www.google.com) takes ~10 KB less memory than the default malloc implementation in bionic libc (a major reduction in runtime system memory plus native heap requirement).
● dlmalloc has considerable static size benefits over tcmalloc and jemalloc. Also, when allocations are few and linear, dlmalloc has better performance (Figure 2).
D. The Analysis – Fragmentation
● The jemalloc case shows more free memory, but with a larger deviation between runs.
● Further analysis of free page sizes shows that jemalloc has a high number of 32 KB, 64 KB, and larger pages available in both the idle and running states, whereas tcmalloc has more lower-order pages. This could be one of the reasons for tcmalloc's performance on low page allocations.
● jemalloc and nedmalloc have more RAM available in the idle and running states, but with high deviation. In the running state, tcmalloc has a good number of lower-order pages available for quicker allocation.
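The comparison of lower-order and higher-order pages above rests on the per-order free block counts that Linux reports in /proc/buddyinfo, where a block of order k spans 2^k pages. The following is a minimal sketch of how such a line could be split into low-order and high-order free memory. The function names are hypothetical, and the 4 KB page size in the usage note is an assumption; with it, order 3 corresponds to 32 KB and order 4 to 64 KB.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ORDERS 16

/* Parses one /proc/buddyinfo line such as
 *   "Node 0, zone   Normal   10   5   2 ..."
 * into per-order free block counts. Returns the number of orders read,
 * or -1 if the line does not look like a buddyinfo line. */
static int parse_buddyinfo_line(const char *line,
                                unsigned long counts[MAX_ORDERS])
{
    const char *p = strstr(line, "zone");
    if (!p)
        return -1;
    p += 4;
    while (*p == ' ' || *p == '\t')            /* blanks before zone name */
        p++;
    while (*p && *p != ' ' && *p != '\t')      /* the zone name itself */
        p++;

    int n = 0;
    while (n < MAX_ORDERS) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);
        if (end == p)                          /* no more numbers */
            break;
        counts[n++] = v;
        p = end;
    }
    return n > 0 ? n : -1;
}

/* Free memory in KB held in blocks of order lo..hi (inclusive).
 * page_kb is the page size in KB and must be supplied by the caller. */
static unsigned long free_kb_in_orders(const unsigned long *counts, int n,
                                       int lo, int hi, unsigned long page_kb)
{
    unsigned long kb = 0;
    if (lo < 0)
        lo = 0;
    for (int k = lo; k <= hi && k < n; k++)
        kb += counts[k] * (page_kb << k);
    return kb;
}
```

With a 4 KB page, `free_kb_in_orders(counts, n, 0, 2, 4)` sums the memory in 4 KB to 16 KB blocks, and `free_kb_in_orders(counts, n, 3, n - 1, 4)` sums the memory in blocks of 32 KB and larger, which is the split discussed in this section.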