Chitika

Wednesday, September 15, 2004

optimizing collection usage

We've been playing with Collection classes for years now. We used to play with Vector & Hashtable, before we were given ArrayList, HashMap, and the new Linked* classes. But the question still remains, have we use them properly and efficiently?

I believe that not many developers have realized this. Improper usage of Collection may cause unnecessary memory allocation & performance degradation. Especially when involving loop or recursive calls.

For example, if we're required to create an instance of Map interface which contains a single entry (key-value pair), and we know that this Map instance will never be altered, then the best way to go is by using the singletonMap() method in the Collections utility class.

Map param = Collections.singletonMap ("USER_ID", stUserID)

instead of creating the default HashMap implementation, which will automatically allocate 16 entries into the memory usage.

Map param = new HashMap ();
param.put ("USER_ID", stUserID);

By using the default HashMap implementation, we're allocating more memory than we need, and we perform tasks which we may never need. The similar method exists for List as well.

Still with the same "optimize while initializing" spirit, both HashMap and ArrayList should be instantiated efficiently, to prevent the overhead of auto-resizing implementation of both classes. For example, an ArrayList will automatically expand itself, including copying the old elements of the internal array to the newly created internal array. This cause overhead in the number of instructions that'll need to be executed as well as the extra memory to be allocated. Similar conditions apply to HashMap as well.

If we have known for sure what's the number of elements to be included within the Collection, it'd save a good number of instructions to be executed as well as unnecessary amount of memory to be allocated.

For example, to convert elements of a List from a Class type to another Class type (e.g., from Transfer Object to ActionForm), we'd known what the size is beforehand.

public static List convert (List transferObjects) {
  List result = new ArrayList (transferObjects.size() + 1);
  for (Iterator it = transferObjects.size();
       it.hasNext(); ) {
    TransferObject obj = (TransferObject) it.next ();
    ActionForm bean = ActionFormFactory.getInstance (
        "xForm");

    // assuming that there's another private method
    // dealing with the conversion for each field
    convert (obj, bean);
    result.add (bean);
  }
}

Same with a HashMap, whose default load factor is 0.75. Here's my trick:

// the number of entries known to be inserted
final int KNOWN_CAPACITY = 3;
final int INITIAL_CAPACITY =
    1 + (KNOWN_CAPACITY + 1) * 4 / 3;

Map params = new HashMap (INITIAL_CAPACITY);
params.put ("USER_ID", stUserId);
params.put ("LOGIN_ROLE", stLoginRole);
params.put ("KEYWORD", stKeyword);

Hopefully, using the above approaches, I can minimize (or prevent) the unnecessary auto-resizing & extensive memory allocation.
I think it's very easy to be done, and it has a good impact on the performance & quality of the code.
What do you think?


3 comments:

  1. I always keep searching for such informative blogs. Can you please suggest any link or book where i can read about the usage of different suitable collection classes in different scenarios.

    ReplyDelete
  2. Some of the tricks I did here are based on looking at the implementation source code of the JDK. The rest are coming from the Effective Java book of Joshua Bloch, and the SCJP Study Guide (I don't remember the exact book title) of Khalid Mughal.

    Were you looking for other tricks?

    ReplyDelete
  3. Just read few of her blogs. Awesome ones :-)

    I am actually reading both books you suggested these days :)

    ReplyDelete