Monday, November 22, 2010

Performance Comparison: Casting (Boxing/Unboxing) vs Parsing

What can I say, I like to test things! This test was to answer some questions we had in trying to determine what would be the most effective way to store a value of unknown type.

Here's the back-story. John and I are building a graph model for analysis purposes, and one object in that model represents a property of an entity (node/vertex), but we can't agree on how we should store the value:
public interface Property {
   //The type of the property, similar to 
   //ontology (eg: entity.person.age)
   String getType();
   //The human readable label of a property (eg: Age)
   String getLabel();
   //The Entity (node/vertex) this property applies to
   Entity getSource();
   /* The value of the property. These are the two candidates; */
   /* only one can be declared, since they differ only by return type. */
   //Option 1 - Preserves object, but you can't readily transport it
   //and it limits the compatibility of the model
   Object getValue(); 
   //Option 2 - Flexible and transportable, but there is a
   //cost to parsing
   String getValue(); 
}

There are many arguments for and against both strategies. Most engineers would naturally choose a string because it can be transported between languages (over SOAP, say) with little effort. However, we are looking at potentially millions of these property objects in a graph, and the performance cost of parsing could outweigh the need to be flexible.

So, the question is, how much of a penalty do we incur parsing a string versus casting an object?
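
To make the two read paths concrete, here is a minimal sketch of what a caller would do with each kind of stored value. The class and method names (ValueAccessSketch, ageFromString, ageFromObject) are made up for illustration and are not part of our model:

```java
public class ValueAccessSketch {

    // Hypothetical String-backed read: every access pays for a parse.
    static int ageFromString(String stored) {
        return Integer.parseInt(stored);
    }

    // Hypothetical Object-backed read: a checked cast to Integer,
    // followed by auto-unboxing to int.
    static int ageFromObject(Object stored) {
        return (Integer) stored;
    }

    public static void main(String[] args) {
        String asString = "42";
        Object asObject = 42; // auto-boxed to an Integer

        System.out.println(ageFromString(asString));
        System.out.println(ageFromObject(asObject));
    }
}
```

Both calls return the same int; the difference is only in how much work happens on each read, which is what the test below tries to measure.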

Before I display the test, I'm going to be transparent and show you my trusted StopWatch class that I use to calculate the performance times within the test:
/**
 * Here is a stupid-simple stop 
 * watch class I use all the time
 * to calculate the length of 
 * execution in my applications.
 * 
 * I probably should make this a 
 * little more elegant since I use
 * it so much.
 *
 * @author Richard Clayton 
 * (Berico Technologies)
 */
public class StopWatch {
 
   //Start time; cleared when the 
   //stop watch is reset
   protected long startTime = 0;
   //Elapsed time
   protected long elapsedTime = 0;
 
   /**
    * Instantiate the stop watch 
    * (does nothing)
    */
   public StopWatch(){}
 
   /**
    * Start the Stop Watch
    */
   public void start(){
      this.startTime = System.nanoTime();
   }
 
   /**
    * Stop the Stop Watch 
    */
   public void stop(){
      if(startTime != 0){
         this.elapsedTime 
            = System.nanoTime() - this.startTime;
      }
   }
 
   /**
    * Reset the Stop Watch
    */
   public void reset(){
      this.startTime = 0;
   }
 
   /**
    * Get the elapsed time (did you 
    * start and stop the stop watch?)
    * @return Elapsed time in nanoseconds
    */
   public long getElapsedNanoSeconds(){
      return this.elapsedTime;
   }
   }
}

OK, now that I've shown you how I'm going to track execution time, here is the test setup:
//Needs "import java.util.Date;" at the top of the enclosing class file
public static void main(String[] args) {
   //Number of iterations we are going to perform
   int ITERATIONS = 1000000;
   //Create a StopWatch (my simple implementation)
   StopWatch stopWatch = new StopWatch();
   //We are going to use a number of types 
   //(int, double, bool, and date)
   String intString = Integer.toString(Integer.MAX_VALUE);
   String doubleString = Double.toString(Double.MIN_VALUE);
   String boolString = Boolean.toString(true);
   String dateString = 
      (new Date(System.currentTimeMillis())).toString();
   //Again, here are the same types with the same values 
   //(well, except that last Date,
   //which is created a moment later)
   Object intObj = Integer.MAX_VALUE;
   Object doubleObj = Double.MIN_VALUE;
   Object boolObj = true;
   Object dateObj = 
      new Date(System.currentTimeMillis());
   //We'll store our results here
   int intResult;
   double doubleResult;
   boolean boolResult;
   Date dateResult;
   //Start the test!
   stopWatch.start();
   //Perform the String parsing test
   for(int i = 0; i < ITERATIONS; i++){
      intResult = Integer.parseInt(intString);
      doubleResult = Double.parseDouble(doubleString);
      boolResult = Boolean.parseBoolean(boolString);
      dateResult = new Date(Date.parse(dateString));
   }
   //Stop the test
   stopWatch.stop();
   //Print out the results
   System.out.println(
      String.format("The time to perform the "
                     + "parsing of %s strings was "
                     + "%s nanoseconds "
                     + "[average per iteration was %s ns].", 
                     ITERATIONS * 4, 
                     stopWatch.getElapsedNanoSeconds(),
                     stopWatch.getElapsedNanoSeconds() 
                      / ITERATIONS));
   //Reset the stopwatch
   stopWatch.reset();
   //Start again
   stopWatch.start();
   //Perform the casting test
   for(int i = 0; i < ITERATIONS; i++){
      intResult = (Integer)intObj;
      doubleResult = (Double)doubleObj;
      boolResult = (Boolean)boolObj;
      dateResult = (Date)dateObj;
   }
   //Stop the test
   stopWatch.stop();
   //Print out the results
   System.out.println(
      String.format("The time to perform casting of "
                     + "%s objects was %s nanoseconds "
                     + "[average per iteration was %s ns].", 
                     ITERATIONS * 4, 
                     stopWatch.getElapsedNanoSeconds(), 
                     stopWatch.getElapsedNanoSeconds() 
                        / ITERATIONS));
}
Nothing special here, just a lot of really simple operations. So let's look at the results...
1000 Iterations

The time to perform the parsing of 4000 strings was 88494000 nanoseconds [average per iteration was 88494 ns].
The time to perform casting of 4000 objects was 108000 nanoseconds [average per iteration was 108 ns].

10000 Iterations

The time to perform the parsing of 40000 strings was 366990000 nanoseconds [average per iteration was 36699 ns].
The time to perform casting of 40000 objects was 1131000 nanoseconds [average per iteration was 113 ns].

100000 Iterations

The time to perform the parsing of 400000 strings was 542090000 nanoseconds [average per iteration was 5420 ns].
The time to perform casting of 400000 objects was 5839000 nanoseconds [average per iteration was 58 ns].

1000000 Iterations

The time to perform the parsing of 4000000 strings was 2647509000 nanoseconds [average per iteration was 2647 ns].
The time to perform casting of 4000000 objects was 5475000 nanoseconds [average per iteration was 5 ns].
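
For a sense of scale, the totals above can be turned into parse-to-cast ratios. This is just arithmetic on the numbers already printed; the class name SpeedupSketch is made up for illustration:

```java
public class SpeedupSketch {

    // Ratio of total parsing time to total casting time for one run.
    static double ratio(long parseNanos, long castNanos) {
        return (double) parseNanos / castNanos;
    }

    public static void main(String[] args) {
        // {iterations, total parsing ns, total casting ns}, copied from the results above
        long[][] runs = {
            {1000L, 88494000L, 108000L},
            {10000L, 366990000L, 1131000L},
            {100000L, 542090000L, 5839000L},
            {1000000L, 2647509000L, 5475000L}
        };
        for (long[] run : runs) {
            System.out.printf(
                "%d iterations: parsing took about %.0f times as long as casting%n",
                run[0], ratio(run[1], run[2]));
        }
    }
}
```

Across the four runs the ratio lands somewhere between roughly 90 and 820, so the gap is large in relative terms even though the absolute cost per parse is small.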

Casting is definitely faster than parsing, but I was happy to see that parsing was not too much of a performance bottleneck: even at 1,000,000 iterations, an iteration of four parses averaged about 2.6 microseconds. We decided to use the Object strategy instead of String, but the results of the test were encouraging for the use of strings if you are facing a similar problem. Ultimately, I think the choice is best driven by your transportation/compatibility requirements. If you have to XML-serialize and send that object to a SOAP endpoint, a String is probably the better solution.
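
One caveat on the numbers: the per-iteration averages drop as the iteration count grows (88494 ns at 1000 iterations versus 2647 ns at 1000000 for parsing), which is consistent with one-time costs such as class loading and JIT compilation being counted in the timing. The results in the test are also never read, so the JVM is free to optimize some of the work away. Below is a rough sketch of a variant that runs an untimed warm-up pass first and keeps a running sum so the results are actually used. It only covers the int case, the names (WarmupBenchmarkSketch, parseLoop, castLoop) are made up, and it is not the test that produced the numbers above. For anything serious, a dedicated harness such as JMH is probably the safer route.

```java
public class WarmupBenchmarkSketch {

    // Parse the same string repeatedly; the sum is returned so the
    // parsed values are actually consumed by the caller.
    static long parseLoop(String value, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += Integer.parseInt(value);
        }
        return sum;
    }

    // Cast and unbox the same object repeatedly, again returning a sum.
    static long castLoop(Object value, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += (Integer) value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int iterations = 1000000;
        String intString = Integer.toString(Integer.MAX_VALUE);
        Object intObj = Integer.MAX_VALUE;

        // Untimed warm-up pass so class loading and JIT compilation
        // happen before the measured runs.
        long sink = parseLoop(intString, iterations)
                  + castLoop(intObj, iterations);

        long start = System.nanoTime();
        sink += parseLoop(intString, iterations);
        long parseNanos = System.nanoTime() - start;

        start = System.nanoTime();
        sink += castLoop(intObj, iterations);
        long castNanos = System.nanoTime() - start;

        System.out.println("Parsing: " + parseNanos + " ns total");
        System.out.println("Casting: " + castNanos + " ns total");
        // Printing the sum keeps the loop results observable.
        System.out.println("(checksum: " + sink + ")");
    }
}
```

The timings will vary by machine and JVM, so treat whatever it prints as a rough comparison rather than a definitive measurement.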

Rich
