Monday, June 10, 2013

Guava - simple recipes to make your Java code cleaner, 1st part

This article isn't for people who already know Guava. It is a set of simple examples meant to encourage you to use the Guava library in your code.

#1. You can use Optional instead of simply returning null in some specific cases:

Instead of:
    /**
     * Can return null in specific cases... but it's
     * hard to remember
     */
    public static String someMethod() {
        String returnValue = null;
        if (new Date().getTime() % 2 == 0) {
            returnValue = "time % 2 == 0";
        }
        return returnValue;
    }

    public static void main(String[] args) {
        String str = someMethod();
        str.contains("%"); // throws NullPointerException when someMethod() returns null
    }
use this:
    /**
     * Explicitly shows that the method can
     * return an empty (absent) value
     */
    public static Optional< String > someMethod() {
        Optional< String > returnValue = Optional.absent();
        if (new Date().getTime() % 2 == 0) {
            returnValue = Optional.of("time % 2 == 0");
        }
        return returnValue;
    }

    public static void main(String[] args) {
        Optional< String > str  = someMethod();
        if(str.isPresent()) {
            // here you know that value is not null
        }
        // or you can operate on given value or a default one
        str.or("default value").contains("%");
    }

#2 You can use firstNonNull from Guava's Objects class instead of writing "if else"

Instead of:
    public T foo() {
        ...
        ...
        if(first != null) {
            return first;
        } else {
            return second;
        }
    }
use this:
import static com.google.common.base.Objects.firstNonNull;
...
    public T foo() {
        ...
        ...
        return firstNonNull(first, second);
    }

#3 You can use Guava Strings class methods to deal with null or empty Strings

Instead of:
if(str == null) {
    str = "";
}

if("".equals(str)) {
    str = null;
}

if(str == null || str.length() == 0) {
   // is null or empty
}
use this:
import static com.google.common.base.Strings.*;

str = nullToEmpty(str);
str = emptyToNull(str);
if(isNullOrEmpty(str)) {
   // is null or empty
}

#4 You can use Objects.equal(a, b) to check equality safely

Instead of:
a.equals(b); // throws NullPointerException if a is null
use this:
import static com.google.common.base.Objects.equal;
...
equal(a, null); // returns false when a is not null
equal(null, null); // returns true
equal(a, b); // returns true if a is equal to b
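
For comparison, the JDK itself has offered a similar null-safe check since Java 7: java.util.Objects.equals. The sketch below is not Guava's Objects.equal but a JDK-only analogue, and the class and method names are made up for illustration.

```java
import java.util.Objects;

// Hypothetical demo class (not from the original post).
// java.util.Objects has been part of the JDK since Java 7.
class NullSafeEquals {

    // Never throws: handles null on either side.
    static boolean same(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(same("x", null));
        System.out.println(same(null, null));
        System.out.println(same("x", "x"));
    }
}
```

Both arguments may be null here, so the caller does not need a separate null check before comparing.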

#5 You can use Joiner to join Strings

Instead of:
StringBuffer buffer = new StringBuffer();
for (String str : strs) {
    if (str != null) {
        buffer.append(str);
        buffer.append(", ");
    }
}
if (buffer.length() >= 2) {
    buffer.setLength(buffer.length() - 2); // drop the trailing ", "
}
return buffer.toString();
use this:
import com.google.common.base.Joiner;
...
return Joiner.on(", ").skipNulls().join(strs);

#6 You can use Splitter to split Strings

Instead of:
String str = "abc, bcd,, cde   ,zsa";
String[] split = str.split(",");

// What about trimming? Whitespace? Empty strings?
use this:
import com.google.common.base.Splitter;
...
Splitter.on(',')
       .trimResults()
       .omitEmptyStrings()
       .split("abc, bcd,, cde   ,zsa");
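
To make the questions in the "instead of" snippet concrete, here is a small sketch of what plain String.split returns for the same input. It uses only the JDK; the class and method names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class (not from the original post); uses only the JDK.
class SplitDemo {

    // Splits on a bare comma, exactly like the "instead of" snippet.
    static String[] naiveSplit(String input) {
        return input.split(",");
    }

    public static void main(String[] args) {
        String[] parts = naiveSplit("abc, bcd,, cde   ,zsa");
        // Whitespace around the tokens is kept, and the empty token
        // between the two adjacent commas survives as "".
        System.out.println(parts.length + " parts: " + Arrays.toString(parts));
    }
}
```

The result has five elements: "abc", " bcd", "", " cde   " and "zsa". The padding and the empty string are exactly what trimResults() and omitEmptyStrings() are meant to clean up.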

#7 You can use Multiset to count object occurrences

Instead of:
Map< String, Integer > countMap = new HashMap< String, Integer >();
for (String word : words) {
    if(!countMap.containsKey(word)) {
        countMap.put(word, 0);
    }
    countMap.put(word, countMap.get(word) + 1);
}
use this:
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
...
Multiset< String > wordsMultiset = HashMultiset.create();
wordsMultiset.addAll(words);

#8 You can use Multimap instead of a map with List or Set as values

Instead of:
Map< String, List< String > > languagesMap = new HashMap< String, List< String >>();
for (Programmer programmer : programmers) {
    if (languagesMap.get(programmer.getLanguage()) == null) {
        languagesMap.put(programmer.getLanguage(), new ArrayList< String >());
    }
    languagesMap.get(programmer.getLanguage()).add(programmer.getEmail());
}
use this:
import com.google.common.collect.HashMultimap;
import com.google.common.collect.Multimap;
...
Multimap< String, String > languagesMap = HashMultimap.create();
for (Programmer programmer : programmers) {
    languagesMap.put(programmer.getLanguage(), programmer.getEmail());
}

...more simple but useful examples coming soon...

Wednesday, May 22, 2013

Java GC tuning for High Frequency Trading apps

I have been interested in Java performance issues for some time. This time we look not so much at the performance of Java code as at the performance of the garbage collection process. I was inspired by a talk at the Warsaw JUG about HotSpot in low-latency Java. The speaker, Wojciech Kudla, prepared a small application that simulates an HFT system from the GC point of view. All the source code, together with a great GC Viewer (a plugin for JVisualVM), is available on their GitHub: heaptrasher.

In a nutshell, heaptrasher is an application that generates a lot of garbage for the GC to operate on. In addition, it contains some code that generates a histogram with latency statistics. The statistics data can be collected in two different ways. It is worth mentioning that the first one (array) allocates a lot of memory in the Old Generation space and the second does not. As I found out later, this has a considerable impact on GC performance.
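
heaptrasher's own histogram code is not reproduced here. The following is only a rough sketch of how percentile lines such as "99,9000% of message latency is less than: ..." could be derived from recorded samples, assuming a simple nearest-rank definition. The class name, method name and sample values are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, NOT heaptrasher's actual code: nearest-rank
// percentiles over latency samples recorded in nanoseconds.
class LatencyPercentiles {

    // Returns the smallest sample such that at least `percent` percent
    // of all samples are less than or equal to it.
    static long percentile(long[] samples, double percent) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percent <= 0.0 || percent > 100.0) {
            throw new IllegalArgumentException("percent must be in (0, 100]");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percent / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        long[] latenciesNanos = {1200, 900, 15000, 1100, 1000, 950, 1300, 1050, 980, 1020};
        for (double p : new double[] {50.0, 99.0, 100.0}) {
            System.out.printf("%.4f%% of message latency is at most: %d nanos%n",
                    p, percentile(latenciesNanos, p));
        }
    }
}
```

Sorting a copy of every sample is fine for a sketch, but a real low-latency tool would more likely keep fixed-size buckets so that recording a sample allocates nothing.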

All charts below are from JVisualVM with the mentioned GC Viewer and VisualGC plugins.

Test platform

  • CPU: i7-3612QM
  • RAM: 8GB
  • OS: Windows 7 Ultimate
  • JDK: 1.7.0_u21

A common feature of all the tests is the use of the parallel GC for Young Generation collections. The collector used for the Old Generation doesn't matter, because in this case an Old Generation collection never happens.

First test

Description

4GB heap, array as statistics data holder

Command line

java -XX:+UnlockDiagnosticVMOptions -Xmx4g -Xms4g -XX:+UseParallelGC Main array

Garbage Collector pauses duration

Garbage Collections times

Message latency

 50,0000% of message latency is less than: 1466 nanos
 75,0000% of message latency is less than: 2443 nanos
 90,0000% of message latency is less than: 3421 nanos
 99,0000% of message latency is less than: 4888 nanos
 99,9000% of message latency is less than: 11241 nanos
 99,9900% of message latency is less than: 19549 nanos
 99,9990% of message latency is less than: 75263 nanos
 99,9999% of message latency is less than: 3480174 nanos
 

Summary

6,5 ms pauses? That isn't low latency :(. The graph is also very irregular. 99,9% of messages were processed in under 11241 nanos.

Second test

Description

4GB heap, direct statistics data collection

Command line

java -XX:+UnlockDiagnosticVMOptions -Xmx4g -Xms4g -XX:+UseParallelGC Main direct

Garbage Collector pauses duration

Garbage Collections times

Message latency

 50,0000% of message latency is less than: 1466 nanos
 75,0000% of message latency is less than: 2443 nanos
 90,0000% of message latency is less than: 2933 nanos
 99,0000% of message latency is less than: 4887 nanos
 99,9000% of message latency is less than: 11240 nanos
 99,9900% of message latency is less than: 19060 nanos
 99,9990% of message latency is less than: 82104 nanos
 99,9999% of message latency is less than: 762892 nanos
 

Summary

Hmmm, much better, isn't it? It appears that the Old Generation is a major enemy of low-latency apps... The graph is irregular, as in the first case. The change did not affect the processing time of 99,9% of messages.

Third test

Description

4GB heap, direct statistics data collection, 3GB NewSize

Command line

java -XX:+UnlockDiagnosticVMOptions -Xmx4g -Xms4g -XX:MaxNewSize=3g -XX:NewSize=3g -XX:+UseParallelGC Main direct

Garbage Collector pauses duration

Garbage Collections times

Message latency

 50,0000% of message latency is less than: 1466 nanos
 75,0000% of message latency is less than: 2443 nanos
 90,0000% of message latency is less than: 3421 nanos
 99,0000% of message latency is less than: 4888 nanos
 99,9000% of message latency is less than: 11241 nanos
 99,9900% of message latency is less than: 20038 nanos
 99,9990% of message latency is less than: 69887 nanos
 99,9999% of message latency is less than: 652930 nanos
 

Summary

In general, the change did not affect the time of a single collection, but the total number of collections is more than 2 times lower. It is interesting that increasing the Young Generation space did not affect the time of a single collection. This is because the app generates a lot of short-lived objects, which means the GC has a relatively small graph of live objects to check. Once more, 99,9% of messages are processed with the same latency.

Third test

Description

4GB heap, direct statistics data collection, 3GB NewSize, number_of_gc_threads=cpu_core_number-1

Command line

java -XX:+UnlockDiagnosticVMOptions -Xmx4g -Xms4g -XX:MaxNewSize=3g -XX:NewSize=3g -XX:ParallelGCThreads=7 -XX:+UseGCTaskAffinity -XX:+BindGCTaskThreadsToCPUs Main direct

Garbage Collector pauses duration

Garbage Collections times

Message latency

 50,0000% of message latency is less than: 1466 nanos
 75,0000% of message latency is less than: 2443 nanos
 90,0000% of message latency is less than: 3421 nanos
 99,0000% of message latency is less than: 5375 nanos
 99,9000% of message latency is less than: 11729 nanos
 99,9900% of message latency is less than: 21015 nanos
 99,9990% of message latency is less than: 70865 nanos
 99,9999% of message latency is less than: 612855 nanos
 

Summary

There aren't many changes in GC pause times. I must admit that the chart is more regular. For the first time in a while we also see a change in the processing time of 99,9% of messages.

Fourth test

Description

4GB heap, direct statistics data collection, 3GB NewSize, number_of_gc_threads=cpu_core_number-1 with ParGCCardsPerStrideChunk

Command line

java -XX:+UnlockDiagnosticVMOptions -Xmx4g -Xms4g -XX:MaxNewSize=3g -XX:NewSize=3g -XX:ParallelGCThreads=7 -XX:+UseGCTaskAffinity -XX:+BindGCTaskThreadsToCPUs -XX:ParGCCardsPerStrideChunk=32768 Main direct

Garbage Collector pauses duration

Garbage Collections times

Message latency

 50,0000% of message latency is less than: 1466 nanos
 75,0000% of message latency is less than: 2443 nanos
 90,0000% of message latency is less than: 3421 nanos
 99,0000% of message latency is less than: 5375 nanos
 99,9000% of message latency is less than: 10752 nanos
 99,9900% of message latency is less than: 21503 nanos
 99,9990% of message latency is less than: 76729 nanos
 99,9999% of message latency is less than: 629960 nanos
 

Summary

It looks almost the same as last time, but we have significantly better performance in the processing time of 99,9% of messages.

Summary

During the presentation the lecturer used a Linux-based platform. On Linux it looks like the individual switches have more impact on collection times. A little summary here:

Test number   Collections number   Collections time   Avg. collection time   99,90% latency less than   99,99% latency less than
first         126                  706,342 ms         5,60 ms                11241 nanos                19549 nanos
second        126                  92,002 ms          0,73 ms                11240 nanos                19560 nanos
third         57                   53,738 ms          0,94 ms                11729 nanos                21015 nanos
fourth        57                   46,508 ms          0,81 ms                10752 nanos                21503 nanos
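
The collection counts and times in this summary were read from JVisualVM. The JVM also exposes similar counters through the standard java.lang.management API, so a program can log them itself. The sketch below is not part of heaptrasher; the class and helper names are assumptions made for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of heaptrasher): prints the JVM's own
// per-collector GC counters and the average time per collection.
class GcSummary {

    // Average time per collection in milliseconds; 0 when there were none
    // (or when the JVM reports the counters as undefined, i.e. -1).
    static double averageMillis(long totalMillis, long collections) {
        return collections > 0 ? (double) totalMillis / collections : 0.0;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            long timeMs = gc.getCollectionTime(); // approximate accumulated time in ms
            System.out.printf("%s: %d collections, %d ms total, %.2f ms avg%n",
                    gc.getName(), count, timeMs, averageMillis(timeMs, count));
        }
    }
}
```

Note that getCollectionTime() reports whole milliseconds, so for sub-millisecond pauses like the ones measured here the GC log or a profiler plugin gives a finer picture.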

Warsaw JUG meeting

More info???

Thursday, May 9, 2013

What's next?

I stopped with Scala. It is a really powerful, amazing and flexible language, but for now I cannot find a use for it in my everyday work. That doesn't mean it was wasted time. I think studying it broadened my horizons significantly. What's next? For some time I have been interested in performance tuning in Java. It just so happens that there was a talk about this at a Warszawa JUG meeting recently. The speaker had prepared a program that simulates the load encountered in High Frequency Trading applications. He showed how performance can be affected by setting various JVM parameters. The application code is available on GitHub. In the near future I'm going to try it out.
In addition, next week I'm going to Krakow for the GeeCON conference, and after that I'm going to describe my impressions here...

Monday, April 15, 2013

Guava's Optional... I know where you are from :)

Consider the example of a function that calculates the square root of x. It makes sense to calculate the square root of a number that is greater than or equal to zero, but what if it isn't? Return -1 or throw an exception? That does not sound good... Fortunately google-guava provides a more elegant solution: the Optional class. Let the code speak:
import com.google.common.base.Optional;

public class Example {

    public static Optional< Double > sqrt(Double arg) {
        Optional< Double > result = Optional.absent();
        if(arg >= 0.0) {
            result = Optional.of(Math.sqrt(arg));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Example.sqrt(-2.0));
        System.out.println(Example.sqrt(2.0));
    }
}
Returning Optional allows us to show explicitly that in certain circumstances the function can return a result that doesn't make sense, or is empty, or whatever else. What's best? In Scala, this functionality is built in (as Option in the standard library)... and it comes in quite a nice and readable form:
class Example {

  def sqrt(arg : Double) = {
    if (arg>=0) {
      Some(math.sqrt(arg))
    } else {
      None
    }
  }
}

object Main {
  def main(args: Array[String]) {
    val example = new Example
    println(example.sqrt(-2))
    println(example.sqrt(2))
  }
}
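
For readers on Java 8 or later, the JDK ships its own java.util.Optional, which plays a similar role to Guava's class (it uses empty() where Guava uses absent()). The sketch below redoes the sqrt example with it; it is a JDK-only variant, not the Guava code from this post, and the class name is made up.

```java
import java.util.Optional;

// Hypothetical variant of the Example class, using java.util.Optional
// (Java 8+) instead of Guava's com.google.common.base.Optional.
class JdkOptionalExample {

    static Optional<Double> sqrt(double arg) {
        // Empty result for negative input, where a real square root is undefined.
        return arg >= 0.0 ? Optional.of(Math.sqrt(arg)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(sqrt(-2.0).isPresent());
        // orElse supplies a fallback, similar to or(...) in Guava's Optional.
        System.out.println(sqrt(4.0).orElse(Double.NaN));
    }
}
```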

Wednesday, April 10, 2013

Passing arguments by value or by name? What's the difference?

In Scala there are two ways to pass arguments to a function: by name or by value. I will try to explain the difference and illustrate it by writing equivalent Java code. Let's look at an example class written in Scala:
class Example {
  var invocationCount = 0

  def byValue(arg: String) = {
    println(arg)
    println(arg)
    println(arg)
  }

  def byName(arg: => String) = {
    println(arg)
    println(arg)
    println(arg)
  }


  def getMessage(): String = {
    invocationCount += 1
    "invocation count: " + invocationCount
  }
}


object Main {
  def main(args: Array[String]) {

    val example = new Example
    println(example.byValue(example.getMessage()))
    println(example.byName(example.getMessage()))
  }
}
... and output:
invocation count: 1
invocation count: 1
invocation count: 1
()
invocation count: 2
invocation count: 3
invocation count: 4
()
As you can see, in the first case (method byValue) getMessage was evaluated only once; its result was passed to byValue and then written to the screen three times. The second case is a little different: the argument expression is re-evaluated each time arg is used inside byName, so getMessage runs three times. In simple terms, passing arg by name looks like passing a function as an argument in JavaScript, or a closure in Groovy. What about doing that in Java? Unfortunately it is not possible to do it in such an elegant and readable way in Java, but it is possible to simulate it using an interface. Let's see the Java solution:

// IFunction.java
public interface IFunction< T > {

    T call();
}

// Example.java
public class Example {

    private int invocationCount;

    public void byValue(String arg) {
        System.out.println(arg);
        System.out.println(arg);
        System.out.println(arg);
    }

    public void byName(IFunction< String > function) {
        System.out.println(function.call());
        System.out.println(function.call());
        System.out.println(function.call());
    }


    public String getMessage() {
        return String.format("invocation count: %d", ++invocationCount);
    }

    public static void main(String[] args) {
        final Example example = new Example();
        example.byValue(example.getMessage());
        example.byName(new IFunction< String >() {
            @Override
            public String call() {
                return example.getMessage();
            }
        });
    }
}
It doesn't look as pretty as in Scala or Groovy. We have to wait for Java 8 with lambda expressions...
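
As a rough sketch of what that could look like, assuming Java 8 or later: the hand-written IFunction interface can be swapped for the JDK's java.util.function.Supplier, and the anonymous class for a lambda. The class name is made up, and byName returns the three lines as one string (instead of printing them) only to keep the sketch easy to check.

```java
import java.util.function.Supplier;

// Hypothetical Java 8+ rewrite of the Example class;
// java.util.function.Supplier replaces the IFunction interface.
class LambdaExample {

    private int invocationCount;

    // "By name": the supplier is re-evaluated on every get() call.
    String byName(Supplier<String> arg) {
        return arg.get() + "\n" + arg.get() + "\n" + arg.get();
    }

    String getMessage() {
        return String.format("invocation count: %d", ++invocationCount);
    }

    public static void main(String[] args) {
        LambdaExample example = new LambdaExample();
        // The lambda body runs three times, once per get() call inside byName.
        System.out.println(example.byName(() -> example.getMessage()));
    }
}
```

The call site shrinks to a single line, which is close to the Scala version, although the caller still has to wrap the expression in a lambda explicitly.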

Monday, April 8, 2013

It's time to dive into Scala

After about two years of experience with Java, it's time for something new. I have tried a little bit of Groovy and it really captivated me, but I'm not entirely convinced by the idea of a dynamically typed language. Therefore I decided to try Scala, which is a statically typed language. Furthermore, Scala provides full support for functional programming. It all sounds great! I want to describe my attempts and reflections on this language on this blog. Sorry for my English; it's poor because I do not have any practical experience with it, and that is yet another reason for creating this blog. I will be grateful for all comments and feedback on both Scala and my English.