Tuesday, February 28, 2017

Java 8. Object.hashCode default implementation

It is commonly assumed that the default Object.hashCode and System.identityHashCode are computed by converting the object's address to an int. That is not a strict rule. In the HotSpot JVM 8 the implementation of Object.hashCode can be selected with the option -XX:hashCode=N, where N is 0 to 5. The available functions are defined in the method get_next_hash in http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/share/vm/runtime/synchronizer.cpp. Here is the list of possible values:

0 - unguarded global Park-Miller RNG;
1 - function of the object's address bits;
2 - constant 1;
3 - sequence number;
4 - object address;
5 - Marsaglia's xor-shift scheme with thread-specific state.

JVM 8 uses Marsaglia's xor-shift option by default. See http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/runtime/globals.hpp and the line:
product(intx, hashCode, 5, "(Unstable) select hashCode generation algorithm")
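A small sketch for observing the effect of the option. The class name IdentityHashDemo and its helper are made up for illustration. The assumption is a JDK 8 HotSpot build, where the flag can be passed directly, for example java -XX:hashCode=2 IdentityHashDemo; with mode 2 every line should show the same constant.

```java
public class IdentityHashDemo {

    // For a class that does not override hashCode(), both calls
    // return the same identity hash, whichever mode generated it.
    static boolean matchesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            Object o = new Object();
            // The hash is generated lazily on first request and then
            // stored in the object header, so it never changes afterwards.
            int first = System.identityHashCode(o);
            int second = System.identityHashCode(o);
            System.out.println(Integer.toHexString(first)
                    + " stable=" + (first == second)
                    + " sameAsHashCode=" + matchesIdentityHash(o));
        }
    }
}
```

The printed values depend on the selected mode and, for the default one, on thread-local random state, so they are not expected to look like addresses.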


Saturday, February 18, 2017

Java 8: Integer

Everyone knows that Java has autoboxing:
 Integer one = 4;
 Integer two = 4;
 Assert.assertTrue(one==two);

 Integer $600 = 600;
 Integer also600 = 600;
 Assert.assertTrue($600 != also600);


This behaves as specified in the JLS (http://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.7), which guarantees that boxing an int between -128 and 127 always yields the same reference.
It is implemented by the cache inside the Integer class, which is consulted by Integer.valueOf, the method that boxing uses.

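What's more, we can change the outcome of the second check. The cache inside Integer can be tuned by passing the system property -Djava.lang.Integer.IntegerCache.high=600 to the JVM (HotSpot also exposes this as -XX:AutoBoxCacheMax=600). With that setting both 600 values come from the cache, so $600 == also600. The bad thing is that the lower bound of the Integer cache cannot be tuned.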
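A minimal sketch of the caching behavior, using a made-up class IntegerCacheDemo. The result for 600 is deliberately left unstated because it depends on whether the cache upper bound was raised; equals() is the comparison that does not depend on the cache at all.

```java
public class IntegerCacheDemo {

    // Boxes the same int twice and reports whether both boxes
    // are the very same object (reference equality).
    static boolean sameBoxedInstance(int value) {
        Integer a = value; // the compiler emits Integer.valueOf(value)
        Integer b = value;
        return a == b;
    }

    public static void main(String[] args) {
        // Always true: -128..127 is guaranteed to be cached.
        System.out.println("127 same instance: " + sameBoxedInstance(127));

        // Depends on the configured upper bound of the cache.
        System.out.println("600 same instance: " + sameBoxedInstance(600));

        // Value comparison works regardless of caching.
        Integer x = 600;
        Integer y = 600;
        System.out.println("600 equals: " + x.equals(y));
    }
}
```

Running it once with default settings and once with -Djava.lang.Integer.IntegerCache.high=600 shows the second line changing while the other two stay the same.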

Another example is:
 Integer $4 = new Integer(4);
 Integer $anotherInstance = 4;
 Assert.assertTrue($anotherInstance!=$4);

This is because new Integer(4) bypasses the cache. Why? Is it good design?
Or is it the evil of premature optimization?
Interestingly, the other wrapper classes also contain a cache, but only Integer allows the size of its cache to be tuned.

One more thing: do we actually need the Integer constructor that accepts a String if the same goal can be achieved with the Integer.valueOf method?

Tuesday, February 14, 2017

Java StringBuilder and StringBuffer

To learn more about the implementation of the JVM I started reading the Java source code. StringBuilder and StringBuffer are two frequently used classes.
The documentation recommends StringBuilder because it will be faster under most implementations. But the documentation does not say how large the performance gain might be.
I wrote a quick check that appends a String and a char with both classes.
For the tests I used Oracle JDK 1.8.0b51. Here is the sample code:
    @Test
    public void stringBufferTest(){
        Eap.execute(()->{
            StringBuffer stringBuffer = new StringBuffer();
            String a = "a";
            for (int i = 0; i< $256_MEGABYTES; i++){
                stringBuffer.append(a);
            }
        });
    }
    @Test
    public void stringBuilderTest(){
        Eap.execute(()->{
            StringBuilder stringBuilder = new StringBuilder();
            String b = "b";
            for (int i = 0; i< $256_MEGABYTES; i++){
                stringBuilder.append(b);
            }

        });
    }

And the results (StringBuffer first, then StringBuilder):
Time is: 9,105000 seconds
Time is: 3,220000 seconds

Absolute numbers say little on their own; the ratio is what matters: StringBuilder is about 2.8 times faster than StringBuffer.

For chars I just replaced String a = "a" with char a = 'a' and String b = "b" with char b = 'b'.

When appending a char we get the following results:
Time is: 8,623000 seconds
Time is: 1,221000 seconds
So StringBuilder is about 7 times faster than StringBuffer.

Although these results were obtained on relatively long strings, it is now much clearer which class to use in your application (depending on the requirements, of course: StringBuffer's methods are synchronized, StringBuilder's are not). An interesting fact is that both StringBuffer and StringBuilder extend AbstractStringBuilder.
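The helper Eap.execute and the constant $256_MEGABYTES used in the tests above are not shown in this post. As a rough, self-contained stand-in, the comparison could be sketched as follows. The class AppendTiming, the System.nanoTime-based timer and the much smaller iteration count are assumptions for illustration, and a naive loop like this is no substitute for a proper benchmark harness.

```java
public class AppendTiming {

    // Hypothetical iteration count, far smaller than in the original test.
    static final int ITERATIONS = 5_000_000;

    // Runs the task once and returns the elapsed wall-clock time in seconds.
    static double timeSeconds(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e9;
    }

    // Appends n chars through the synchronized StringBuffer.
    static int appendWithBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.length(); // returned so the work has a visible result
    }

    // Appends n chars through the unsynchronized StringBuilder.
    static int appendWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('b');
        }
        return sb.length();
    }

    public static void main(String[] args) {
        System.out.printf("StringBuffer:  %f seconds%n",
                timeSeconds(() -> appendWithBuffer(ITERATIONS)));
        System.out.printf("StringBuilder: %f seconds%n",
                timeSeconds(() -> appendWithBuilder(ITERATIONS)));
    }
}
```

The numbers it prints will vary with the JVM version, warm-up and hardware, so they should not be expected to reproduce the ratios above.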

A few questions about the internal design of these classes:
1) Both of them expose their internal capacity, which is 16 by default. Why 16? What is this magic number for?
2) The overloaded constructors have different semantics: one constructor accepts a capacity and another one accepts a string.
3) Both classes provide the methods capacity() and length(), and both return data. Why weren't they named with a get prefix, like getCapacity() and getLength()?
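The first two points are easy to see in a few lines. The class name CapacityDemo is made up for this sketch; the printed values follow from the documented constructor behavior (default capacity 16, and 16 plus the argument's length for the String constructor).

```java
public class CapacityDemo {

    public static void main(String[] args) {
        StringBuilder byDefault = new StringBuilder();        // default capacity
        StringBuilder withCapacity = new StringBuilder(64);   // int means capacity
        StringBuilder fromString = new StringBuilder("64");   // String means content

        // prints "16 / 0"
        System.out.println(byDefault.capacity() + " / " + byDefault.length());
        // prints "64 / 0"
        System.out.println(withCapacity.capacity() + " / " + withCapacity.length());
        // prints "18 / 2" (16 + length of the initial content)
        System.out.println(fromString.capacity() + " / " + fromString.length());
    }
}
```

So new StringBuilder(64) and new StringBuilder("64") look alike at the call site but produce an empty builder in one case and a two-character one in the other.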

Sunday, February 12, 2017

Validate XML by XSD with includes from classpath

Validating XML files is a common task.
It is very convenient to store the XSDs on the classpath and validate XML against them.
Problems start when an XSD imports other schemas: the schema parser doesn't know how to resolve the location of the imports. There is no out-of-the-box solution to this problem.

I created a sample project that solves this task: 
https://github.com/chernykhalexander/xsdValidator
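Independently of the linked project, whose code is not reproduced here, one way to approach the problem with the standard javax.xml.validation API is to register an LSResourceResolver that looks up imported schemas on the classpath. The sketch below is an illustration under assumptions: the class ClasspathXsdValidator and the folder layout are hypothetical, and all schemas are assumed to sit in one classpath folder and to be referenced by plain file names in schemaLocation.

```java
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

public class ClasspathXsdValidator {

    // Builds a resolver that maps a schemaLocation such as "types.xsd"
    // to the classpath resource basePath + "types.xsd" (hypothetical layout).
    static LSResourceResolver classpathResolver(String basePath) {
        final DOMImplementationLS ls;
        try {
            ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation", e);
        }
        return (type, namespaceURI, publicId, systemId, baseURI) -> {
            if (systemId == null) {
                return null; // nothing to look up, use default resolution
            }
            InputStream in =
                    ClasspathXsdValidator.class.getResourceAsStream(basePath + systemId);
            if (in == null) {
                return null; // not on the classpath, use default resolution
            }
            LSInput input = ls.createLSInput();
            input.setByteStream(in);
            input.setSystemId(systemId);
            input.setPublicId(publicId);
            input.setBaseURI(baseURI);
            return input;
        };
    }

    // Validates the XML stream against a root schema stored in basePath;
    // throws an exception if the document is invalid.
    static void validate(String basePath, String rootXsd, InputStream xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver(classpathResolver(basePath));
        try (InputStream xsd =
                     ClasspathXsdValidator.class.getResourceAsStream(basePath + rootXsd)) {
            if (xsd == null) {
                throw new IllegalArgumentException("Schema not found: " + basePath + rootXsd);
            }
            Schema schema = factory.newSchema(new StreamSource(xsd));
            schema.newValidator().validate(new StreamSource(xml));
        }
    }
}
```

A call could look like ClasspathXsdValidator.validate("/xsd/", "order.xsd", xmlStream), where both names are placeholders. Schemas that reference each other through relative paths with subfolders would need extra path handling that this sketch leaves out.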