ONJava.com -- The Independent Source for Enterprise Java

Java RMI: Serialization

Comparing Externalizable to Serializable

Of course, this efficiency comes at a price. Serializable can frequently be implemented by doing two things: declaring that a class implements the Serializable interface and making sure a zero-argument constructor is available (strictly speaking, the serialization mechanism requires one of the nearest nonserializable superclass, not of the serializable class itself). Furthermore, as an application evolves, the serialization mechanism automatically adapts. Because the metadata is automatically extracted from the class definitions, application programmers often don't have to do anything except recompile the program.

On the other hand, Externalizable isn't particularly easy to do, isn't very flexible, and requires you to rewrite your marshalling and demarshalling code whenever you change your class definitions. However, because it eliminates almost all the reflective calls used by the serialization mechanism and gives you complete control over the marshalling and demarshalling algorithms, it can result in dramatic performance improvements.

To demonstrate this, I have defined the EfficientMoney class. It has the same fields and functionality as Money but implements Externalizable instead of Serializable:

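import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class EfficientMoney extends ValueObject implements Externalizable {
    public static final long serialVersionUID = 1;
    protected int _cents;

    // Externalizable requires a public no-argument constructor; without one,
    // deserialization fails with an InvalidClassException.
    public EfficientMoney() {
        this(0);
    }

    public EfficientMoney(Integer cents) {
        this(cents.intValue());
    }

    public EfficientMoney(int cents) {
        super(cents + " cents.");
        _cents = cents;
    }

    // Read back the single int, then rebuild the derived string.
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        _cents = in.readInt();
        _stringifiedRepresentation = _cents + " cents.";
    }

    // Only the int is written; the derived string is not marshalled.
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(_cents);
    }
}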
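A round trip makes the mechanics easier to see. The following is only a minimal sketch: Cents and RoundTripDemo are hypothetical names, with Cents standing in for EfficientMoney without the ValueObject superclass (which is not shown in this section). It serializes to memory rather than to a file. Note the public no-argument constructor, which the serialization mechanism calls before invoking readExternal().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for EfficientMoney (no ValueObject superclass).
class Cents implements Externalizable {
    private static final long serialVersionUID = 1L;
    private int cents;
    private String text; // derived state: rebuilt on read, never written

    // Required by Externalizable: called during deserialization.
    public Cents() {
        this(0);
    }

    public Cents(int cents) {
        this.cents = cents;
        this.text = cents + " cents.";
    }

    public int getCents() { return cents; }
    public String getText() { return text; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        cents = in.readInt();
        text = cents + " cents.";
    }
}

class RoundTripDemo {
    // Serializes an object into an in-memory byte array.
    static byte[] toBytes(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Rebuilds an object from bytes produced by toBytes().
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Cents copy = (Cents) fromBytes(toBytes(new Cents(1000)));
        System.out.println(copy.getText()); // prints "1000 cents."
    }
}
```

Only the int travels in the object's own data; the string is recomputed on the receiving side, which is the kind of control over marshalling that Externalizable buys.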

We now want to compare Money with EfficientMoney. We'll do so using the following application:

import java.io.FileOutputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class MoneyWriter {
    public static void main(String[] args) {
        writeOne();
        writeMany();
    }

    private static void writeOne() {
        try {
            System.out.println("Writing one instance");
            Money money = new Money(1000);
            writeObject("C:\\temp\\foo", money);
        } catch (Exception e) {
            e.printStackTrace(); // report failures instead of silently swallowing them
        }
    }

    private static void writeMany() {
        try {
            System.out.println("Writing many instances");
            List<Money> listOfMoney = new ArrayList<Money>();
            for (int i = 0; i < 10000; i++) {
                listOfMoney.add(new Money(i * 100));
            }
            writeObject("C:\\temp\\foo2", listOfMoney);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Serializes the object to the named file and prints the elapsed time in milliseconds.
    private static void writeObject(String filename, Object object) throws Exception {
        FileOutputStream fileOutputStream = new FileOutputStream(filename);
        ObjectOutputStream objectOutputStream = new ObjectOutputStream(fileOutputStream);
        long startTime = System.currentTimeMillis();
        objectOutputStream.writeObject(object);
        objectOutputStream.flush();
        objectOutputStream.close();
        System.out.println("Time: " + (System.currentTimeMillis() - startTime));
    }
}

On my home machine, averaging over 10 trial runs for both Money and EfficientMoney, I get the results shown in Table 10-1. (We need to average because the elapsed time varies with whatever else the computer is doing. The size of the file is, of course, constant.)

Table 10-1: Testing Money and EfficientMoney

Class            Number of instances   File size   Elapsed time
Money            1                     266 bytes   60 milliseconds
Money            10,000                309 KB      995 milliseconds
EfficientMoney   1                     199 bytes   50 milliseconds
EfficientMoney   10,000                130 KB      907 milliseconds

These results are fairly impressive. By simply converting a leaf class in our hierarchy to use externalization, I save 67 bytes and 10 milliseconds when serializing a single instance. In addition, as I pass larger data sets over the wire, I save more and more bandwidth--on average, 18 bytes per instance.
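The averaging can be automated instead of being done by hand. The sketch below is one possible approach, not the harness used for the numbers above. SerializationTimer and its methods are hypothetical names. It serializes to memory instead of to a file, and the sample data is a list of Integer values standing in for Money (which is not defined in this section), so its output is not comparable to the table.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class SerializationTimer {
    // Serializes the object once into memory and returns the encoded size in bytes.
    static int serializedSize(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.size();
    }

    // Returns the mean elapsed time, in milliseconds, over the given number of trials.
    static double averageMillis(Object object, int trials) throws IOException {
        long totalNanos = 0;
        for (int i = 0; i < trials; i++) {
            long start = System.nanoTime();
            serializedSize(object);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (trials * 1_000_000.0);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data: 10,000 boxed integers rather than Money instances.
        List<Integer> sample = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            sample.add(i * 100);
        }
        System.out.println("Size: " + serializedSize(sample) + " bytes");
        System.out.println("Average time: " + averageMillis(sample, 10) + " ms");
    }
}
```

The size is the same on every run, while the time is not. That is why only the elapsed time needs to be averaged.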

TIP: Which numbers should we pay attention to? The single-instance costs or the 10,000-instance costs? For most applications, the single-instance cost is the most important one. A typical remote method call involves sending three or four arguments (usually of different types) and getting back a single return value. Since RMI clears the serialization mechanism between calls, a typical remote method call looks a lot more like serializing 3 or 4 single instances than serializing 10,000 instances of the same class.
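The effect of clearing the stream state can be imitated with ObjectOutputStream.reset(). This is an illustration under assumptions, not a trace of what RMI does internally: ResetDemo and Amount are hypothetical names, and reset() is used here only to mimic the per-call behavior described in the tip. Without a reset, the class description is written once and later instances refer back to it. With a reset after every write, it is written again each time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ResetDemo {
    // Hypothetical value class used only for this comparison.
    static class Amount implements Serializable {
        private static final long serialVersionUID = 1L;
        final int cents;

        Amount(int cents) {
            this.cents = cents;
        }
    }

    // Writes the given number of instances to one stream and returns the total byte count.
    static int bytesFor(int instances, boolean resetBetweenWrites) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (int i = 0; i < instances; i++) {
                out.writeObject(new Amount(i * 100));
                if (resetBetweenWrites) {
                    out.reset(); // forget everything written so far, including class descriptions
                }
            }
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Shared stream state: " + bytesFor(4, false) + " bytes");
        System.out.println("Reset after each:    " + bytesFor(4, true) + " bytes");
    }
}
```

The second figure is larger because the per-class overhead is paid on every write, which is why the single-instance cost is the more relevant number for remote method calls.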

If I need more efficiency, I can go further and remove ValueObject from the hierarchy entirely. The ReallyEfficientMoney class directly extends Object and implements Externalizable:

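import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class ReallyEfficientMoney implements Externalizable {
    public static final long serialVersionUID = 1;
    protected int _cents;
    protected String _stringifiedRepresentation;

    // Externalizable requires a public no-argument constructor; without one,
    // deserialization fails with an InvalidClassException.
    public ReallyEfficientMoney() {
        this(0);
    }

    public ReallyEfficientMoney(Integer cents) {
        this(cents.intValue());
    }

    public ReallyEfficientMoney(int cents) {
        _cents = cents;
        _stringifiedRepresentation = _cents + " cents.";
    }

    // Read back the single int, then rebuild the derived string.
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        _cents = in.readInt();
        _stringifiedRepresentation = _cents + " cents.";
    }

    // Only the int is written; the derived string is not marshalled.
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(_cents);
    }
}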

ReallyEfficientMoney has much better performance than either Money or EfficientMoney when a single instance is serialized but is almost identical to EfficientMoney for large data sets. Again, averaging over 10 iterations, I record the numbers in Table 10-2.

Table 10-2: Testing ReallyEfficientMoney

Class                  Number of instances   File size   Elapsed time
ReallyEfficientMoney   1                     74 bytes    20 milliseconds
ReallyEfficientMoney   10,000                127 KB      927 milliseconds

Compared to Money, this is quite impressive; I've shaved almost 200 bytes of bandwidth and saved 40 milliseconds for the typical remote method call. The downside is that I've had to abandon my object hierarchy completely to do so; a significant percentage of the savings resulted from not including ValueObject in the inheritance chain. Removing superclasses makes code harder to maintain and forces programmers to implement the same method many times (ReallyEfficientMoney can't use ValueObject's implementation of equals() and hashCode() anymore). But it does lead to significant performance improvements.
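To make the maintenance cost concrete, here is roughly what each such class would have to carry itself. This is a sketch under assumptions: StandaloneMoney is a hypothetical name, and since ValueObject's equals() and hashCode() are not shown in this section, comparing on the number of cents is my guess at reasonable value semantics, not a copy of the original behavior.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hypothetical class with no superclass to inherit equality from.
class StandaloneMoney implements Externalizable {
    private static final long serialVersionUID = 1L;
    private int cents;

    // Required by Externalizable.
    public StandaloneMoney() {
    }

    public StandaloneMoney(int cents) {
        this.cents = cents;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        cents = in.readInt();
    }

    // Hand-written value equality (assumed semantics: equal cents means equal objects).
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof StandaloneMoney)) {
            return false;
        }
        return cents == ((StandaloneMoney) other).cents;
    }

    // Must stay consistent with equals().
    @Override
    public int hashCode() {
        return Integer.hashCode(cents);
    }
}
```

Every value class that drops the shared superclass needs its own copy of these two methods, and each copy has to be kept in step with that class's fields.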

Related Reading

Java RMI
By William Grosso

One Final Point

An important point is that you can decide whether to implement Externalizable or Serializable on a class-by-class basis. Within the same application, some of your classes can be Serializable, and some can be Externalizable. This makes it easy to evolve your application in response to actual performance data and shifting requirements. The following two-part strategy often works well:

  • Make all your classes implement Serializable.
  • After that, make some of them, the ones you send often and for which serialization is dramatically inefficient, implement Externalizable instead.

This gets you most of the convenience of serialization and lets you use Externalizable to optimize when appropriate.
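The two kinds of classes mix freely in one object graph. In the sketch below, all names (Price, Invoice, MixedDemo) are hypothetical. Invoice relies on default serialization, while the Price it holds has been converted to Externalizable, as one might do for a class that is sent often.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Frequently sent class: hand-optimized with Externalizable.
class Price implements Externalizable {
    private static final long serialVersionUID = 1L;
    private int cents;

    public Price() { // required by Externalizable
    }

    public Price(int cents) {
        this.cents = cents;
    }

    public int getCents() { return cents; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        cents = in.readInt();
    }
}

// Rarely sent class: left on default serialization.
class Invoice implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String customer;
    private final Price total;

    Invoice(String customer, Price total) {
        this.customer = customer;
        this.total = total;
    }

    String getCustomer() { return customer; }
    Price getTotal() { return total; }
}

class MixedDemo {
    // Serializes the invoice to memory and reads it back.
    static Invoice roundTrip(Invoice invoice) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(invoice);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Invoice) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Invoice copy = roundTrip(new Invoice("Acme", new Price(1999)));
        System.out.println(copy.getCustomer() + ": " + copy.getTotal().getCents());
    }
}
```

The calling code is the same either way, so a class can be switched from one interface to the other without touching the classes that contain it.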

Experience has shown that, over time, more and more objects will gradually come to directly extend Object and implement Externalizable. But that's fine. It simply means that the code was incrementally improved in response to performance problems when the application was deployed.




Related articles:

Learning Command Objects and RMI -- O'Reilly's Java RMI author William Grosso introduces you to the basic ideas behind command objects by providing a translation service from a remote server and using command objects to structure the RMI made from a client program.

Seamlessly Caching Stubs for Improved Performance -- In Part 2 of this RMI series, William Grosso addresses a common problem with RMI apps -- too many remote method calls to a naming service. In this article he extends the framework introduced in Part 1 to provide seamless caching of stubs.

Generics and Method Objects -- O'Reilly's Java RMI author William Grosso introduces you to the new Generics Specification and rebuilds his command object framework using it.

