I wanted to improve my Java fundamentals, so I decided to write a few notes that will help me in the future (obviously, at interview time). The following questions and answers have been taken from stackoverflow.com and similar sites. This post is a sticky note for myself; please do not use the answers as a reference.
I have heard a lot about serialization, but do I really know its purpose? When someone tells me that serialization is required to save data in databases and files, it confuses my small brain. Did I ever serialize a class explicitly? Let's find some answers to these questions.
What do you mean by Serialization?
Serialization is the conversion of an object to a
series of bytes, so that the object can be easily saved to persistent storage
or streamed across a communication link. The byte stream can then be
deserialised - converted into a replica of the original object.
Objects to be stored and retrieved
frequently refer to other objects. Those other objects must be stored and
retrieved at the same time to maintain the relationships between the objects.
When an object is stored, all of the objects that are reachable from that
object are stored as well.
The goals for serializing Java objects are to:
- Have a simple yet extensible mechanism.
- Maintain the Java object type and safety properties in the serialized form.
- Be extensible to support marshaling and unmarshaling as needed for remote objects.
- Be extensible to support simple persistence of Java objects.
- Require per class implementation only for customization.
- Allow the object to define its external format.
Understanding Serialization with an analogy
After many years of hard work, Earth's scientists developed a robot that could help them with their daily work. But this robot had fewer features than the robots developed by the scientists of Mars.
After a meeting between the scientists of both planets, it was decided that Mars would send its robots to Earth. But a problem occurred: sending 100 robots to Earth would cost $100 million, and the journey would take around 60 days.
Finally, the Martian scientists decided to share their secret with Earth's scientists. This secret was the structure of the robot (the class). Earth's scientists built the same structure on Earth. The Martian scientists then serialized the data of each robot and sent it to Earth, where Earth's scientists deserialized the data and fed it into each robot accordingly.
This process saved the time and cost of transporting the robots themselves; only their data had to travel.
Some of the robots were being used for defence work on Mars, so the Martian scientists marked some crucial properties of those robots as transient before sending their data to Earth. Note that when the object is deserialized, a transient field is set to null (for a reference type) or to its default value (for a primitive type).
Earth's scientists noticed one more point: the Martian scientists asked them to create some static variables to hold environmental details used by some of the robots. The Martian scientists did not send these details, because the environment on Earth is different from that on Mars.
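The analogy can be tried out in a few lines. The following is only a minimal sketch: the Robot class, its field names and the RobotTransferDemo helper are made up for illustration, and the "transfer" is just an in-memory byte array rather than a real network link.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RobotTransferDemo {
    // Serializes a robot into an in-memory byte array ("sending from Mars").
    static byte[] toBytes(Robot robot) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(robot);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a robot from the byte array ("receiving on Earth").
    static Robot fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Robot) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Robot.environment = "Mars";
        byte[] data = toBytes(new Robot("R-1", "launch-code", 7));

        Robot.environment = "Earth"; // the receiving side sets its own static value
        Robot copy = fromBytes(data);

        System.out.println(copy.name);         // R-1  (ordinary field, restored)
        System.out.println(copy.secretCode);   // null (transient reference)
        System.out.println(copy.armourLevel);  // 0    (transient primitive)
        System.out.println(Robot.environment); // Earth (static, never in the stream)
    }
}

// Hypothetical class used only for this illustration.
class Robot implements Serializable {
    private static final long serialVersionUID = 1L;

    static String environment;    // belongs to the class, so it is not serialized
    String name;                  // part of the serialized state
    transient String secretCode;  // skipped; null after deserialization
    transient int armourLevel;    // skipped; 0 after deserialization

    Robot(String name, String secretCode, int armourLevel) {
        this.name = name;
        this.secretCode = secretCode;
        this.armourLevel = armourLevel;
    }
}
```

Only the name survives the trip; the transient fields come back with default values, and the static field simply keeps whatever value the receiving JVM has assigned.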
Understanding Serialization with real-life examples
- Banking example: When an account holder tries to withdraw money through an ATM, the account holder's information and the withdrawal details are serialized (marshalled/flattened to bytes) and sent to the server, where they are deserialized (unmarshalled/rebuilt from the bytes) and used to perform the operation. This reduces the number of network calls: the whole object is sent at once, so the server does not need to ask the client for further information.
- Stock example: Let's say a user wants stock updates immediately on request. To achieve this, every time there is an update we can serialize it and save it to a file. When the user requests the information, we deserialize it from the file and return it. This way the user does not have to wait while we hit the database, perform computations and get the result.
Why use Serialization?
- To persist data for future use. Data can be persisted in a database server, the file system or any other medium that can hold it long term.
- To send data to a remote computer using client/server Java technologies such as RMI or socket programming.
- To "flatten" an object into an array of bytes in memory.
- To exchange data between applets and servlets.
- To store user sessions in web applications.
- To activate/passivate enterprise Java beans.
- To send objects between the servers in a cluster.
Terms to understand serialization
Static fields
If you are a Java programmer, static fields need no explanation here. Remember that static fields belong to the class rather than to the object, so they are not serialized with the object's state.
serialVersionUID
The serialization runtime associates with each serializable class
a version number, called a serialVersionUID, which is used during
deserialization to verify that the sender and receiver of a serialized object
have loaded classes for that object that are compatible with respect to
serialization. If the receiver has loaded a class for the object that has a
different serialVersionUID than that of the corresponding sender's class, then
deserialization will result in an InvalidClassException. A serializable
class can declare its own serialVersionUID explicitly by declaring a field
named "serialVersionUID" that must be static, final, and of
type long:
ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;
If a serializable class does not explicitly declare a
serialVersionUID, then the serialization runtime will calculate a default
serialVersionUID value for that class based on various aspects of the class, as
described in the Java(TM) Object Serialization Specification. However, it
is strongly recommended that all serializable classes explicitly
declare serialVersionUID values, since the default serialVersionUID computation
is highly sensitive to class details that may vary depending on compiler
implementations, and can thus result in unexpected InvalidClassExceptions during
deserialization. Therefore, to guarantee a consistent serialVersionUID value
across different java compiler implementations, a serializable class must
declare an explicit serialVersionUID value. It is also strongly advised that
explicit serialVersionUID declarations use the private modifier where possible,
since such declarations apply only to the immediately declaring
class--serialVersionUID fields are not useful as inherited members.
So serialVersionUID serves two purposes:
- It acts as version metadata that the serialization and deserialization process uses to check class compatibility.
- It can be changed deliberately in a future release to reject serialized objects written by older, incompatible versions of the class.
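To see which version number the runtime associates with a class, java.io.ObjectStreamClass can be queried. This is a small sketch; Invoice and Receipt are hypothetical classes invented for the example, and the default value computed for Receipt is deliberately not shown because it depends on class details.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {
    // Returns the serialVersionUID the serialization runtime uses for the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Invoice.class)); // 42, the explicitly declared value
        System.out.println(uidOf(Receipt.class)); // a default computed from class details
    }
}

// Hypothetical class with an explicit version number.
class Invoice implements Serializable {
    private static final long serialVersionUID = 42L;
    String number;
}

// Hypothetical class that relies on the computed default.
class Receipt implements Serializable {
    String number;
}
```

The computed default for Receipt can change when the class changes, which is why declaring the value explicitly is recommended.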
Transient fields
Variables may be marked transient to indicate that they are not part of the persistent state of an object.
The basic
mechanism of Java serialization is simple to use, but there are some more
things to know. As mentioned before, only objects marked Serializable can
be persisted. The java.lang.Object class does not implement that
interface. Therefore, not all the objects in Java can be persisted
automatically. The good news is that most of them -- like AWT and Swing GUI
components, strings, and arrays -- are serializable.
On the other hand, certain system-level classes such as Thread, OutputStream and its subclasses, and Socket are not serializable. Indeed, it would not make any sense if they were. For example, a thread running in my JVM uses my system's memory; persisting it and trying to run it in your JVM would make no sense at all. Another important point about java.lang.Object not implementing the Serializable interface is that any class you create that extends only Object (and no other serializable class) is not serializable unless you implement the interface yourself.
That situation presents a problem: what if we have a class that contains an instance of Thread? Can we ever persist objects of that type? The answer is yes, as long as we tell the serialization mechanism our intentions by marking our class's Thread field as transient.
Another reason to use transient is when your class
does some kind of internal caching. If, for example, your class can do
calculations and for performance reasons it caches the result of each
calculation, then saving that cache might not be desirable (because recalculating
it might be faster than restoring it, or because it's unlikely that old cached
values are of any use). In this case you'd mark the caching fields as transient.
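The caching case could look roughly like this. It is a sketch under assumptions: Circle, its cachedArea field and the CacheDemo helper are hypothetical names, and the cache is rebuilt lazily on first use after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CacheDemo {
    // Serializes the circle to memory and reads a copy back.
    static Circle roundTrip(Circle circle) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(circle);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Circle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Circle original = new Circle(2.0);
        original.area();                           // fills the cache
        Circle copy = roundTrip(original);
        System.out.println(copy.isAreaCached());   // false: the cache was not saved
        System.out.println(copy.area() == original.area()); // true: recomputed on demand
    }
}

// Hypothetical class with a cached, recomputable value.
class Circle implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double radius;
    private transient Double cachedArea; // not persisted; null after deserialization

    Circle(double radius) {
        this.radius = radius;
    }

    double area() {
        if (cachedArea == null) {
            cachedArea = Math.PI * radius * radius; // recompute when the cache is empty
        }
        return cachedArea;
    }

    boolean isAreaCached() {
        return cachedArea != null;
    }
}
```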
Let’s serialize a class
Java provides two ways to serialize a class. A developer can use either the default mechanism, by implementing the java.io.Serializable interface, or a custom mechanism, by implementing the java.io.Externalizable interface. Serializable is a marker (tag) interface: it declares no methods and simply tells the serialization runtime that instances of the class may be serialized. Externalizable declares two methods, writeExternal and readExternal, which the class must implement to write and read its own state.
Let’s serialize the class with default implementation.
Create a class which implements java.io.Serializable interface.
List 1
import java.io.Serializable;
import java.util.Date;

public class Person implements Serializable {
    private static final long serialVersionUID = -8708626244864641161L;

    String name;
    Date dob;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Date getDob() { return dob; }
    public void setDob(Date dob) { this.dob = dob; }
}
Create an ObjectOutputStream by wrapping an applicable OutputStream. For example, to write to a file, the applicable output stream is FileOutputStream.
List 2
FileOutputStream fos = new FileOutputStream(new File("Person.ser"));
ObjectOutputStream oos = new ObjectOutputStream(fos);
Call the writeObject method of ObjectOutputStream with the instance you want to serialize.
List 3
oos.writeObject(personInstance);
Let's combine the last two steps into one program.
List 4
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Date;

public class DemoSerialization {
    public static void main(String[] args) throws IOException {
        Person per = new Person();
        per.setName("Vishal Jain");
        // Deprecated constructor: the year is offset by 1900 and the month
        // is 0-based, so this is actually 1 February 3871.
        per.setDob(new Date(1971, 1, 1));
        // try-with-resources flushes and closes both streams.
        try (FileOutputStream fos = new FileOutputStream(new File("Person.ser"));
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(per);
        }
    }
}
Running this program creates a file Person.ser in the current folder, which contains the serialized Person object.
Let’s de-serialize the class with default implementation.
Create an ObjectInputStream by wrapping an applicable InputStream. For example, to read from a file, the applicable input stream is FileInputStream.
List 5
FileInputStream fis = new FileInputStream(new File("Person.ser"));
ObjectInputStream ois = new ObjectInputStream(fis);
Call the readObject method of ObjectInputStream and cast the result to the type you expect.
List 6
Person per = (Person) ois.readObject();
Let's combine the last two steps into one program.
List 7
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public class DemoSerialization {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (FileInputStream fis = new FileInputStream(new File("Person.ser"));
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            Person per = (Person) ois.readObject();
            System.out.println("Name:" + per.getName() + " DOB:" + per.getDob());
        }
    }
}
The output of the program is as follows:
List 8
Name:Vishal Jain DOB:Wed Feb 01 00:00:00 IST 3871
Let’s check the importance of serialVersionUID
Our application has a Person class with a serialVersionUID, and the client holds a persisted object. Another developer, unaware of the importance of serialVersionUID, changes it as follows.
List 9
import java.io.Serializable;
import java.util.Date;

public class Person implements Serializable {
    // Changing -8708626244864641161L to 8708626244864641161L.
    private static final long serialVersionUID = 8708626244864641161L;

    String name;
    Date dob;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Date getDob() { return dob; }
    public void setDob(Date dob) { this.dob = dob; }
}
On re-running List 7 with the new Person class (List 9), the following exception is encountered.
List 10
Exception in thread "main" java.io.InvalidClassException: Person; local class incompatible: stream classdesc serialVersionUID = -8708626244864641161, local class serialVersionUID = 8708626244864641161
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:562)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1582)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1731)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at DemoSerialization.main(DemoSerialization.java:12)
So far serialization looks quite simple, but is it really as simple as demonstrated? Let's understand it through various cases.
When an object has a reference to other objects
Not all classes are simple enough to contain only primitives. Most classes contain objects of other Java classes. So what happens when an application tries to serialize such classes?
Let's introduce the Person's address. To add it, the application needs an Address class on the classpath.
List 11
package com.serial;

public class Address {
    int houseNo;
    String city;
    int pincode;

    public Address(int houseNo, String city, int pincode) {
        this.houseNo = houseNo;
        this.city = city;
        this.pincode = pincode;
    }

    public int getHouseNo() { return houseNo; }
    public void setHouseNo(int houseNo) { this.houseNo = houseNo; }
    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
    public int getPincode() { return pincode; }
    public void setPincode(int pincode) { this.pincode = pincode; }
}
Let’s change the definition of Person class
List 12
package com.serial;

import java.io.Serializable;

public class Person implements Serializable {
    private static final long serialVersionUID = 100L;

    String name;
    Address office;
    Address home;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Address getOffice() { return office; }
    public void setOffice(Address office) { this.office = office; }
    public Address getHome() { return home; }
    public void setHome(Address home) { this.home = home; }
}
Okay, the stage is set to serialize the new version of the Person class. Let's serialize.
List 13
package com.serial;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class DemoSerialization {
    public static void main(String[] args) throws IOException {
        Person per = new Person();
        per.setName("Vishal Jain");
        Address home = new Address(100, "Noida", 201300);
        Address office = new Address(100, "Delhi", 101300);
        per.setHome(home);
        per.setOffice(office);
        try (FileOutputStream fos = new FileOutputStream(new File("Person.ser"));
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(per); // write the Person, not the stream itself
        }
    }
}
After running the program, the application threw the exception shown below.
List 14
Exception in thread "main" java.io.NotSerializableException: com.serial.Address
    at java.io.ObjectOutputStream.writeObject0(Unknown Source)
    at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
    at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
    at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
    at java.io.ObjectOutputStream.writeObject0(Unknown Source)
    at java.io.ObjectOutputStream.writeObject(Unknown Source)
    at com.serial.DemoSerialization.main(DemoSerialization.java:16)
Oops, that was not expected, was it? As Java says, a class must implement the java.io.Serializable interface to be serialized, but the Address class does not implement it.
The first version of the Person class used String, which is also a Java class. Why did the application not throw an exception then? The answer is clear: the String class implements java.io.Serializable (along with Comparable<String> and CharSequence), so classes with String fields can be serialized.
Let’s implement the interface in the Address class and rerun the program.
List 15
package com.serial;

import java.io.Serializable;

public class Address implements Serializable {
    private static final long serialVersionUID = 1000L;

    int houseNo;
    String city;
    int pincode;

    public Address(int houseNo, String city, int pincode) {
        this.houseNo = houseNo;
        this.city = city;
        this.pincode = pincode;
    }

    public int getHouseNo() { return houseNo; }
    public void setHouseNo(int houseNo) { this.houseNo = houseNo; }
    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
    public int getPincode() { return pincode; }
    public void setPincode(int pincode) { this.pincode = pincode; }
}
After running the program, the Person.ser file is created with the serialized object.
Now let’s try to deserialize the content with the following code.
List 16
package com.serial;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public class DemoDeSerialization {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (FileInputStream fis = new FileInputStream(new File("Person.ser"));
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            Person per = (Person) ois.readObject();
            System.out.println("Name:" + per.getName()
                    + " Home City:" + per.getHome().getCity()
                    + " Office City:" + per.getOffice().getCity());
        }
    }
}
The result is as follows:
List 17
Name:Vishal Jain Home City:Noida Office City:Delhi
Impact of Inheritance on Serialization
When a superclass implements the Serializable interface, all classes that extend it are serializable as well.
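A quick way to convince yourself of this is the sketch below. Vehicle, Car and the InheritanceDemo helper are hypothetical names used only for the illustration; Car never mentions Serializable itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class InheritanceDemo {
    // Serializes any object to memory and reads a copy back.
    static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Car car = new Car();
        car.wheels = 4;
        car.model = "Hatchback";
        Car copy = roundTrip(car);
        System.out.println(copy.wheels + " " + copy.model); // 4 Hatchback
    }
}

// Hypothetical serializable superclass.
class Vehicle implements Serializable {
    private static final long serialVersionUID = 1L;
    int wheels;
}

// Inherits serializability from Vehicle without declaring it again.
class Car extends Vehicle {
    private static final long serialVersionUID = 1L;
    String model;
}
```

Both the inherited field and the subclass's own field are written and restored.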
Subclass is serializable but superclass is not
Let’s change the Person class so that it no longer implements the Serializable interface, and let’s define a constructor with arguments, so that the default (no-argument) constructor is no longer available.
List 18
public class Person {
    . . .
    public Person(String name, Address office, Address home) {
        super();
        this.name = name;
        this.office = office;
        this.home = home;
    }
}
Create a subclass Player that extends the Person class.
List 19
package com.serial;

import java.io.Serializable;

public class Player extends Person implements Serializable {
    private static final long serialVersionUID = -154L;

    String academy;

    public Player(String name, Address office, Address home, String aced) {
        super(name, office, home);
        academy = aced;
    }

    public String getAcademy() { return academy; }
    public void setAcademy(String academy) { this.academy = academy; }
}
Serialize the Player class as follows.
List 20
package com.serial;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class DemoSerialization {
    public static void main(String[] args) throws IOException {
        Address home = new Address(100, "Noida", 201300);
        Address office = new Address(100, "Delhi", 101300);
        // The constructor expects the office address before the home address.
        Player per = new Player("Vishal Jain", office, home, "SRT Aceds");
        try (FileOutputStream fos = new FileOutputStream(new File("Player.ser"));
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(per);
        }
    }
}
It creates a file called Player.ser, which contains the fields that are part of the Player class.
Let’s try to deserialize the object with the following code.
List 21
package com.serial;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public class DemoDeSerialization {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        try (FileInputStream fis = new FileInputStream(new File("Player.ser"));
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            Player per = (Player) ois.readObject();
            System.out.println("Academy:" + per.getAcademy());
        }
    }
}
On executing the application, the following exception is encountered.
List 22
Exception in thread "main" java.io.InvalidClassException: com.serial.Player; com.serial.Player; no valid constructor
    at java.io.ObjectStreamClass.checkDeserialize(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at com.serial.DemoDeSerialization.main(DemoDeSerialization.java:12)
Does that mean we are missing something? Yes, we are. Add a no-argument constructor to the Person class and the program works without any issues.
List 23
public class Person {
    . . .
    public Person() {
    }
}
It is clear that when a superclass is not serializable, its fields are not written to the stream. During deserialization, the runtime initializes the superclass part of the object by calling the superclass's no-argument constructor. If that constructor is not found (or is not accessible), an InvalidClassException is thrown; otherwise the superclass fields get whatever values that constructor assigns.
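The effect on the superclass fields can be observed directly. In this sketch, Animal, Dog and SuperclassDemo are hypothetical names; Animal is intentionally not serializable and its no-argument constructor assigns a recognisable default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperclassDemo {
    // Serializes the dog to memory and reads a copy back.
    static Dog roundTrip(Dog dog) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dog);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Dog) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog copy = roundTrip(new Dog("canine", "Rex"));
        System.out.println(copy.name);    // Rex: written by the serializable subclass
        System.out.println(copy.species); // unknown: set by Animal's no-arg constructor
    }
}

// Hypothetical superclass that is NOT serializable.
class Animal {
    String species;

    Animal() {
        species = "unknown"; // this constructor runs during deserialization
    }

    Animal(String species) {
        this.species = species;
    }
}

// Serializable subclass; only its own fields go into the stream.
class Dog extends Animal implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;

    Dog(String species, String name) {
        super(species);
        this.name = name;
    }
}
```

The value "canine" is lost because Animal's state was never written; the copy gets the value assigned by the no-argument constructor instead.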
Superclass is serializable but subclass should not be serializable
Sometimes the superclass is serializable, but the application does not want instances of a subclass to be stored.
In this case the subclass must implement private readObject and writeObject methods that throw java.io.NotSerializableException.
List 24
public class Player extends Person {
    . . .
    private void writeObject(ObjectOutputStream os) throws IOException {
        throw new NotSerializableException();
    }

    private void readObject(ObjectInputStream is) throws IOException, ClassNotFoundException {
        throw new NotSerializableException();
    }
}
On running the serialization program with this Player class, the application throws the following exception.
List 25
Exception in thread "main" java.io.NotSerializableException
    at com.serial.Player.writeObject(Player.java:23)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
    at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
    at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
    at java.io.ObjectOutputStream.writeObject0(Unknown Source)
    at java.io.ObjectOutputStream.writeObject(Unknown Source)
    at com.serial.DemoSerialization.main(DemoSerialization.java:13)
Serialize when source code is not available
There are cases when a third-party class or subclass must be serialized but the vendor has not provided the source code. What options does the coder have then?
Don’t Serialize the class
In case of composition, you can mark the field transient, which tells the serialization mechanism to skip that field.
List 26
transient Address office;
Okay, but the application cannot live without the object, and it must be serialized
In this case, keep the field (here, home) transient and implement readObject and writeObject according to the requirement.
List 27
private void writeObject(ObjectOutputStream oos) throws IOException {
    oos.defaultWriteObject();
    // Address is not serializable, so write the value we need by hand.
    // Only the house number is saved in this example.
    oos.writeInt(home.getHouseNo());
}

private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
    ois.defaultReadObject();
    // Read the values back in exactly the order they were written.
    home = new Address(ois.readInt(), "", 0);
}
Read the values in the same order in which they were written; otherwise the deserialized object will differ from the one that was serialized.
Enum constants are serialized differently than
ordinary serializable or externalizable objects. The serialized form of an enum
constant consists solely of its name; field values of the constant are not
present in the form. To serialize an enum constant, ObjectOutputStream writes
the value returned by the enum constant's name method. To deserialize
an enum constant, ObjectInputStream reads the constant name from the
stream; the deserialized constant is then obtained by calling the java.lang.Enum.valueOf method,
passing the constant's enum type along with the received constant name as
arguments. Like other serializable or externalizable objects, enum constants
can function as the targets of back references appearing subsequently in the
serialization stream.
The process by which enum constants are
serialized cannot be customized: any class-specific writeObject, readObject, readObjectNoData, writeReplace,
and readResolve methods defined by enum types are ignored during
serialization and deserialization. Similarly, any serialPersistentFields or serialVersionUID field
declarations are also ignored--all enum types have a fixed serialVersionUID of 0L.
Documenting serializable fields and data for enum types is unnecessary, since
there is no variation in the type of data sent.
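Because an enum constant is restored through its name, deserialization yields the very same constant instance rather than a copy. A minimal sketch (Planet and EnumDemo are hypothetical names chosen for the example):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class EnumDemo {
    // Serializes any object to memory and reads it back.
    static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Planet copy = roundTrip(Planet.MARS);
        // Identity comparison holds: the stream carried only the name "MARS".
        System.out.println(copy == Planet.MARS); // true
    }
}

// Hypothetical enum; every enum is serializable without extra work.
enum Planet { EARTH, MARS }
```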
If Java provides a default and robust implementation of serialization, why does it require developers to implement a marker interface? Java could have made serialization the default behaviour of every class.
The best explanation for this decision is the following:
Serialization is fraught with pitfalls. Automatic
serialization support of this form makes the class internals part of the public
API (which is why javadoc gives you the persisted
forms of classes).
For long-term persistence, the class must be able
to decode this form, which restricts the changes you can make to class design.
This breaks encapsulation.
Serialization can also lead to security problems.
By being able to serialize any object it has a reference to, a class can access
data it would not normally be able to (by parsing the resultant byte data).
There are other issues, such as the serialized
form of inner classes not being well defined.
Making all classes serializable would exacerbate
these problems. Check out Effective Java Second Edition,
in particular Item 74: Implement Serializable judiciously.
Another reason for not providing default serialization, although it has its counter-argument, could be the following:
Not everything is genuinely serializable. Take a
network socket connection, for example. You could serialize the data/state of
your socket object, but the essence of an active connection would be lost.
Default serialization is somewhat verbose, and
assumes the widest possible usage scenario of the serialized object, and
accordingly the default format (Serializable) annotates the resultant stream
with information about the class of the serialized object.
Externalization gives the producer of the object
stream complete control over the precise class meta-data (if any) beyond the
minimal required identification of the class (e.g. its name). This is clearly
desirable in certain situations, such as closed environments, where producer of
the object stream and its consumer (which reifies the object from the stream)
are matched, and additional metadata about the class serves no purpose and
degrades performance.
Additionally (as Uri points out) externalization
also provides for complete control over the encoding of the data in the stream
corresponding to Java types. For (a contrived) example, you may wish to record
boolean true as 'Y' and false as 'N'. Externalization allows you to do that.
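The contrived 'Y'/'N' encoding could be sketched with java.io.Externalizable as follows. Flag and ExternalizableDemo are hypothetical names; note that an Externalizable class needs a public no-argument constructor, because the runtime creates the instance first and then calls readExternal on it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    // Serializes the flag to memory and reads a copy back.
    static Flag roundTrip(Flag flag) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(flag);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Flag) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Flag(true)).isEnabled());  // true
        System.out.println(roundTrip(new Flag(false)).isEnabled()); // false
    }
}

// Hypothetical class that controls its own external format.
class Flag implements Externalizable {
    private static final long serialVersionUID = 1L;

    private boolean enabled;

    public Flag() {
        // Required: called by the runtime before readExternal.
    }

    Flag(boolean enabled) {
        this.enabled = enabled;
    }

    boolean isEnabled() {
        return enabled;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeChar(enabled ? 'Y' : 'N'); // custom encoding of the boolean
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        enabled = in.readChar() == 'Y';     // decode in the same order as written
    }
}
```

With Serializable the boolean would be written in the default format; here the class alone decides what goes into the stream.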
There are a few more topics to cover on serialization, which we can take up later.
A few of them are:
- Content of Serialized Objects.
- Performance of Serialized Objects.