Tech-articles: 2011

Wednesday, December 28, 2011

Linux process memory usage calculations

It is difficult to obtain the actual memory consumption of a process in a Linux/Unix environment. We cannot always pinpoint the exact memory usage of a process, because most processes on Linux use shared libraries.

Assume that we want to calculate memory usage for the 'ls' process. Do we count only the memory used by the 'ls' executable? What about libc, or all the other libraries that are required to run 'ls'? One could argue that they are shared with other processes, but 'ls' cannot run on the system without them being loaded. Also, if we need to know how much memory a process needs in order to do capacity planning, we need to calculate how much each additional copy of the process uses.

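A workable approximation of the memory consumption of a process is provided by its VSS (Virtual Set Size, shown as VSZ by 'ps'). Note that VSS counts all address space the process has mapped, including shared libraries, so it tends to overstate rather than understate actual usage.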

We can obtain the VSS value from the VSZ column of the 'ps aux' command. This is handy when we just need to view the process memory consumption once in a while.

However, the 'ps' command is not suitable when designing an application that reports system health. Imagine a health-check application that needs to report the memory consumption of some resource-intensive process running on the box. If the health-check app itself consumes 4-5% of memory because it needs to run, say, every 10 seconds, it adds unnecessary overhead to a Linux/Unix host that is already resource-starved. The 'ps' command is itself taxing on memory. The same holds true for other commands such as top and vmstat.

So to increase the efficiency of the program reporting the system health, we should avoid the use of 'ps' command. So what alternatives are available to obtain the VSS of a process?

VSS information can be obtained from the /proc filesystem. It is the 23rd field of /proc/$pid/stat (vsize, in bytes), which is the same place the 'ps' command obtains its value from. It is also available as the VmSize value in /proc/$pid/status (in kB). /proc/$pid/smaps provides the same information as a detailed per-mapping breakdown, but parsing it is not worth the effort in most scenarios, as VSS is usually a good enough approximation of the memory being used by the process.

To calculate the percentage of memory consumed by a process, we can obtain the system-wide memory figures from the file /proc/meminfo. The MemTotal field gives the total usable RAM of the system.

The most efficient and easy way is to read the /proc filesystem directly from a C/C++ program. C is preferable for portability, as every Unix/Linux system supports it.
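The /proc reads described above can be sketched as follows. The sketch is written in Java only to match the other listings on this page; the same two file reads translate directly to C. The class name ProcMemory and its helper methods are made up for illustration, and the code assumes the usual "Key:   value kB" line layout of /proc/$pid/status and /proc/meminfo.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: reads VmSize and MemTotal straight from /proc
// instead of spawning 'ps'.
final class ProcMemory {

    /** Returns the kB value of the line starting with "key:", or -1 if absent. */
    static long readKb(List<String> lines, String key) {
        String prefix = key + ":";
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                // Typical line: "VmSize:     10244 kB"
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    /** VSS as a percentage of total RAM; can exceed 100 because VSS is virtual. */
    static double percentOfTotal(long vmSizeKb, long memTotalKb) {
        return 100.0 * vmSizeKb / memTotalKb;
    }

    public static void main(String[] args) throws IOException {
        // Default to the current process when no pid is given.
        String pid = args.length > 0 ? args[0] : "self";
        long vmSize = readKb(Files.readAllLines(Paths.get("/proc", pid, "status")), "VmSize");
        long memTotal = readKb(Files.readAllLines(Paths.get("/proc/meminfo")), "MemTotal");
        System.out.printf("VmSize=%d kB, MemTotal=%d kB, %.2f%%%n",
                vmSize, memTotal, percentOfTotal(vmSize, memTotal));
    }
}
```

Run on a Linux host, passing a pid as the first argument reports on that process; with no argument it reports on itself.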

Thursday, December 1, 2011

Singletons in Java - An Analysis

The "classic" practical example...

import java.io.PrintStream;

public class LogManager {
 private static LogManager instance;
 private PrintStream pStream;

 private LogManager(PrintStream out) {
  pStream = out;
 }

 public static LogManager getInstance() {
  if (instance == null)
   instance = new LogManager(System.out);
  return instance;
 }

 public void log(String msg) {
  pStream.println(msg);
 }
}

Usage: LogManager.getInstance().log( "some message" );

Problem: In this naive implementation, a thread can be preempted after the test for null but before the instance is created. A second thread then creates the instance, and when the first thread resumes (having already tested for null) it calls the constructor again and creates another instance.

Solution: Synchronize the getInstance() method

public static synchronized LogManager getInstance() {
  if (instance == null)
   instance = new LogManager(System.out);
  return instance;
 }

Problem : Performance overhead, because the synchronized keyword in the method signature makes every call acquire the lock, even after the instance has been created.

Solution : Just synchronize the instance creation rather than the entire method. Double-Checked Locking is widely cited and used as an efficient method for implementing lazy initialization in a multithreaded environment.

public static LogManager getInstance() {
  if (instance == null) {
   synchronized (LogManager.class) {
    if (instance == null)
     instance = new LogManager(System.out);
   }
  }
  return instance;
 }

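Problem : As written, it will not work reliably in a platform-independent way in Java. With a plain (non-volatile) instance field, the code can break in the presence of optimizing compilers or shared-memory multiprocessors, because another thread may see a non-null reference to a LogManager whose constructor has not visibly finished. Lots of very smart people have spent lots of time looking at this. Under the original Java memory model there was no way to make it work without requiring each thread that accesses the LogManager object to perform synchronization; since Java 5 (JSR-133), declaring the instance field volatile repairs the idiom.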

Solution : Go back to the synchronized getInstance() method and leave it as is. The cost of simply making the method synchronized is not too high.
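For completeness: since Java 5, double-checked locking is generally considered safe when the instance field is declared volatile. Below is a minimal sketch of that variant, assuming a Java 5 or later runtime. The class name VolatileLogManager is made up here so that it does not clash with the listings above.

```java
import java.io.PrintStream;

// Hypothetical variant of LogManager using double-checked locking
// with a volatile field (requires the Java 5+ memory model).
final class VolatileLogManager {
    // volatile guarantees that a thread seeing a non-null reference also
    // sees the fully constructed object behind it.
    private static volatile VolatileLogManager instance;
    private final PrintStream pStream;

    private VolatileLogManager(PrintStream out) {
        pStream = out;
    }

    static VolatileLogManager getInstance() {
        VolatileLogManager local = instance; // one volatile read on the fast path
        if (local == null) {
            synchronized (VolatileLogManager.class) {
                local = instance; // re-check under the lock
                if (local == null) {
                    local = new VolatileLogManager(System.out);
                    instance = local;
                }
            }
        }
        return local;
    }

    void log(String msg) {
        pStream.println(msg);
    }
}
```

Usage: VolatileLogManager.getInstance().log( "some message" );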

An Elegant Solution

Solution 1 : Nice way

import java.io.PrintStream;

public class LogManager {
 public static final LogManager instance = new LogManager(System.out);
 private PrintStream pStream;

 private LogManager(PrintStream out) {
  // To guard against Reflection creating the instance
  if (instance != null) {
   throw new IllegalStateException("instance already exists");
  }
  pStream = out;
 }

 public void log(String msg) {
  pStream.println(msg);
 }
}

Usage: LogManager.instance.log( "some message" );
Positives : Much simpler, with no performance overhead compared to the methods above, as it doesn't use the synchronized keyword.

Solution 2 : Better way

import java.io.PrintStream;

public class LogManager {
 private static final LogManager instance = new LogManager(System.out);
 private PrintStream pStream;

 private LogManager(PrintStream out) {
  // To guard against Reflection creating the instance
  if (instance != null) {
   throw new IllegalStateException("instance already exists");
  }
  pStream = out;
 }

 public static LogManager getInstance() {
  return instance;
 }

 public void log(String msg) {
  pStream.println(msg);
 }
}

Usage: LogManager.getInstance().log( "some message" );
Positives : In addition to the advantages of Solution 1, it gives you the flexibility to change your mind about whether the class should be a singleton without changing its API. The factory method returns the sole instance but could easily be modified to return, say, a unique instance for each thread that invokes it.

Serialization caveat : To make a singleton class that is implemented using either of the previous approaches serializable, it is not sufficient merely to add "implements Serializable" to its declaration. To maintain the singleton guarantee, you have to declare all instance fields transient and provide a readResolve method. Otherwise, each time a serialized instance is deserialized, a new instance will be created. The readResolve method to add to the singleton class looks like this:

// readResolve method to preserve singleton property
 private Object readResolve() {
  // Return the one true LogManager and let the garbage collector
  // take care of the LogManager impersonator.
  return instance;
 }
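To see the effect of readResolve, here is a small self-contained sketch that serializes a singleton to a byte array and reads it back. The class name SerializableLogManager and the roundTrip helper are made up for this illustration. With readResolve in place the deserialized object is the original instance; removing the method would yield a second instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.PrintStream;
import java.io.Serializable;

// Hypothetical serializable variant of the singleton shown above.
final class SerializableLogManager implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final SerializableLogManager INSTANCE =
            new SerializableLogManager(System.out);

    // PrintStream is not Serializable, so this field has to be transient.
    private final transient PrintStream pStream;

    private SerializableLogManager(PrintStream out) {
        pStream = out;
    }

    static SerializableLogManager getInstance() {
        return INSTANCE;
    }

    // Replaces the freshly deserialized copy with the one true instance.
    private Object readResolve() {
        return INSTANCE;
    }

    void log(String msg) {
        pStream.println(msg);
    }

    /** Serializes obj into memory and deserializes it again. */
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Usage: SerializableLogManager.roundTrip(SerializableLogManager.getInstance()) returns the same object as SerializableLogManager.getInstance().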

Solution 3 : Best way

public enum LogManager {
 INSTANCE;
 private java.io.PrintStream pStream = System.out;

 public void log(String msg) {
  pStream.println(msg);
 }
}

Usage: LogManager.INSTANCE.log( "some message" );
This approach is functionally equivalent to the public field approach, except that it is more concise, provides the serialization machinery for free, and provides an ironclad guarantee against multiple instantiation, even in the face of sophisticated serialization or reflection attacks. While this approach has yet to be widely adopted, a single-element enum type is the best way to implement a singleton.
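The two guarantees claimed for the enum approach can be checked with a short sketch. The names EnumLogManager and EnumSingletonCheck are made up for this illustration. The first check round-trips the enum constant through Java serialization, the second tries to call the enum's constructor reflectively, which the JDK refuses with an IllegalArgumentException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.PrintStream;
import java.lang.reflect.Constructor;

// Hypothetical copy of the enum singleton, renamed to avoid a clash.
enum EnumLogManager {
    INSTANCE;

    private final PrintStream pStream = System.out;

    void log(String msg) {
        pStream.println(msg);
    }
}

final class EnumSingletonCheck {

    /** True if deserializing INSTANCE yields the very same object. */
    static boolean survivesSerialization() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(EnumLogManager.INSTANCE);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject() == EnumLogManager.INSTANCE;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if reflection is refused when trying to build a second instance. */
    static boolean reflectionBlocked() {
        try {
            Constructor<?> ctor = EnumLogManager.class.getDeclaredConstructors()[0];
            ctor.setAccessible(true);
            ctor.newInstance("IMPOSTOR", 1); // name and ordinal, as enums expect
            return false;
        } catch (IllegalArgumentException e) {
            return true; // the JDK rejects reflective enum construction
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }
}
```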

References:
http://c2.com/cgi/wiki?JavaSingleton
Effective Java (book) by Joshua Bloch

Tuesday, November 29, 2011

Java clone method - why to avoid

No programming language in this world is perfect, and Java is no exception. Quite a few bad design decisions were made early on and, for backward-compatibility reasons, continue to exist even today.

One such bad design decision relates to the clone method. At first glance, it might seem like a good idea to use a method named clone to copy an object. But as we will see below, this is not really the case with Java's Object.clone method.

Java has an interface called Cloneable. In principle, one should implement this interface to make an object cloneable. The problem is that this interface doesn't define any methods. Instead, the clone method is defined in the Object class, and it throws CloneNotSupportedException unless the class implements Cloneable. The mess this creates is that implementing an interface changes the behavior of a method defined 'elsewhere'.

Moreover, Object.clone is a protected method, so one must override it with a public method for it to be accessible; otherwise it has to be called reflectively, which gets too complicated.

This is a bad design mainly because an interface should force you to implement some behavior in your class. Without implementing the proper method, your code shouldn't even compile. But all the Cloneable interface does is say (in its javadoc documentation) that you should override Object.clone. Also, for the cloning to happen correctly, you would need your 'whole' class hierarchy to 'override' clone. This means all superclasses and all mutable objects referenced from all those classes. What about a third-party class that doesn't do so?
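The point about mutable objects can be illustrated with a small sketch. The class names CloneDemo and Scores are made up for this example. Scores overrides clone in the usual way by delegating to super.clone(), and the copy still shares its array with the original, because Object.clone copies field values (here, the array reference) and not the objects they point to.

```java
// Hypothetical demo of the shallow copy performed by Object.clone.
final class CloneDemo {

    static final class Scores implements Cloneable {
        int[] values = {1, 2, 3};

        @Override
        public Scores clone() {
            try {
                // Copies the 'values' reference, not the array it refers to.
                return (Scores) super.clone();
            } catch (CloneNotSupportedException e) {
                // Cannot happen here because Scores implements Cloneable.
                throw new AssertionError(e);
            }
        }
    }

    /** True if modifying the clone's array is visible through the original. */
    static boolean sharesArrayAfterClone() {
        Scores original = new Scores();
        Scores copy = original.clone();
        copy.values[0] = 99;
        return original.values[0] == 99;
    }
}
```

A correct clone for Scores would also have to copy the array itself, and the same applies to every mutable field in every class of the hierarchy.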

The simple way out is to consider the following two options instead of using the clone method.
1. Use a copy constructor:
public MyClass(MyClass myClass) {
 // initialize your fields here
}
OR
2. Create a utility method for copying the object:
public static MyClass newInstance(MyClass myClass) {
 // create your object and return it here
}
The 'Effective Java' book has dealt with the topic in a more detailed way. Please refer to it for further study.
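The two options above can be sketched on a concrete class. Playlist and its fields are made up for this illustration. The copy constructor makes its own copy of the mutable list, so changes to the copy do not leak into the original, and the static factory simply delegates to it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class showing both alternatives to clone.
final class Playlist {
    private final String name;
    private final List<String> tracks;

    Playlist(String name, List<String> tracks) {
        this.name = name;
        this.tracks = new ArrayList<>(tracks); // defensive copy of the mutable list
    }

    // Option 1: copy constructor.
    Playlist(Playlist other) {
        this(other.name, other.tracks);
    }

    // Option 2: static utility (factory) method.
    static Playlist newInstance(Playlist other) {
        return new Playlist(other);
    }

    void addTrack(String track) {
        tracks.add(track);
    }

    String name() {
        return name;
    }

    int size() {
        return tracks.size();
    }
}
```

Usage: Playlist copy = new Playlist(original); or Playlist copy = Playlist.newInstance(original);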

Wednesday, April 6, 2011

All about MLet

One of Java's great strengths as a software platform is its ability to dynamically load new classes. Applets and Servlets have been an integral part of Java and played a major role in Java's success story. Their main capability is to load secure and platform-independent code over the network on the client side (applets) and the server side (servlets). JMX takes advantage of this capability in the management domain, and the mechanism is known as M-Lets.

The M-Let (short for management applet) service is a JMX agent service that allows you to load MBeans from anywhere on the network, including a local machine. The M-Let service is itself an MBean and can be managed as such. Information about MBeans to be loaded is contained in a text file called an M-Let file. This file has an XML-like syntax, but the syntax does not constitute well-formed XML. Using special tags called M-Let tags, we can encode enough information in the M-Let file that the M-Let service can locate, download the bytecode for, and instantiate MBeans.
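As a rough illustration, an M-Let file entry might look like the fragment below. It follows the tag layout described in the javax.management.loading.MLet documentation (CODE, ARCHIVE, CODEBASE and NAME attributes, plus optional ARG tags for constructor arguments). The class name, JAR name, URL and object name are all hypothetical, and the ARG line assumes the MBean has a constructor taking a single String.

```text
<MLET CODE="com.example.Hello"
      ARCHIVE="hello-mbeans.jar"
      CODEBASE="http://localhost:8080/mbeans/"
      NAME="example:type=Hello">
  <ARG TYPE="java.lang.String" VALUE="greeting">
</MLET>
```

Note that the ARG tag is never closed, which is one reason the file is XML-like rather than well-formed XML.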

We can also use the M-Let service, in conjunction with the MBean server, to load MBeans without the use of an M-Let file. We simply add the file's URL to the M-Let service's list of URLs that it will search when attempting to load MBeans, then call a method on the MBean server and pass the M-Let service's object name (which we created when we registered the M-Let service MBean with the MBean server) as the MBean class loader.

Architecture

Following diagram provides a brief overview of JMX architecture.

This is how MLet fits into the overall picture.

In progress.................