Sunday, April 29, 2007

Lazy Evaluation and Caching...

... are among the most widely used techniques for improving your software's performance.

Then what are they and how can I use them?

Lazy evaluation is just like a lazy employee: he won't do the job unless someone directly requests it from him. In fact, he'll postpone it forever until the request is made official.

Caching, on the other hand, is simply storing frequently used data in a fast in-memory structure or some other quick-access store for fast retrieval. Although the same data already lives in a database or a file, the programmer chooses to cache it in a readily accessible form. That store could be an array, a Berkeley DB file, a temp file or any other suitable medium.

Google - a well-known search engine - caches search results for the most frequently used queries. FileNET (a really good CMS server) renders all content objects into a single HTML page ready for access (instead of querying them from a database every time). HTTP itself allows proxies between you and the server to cache requested pages, and defines headers to control how that is done. Even your browser caches some HTML pages from time to time.

That's why you sometimes need to press Ctrl+F5 to force the browser to request the page from the server again.

As a programming example, let's say you have a small class named UserAccountsManager, and this class contains a single method:

public static String getUsername(int UserId);

And a single private field, which is simply a HashMap mapping a UserID to the Username, used by getUsername(int UserId).

The method is implemented simply to execute a query on the database:

"SELECT USERNAME FROM USERS WHERE ID=" + UserId

Here is a simple way to see how caching and lazy evaluation can be applied.

The program can implement caching using a private map from the user ID to the username string. This map is filled only once from the database, when the program first loads. Any call to getUsername then simply reads the map instead of accessing the database. This is what caching is all about.

The tricky part when implementing caching is knowing when to regenerate the cache or, more accurately, how to tell whether the original data store has changed. If it has, the cache needs to be regenerated, or else your program is using out-of-date data.
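
Here is a minimal sketch of what the cached version might look like. It is only an illustration: FakeUserDatabase, loadAllUsers(), queryCount, EagerUserAccountsManager and refresh() are made-up names, and the "database" is just an in-memory stand-in so the example runs without a real connection. It is also not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the USERS table; a real program would query the database here.
class FakeUserDatabase {
    static int queryCount = 0; // counts simulated database round trips

    static Map<Integer, String> loadAllUsers() {
        queryCount++;
        Map<Integer, String> rows = new HashMap<>();
        rows.put(1, "alice");
        rows.put(2, "bob");
        return rows;
    }
}

class EagerUserAccountsManager {
    // The cache: filled once, when this class is first loaded.
    private static Map<Integer, String> usernames = FakeUserDatabase.loadAllUsers();

    static String getUsername(int UserId) {
        // Reads the map only; no database access happens here.
        return usernames.get(UserId);
    }

    // Assumed hook for regenerating the cache once the data store is known to have changed.
    static void refresh() {
        usernames = FakeUserDatabase.loadAllUsers();
    }
}
```

However many times getUsername is called, the fake database is hit only once, plus once more for every refresh(). Deciding when to call refresh() is exactly the tricky part, and this sketch does not try to solve it.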

Now, lazy evaluation simply changes the implementation so that the map is NOT FILLED AT ALL until someone actually calls the getUsername() method.

The advantage here is that you don't use memory or connect to the database until someone actually calls the function. (P.S. It might never be called at all.)
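
A lazy variant could look roughly like the sketch below. Again, this is an assumption-laden illustration, not the one true way: SlowUserStore, queryUsername(), queryCount and LazyUserAccountsManager are hypothetical names, the store fakes the per-user SELECT query in memory, and the code is not thread-safe. This version goes one step further and fetches each username only when that particular ID is first requested.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for running the per-user SELECT query against the database.
class SlowUserStore {
    static int queryCount = 0; // counts simulated database round trips

    static String queryUsername(int UserId) {
        queryCount++;
        switch (UserId) {
            case 1: return "alice";
            case 2: return "bob";
            default: return null; // unknown user
        }
    }
}

class LazyUserAccountsManager {
    // Starts empty: nothing is loaded until getUsername() is actually called.
    private static final Map<Integer, String> usernames = new HashMap<>();

    static String getUsername(int UserId) {
        String name = usernames.get(UserId);
        if (name == null) {
            // First request for this ID: ask the (fake) database, then remember the answer.
            name = SlowUserStore.queryUsername(UserId);
            if (name != null) {
                usernames.put(UserId, name);
            }
        }
        return name;
    }
}
```

If nobody ever calls getUsername, the store is never touched. Note that unknown IDs are not cached here, so they hit the store on every call.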

After all this explanation, there is one very annoying piece of advice that you have to stick to:

"Never implement optimization techniques such as caching or lazy evaluation until a bottleneck has been identified in the system through benchmarking and testing that show the need. Put simply: avoid premature optimization as much as you can."