Wednesday, 21 August 2013

Hadoop - Managing Big Data and Bringing Power to Enterprise Solutions






What Is Hadoop?

Apache Hadoop is a free, open source framework for reliable, scalable, distributed computing and data storage. It enables applications to work with thousands of nodes and petabytes of data, which makes it a great tool for research and business operations. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.

It is increasingly common to have data sets that are too large to be handled by a traditional database, or by any technique that runs on a single computer or even a small cluster of computers. In the age of Big Data, Hadoop has become the library of choice for handling them.

Why Is Hadoop Needed?

Companies continue to generate large amounts of data. Here are some statistics from 2011:
     -- Facebook : 6 billion messages per day.
     -- EBay : 9 petabytes of storage per day.
Existing tools were not designed to handle data at this scale.

How Does Hadoop Store Data?

Hadoop makes it easy to store data, and processed information occupies very little space, so data from any legacy or large system can be kept for a long time at minimal cost. The way Hadoop is built is interesting: each part, from file storage to processing and scalability, was designed for large-scale work. One of its most important components is its file storage system, the Hadoop Distributed File System (HDFS). High-capacity storage systems usually bring to mind custom hardware that is extremely costly to buy and maintain. HDFS does not require special hardware: it runs on ordinary machines and can be used with standard home and office computers.



Who Uses Hadoop?



Monday, 20 May 2013

Server Side Validation - Ajax


Let's take a scenario where server-side validation is required to check whether an employee is already assigned to a project.

Here I'll explain how to implement this requirement using Ajax in a Struts 2 application, but the concept is the same for any other framework.
Step 1: Using jQuery, invoke the Ajax call and, based on the return value, display the appropriate message to the user.


$(document).ready(function () {
    $("#associateID").blur(function () {
        var assId = $("#associateID");

        // checkAssociateId returns true if the employee is already available in the DB
        if (checkAssociateId(assId.val())) {
            // Display message and set focus on the Associate ID field
            $('#spnassociateID').html('Associate is already On-boarded');
            $('#spnassociateID').css('color', 'red');
            $('#spnassociateID').css('font-size', 'small');
            $("#associateID").focus();
            return false;
        } else {
            $('#spnassociateID').html('ID is accepted and not available in project data base');
            $('#spnassociateID').css('color', 'green');
            $('#spnassociateID').css('font-size', 'small');
            return true;
        }
    });
});

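function checkAssociateId(id) {
    // checkAssociateAjaxCall returns "not-available" if the associate id is not present in employee_details
    var result = checkAssociateAjaxCall(id);
    // Keep each return value on the same line as "return"; a line break after "return" makes JavaScript return undefined
    if (result.indexOf("not-available") >= 0) {
        return false;
    } else {
        return true;
    }
}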

// checkAssociateAjaxCall performs the actual Ajax call
function checkAssociateAjaxCall(id) {
    var result;
    var xmlhttp;
    // A random parameter makes each request unique; otherwise a cached response may be used without hitting the DB
    var strURL = "validateAssociateId?associateId=" + encodeURIComponent(id) + "&random=" + Math.random();
    if (window.XMLHttpRequest) {
        // code for IE7+, Firefox, Chrome, Opera, Safari
        xmlhttp = new XMLHttpRequest();
    } else {
        // code for IE6, IE5
        xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    xmlhttp.onreadystatechange = function () {
        if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
            result = xmlhttp.responseText;
        }
    };
    // The third argument (false) makes the request synchronous, so result is set before this function returns
    xmlhttp.open("GET", strURL, false);
    xmlhttp.send();
    return result;
}

Step 2: The code above calls the "validateAssociateId" action, so add the following to the Struts configuration file (struts.xml).

<action name="validateAssociateId"
        class="com.assetman.action.OnboardEmployeeAction" method="validateAssociateId">
    <result type="stream">
        <param name="contentType">text/html</param>
        <param name="inputName">inputStream</param>
    </result>
</action>
Step 3: In the OnboardEmployeeAction Struts action class, define the following:

private static String BLANK = "";
// Declare an inputStream field and its getter; the "stream" result reads the response body from it
private InputStream inputStream;

public InputStream getInputStream() {
    return inputStream;
}
Step 4: In the OnboardEmployeeAction Struts action class, define the validateAssociateId method:

// Requires: import java.io.ByteArrayInputStream;
// ByteArrayInputStream replaces the deprecated StringBufferInputStream.
public String validateAssociateId() throws Exception {
    String result = BLANK;
    // ValidateTask.validateAssociateId returns either "success" or "error".
    // Please implement the back-end code as per your requirement.
    // associateId is assumed to be an action field populated from the request parameter.
    ValidateTask validateTask = new ValidateTask();
    result = validateTask.validateAssociateId(associateId);
    if (result.equals(SUCCESS)) {
        inputStream = new ByteArrayInputStream("available".getBytes("UTF-8"));
    } else {
        inputStream = new ByteArrayInputStream("not-available".getBytes("UTF-8"));
    }
    return SUCCESS;
}

Friday, 1 February 2013

Java wait, sleep, yield and notify

Learn about wait, sleep and yield in Java:




Java wait and notify Example:
  • wait and notify are both methods of the Object class.
  • wait makes the calling thread wait, either for the given time or until it is notified.
  • Once wait is invoked, the object's lock is released and another thread can work on the object.
  • Both methods must be called from a synchronized context.
Code Example:

public class WaitForNotification {
  public static void main(String[] args) throws InterruptedException {
     Passenger psg = new Passenger();
     // In a synchronized block, call wait on the newly created Passenger object
     synchronized (psg) {
       // Start the thread while holding the lock, so it cannot call notify
       // before this thread has reached wait()
       psg.start();
       System.out.println(" Passenger is waiting for the notification.");
       psg.wait();
       System.out.println(" Passenger got notified.");
     }
   }
 }

class Passenger extends Thread {
  public void run() {
    synchronized (this) {
       System.out.println(" Wait...........");
       // Add the code that runs before the notification here
       System.out.println("Passenger is given notification call");
       notify();
    }
  }
}
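
A common variation on the example above guards the wait with a condition flag, so the waiting thread still behaves correctly if the notification arrives first or if it wakes up spuriously. The following is a minimal sketch of that idea; the BusStop class and its method names are hypothetical and are not part of the original example.

```java
// Hypothetical names throughout; a sketch of a guarded wait, not the post's original code.
class BusStop {
    private boolean busArrived = false; // the condition being waited on

    // Called by the waiting thread; blocks until announceBus() has run.
    synchronized void awaitBus() throws InterruptedException {
        while (!busArrived) {   // the loop guards against spurious wakeups
            wait();             // releases the monitor while waiting
        }
    }

    // Called by the notifying thread.
    synchronized void announceBus() {
        busArrived = true;
        notifyAll();            // wake every thread waiting on this monitor
    }

    synchronized boolean hasBusArrived() {
        return busArrived;
    }
}

class BusStopDemo {
    public static void main(String[] args) throws InterruptedException {
        BusStop stop = new BusStop();
        Thread announcer = new Thread(stop::announceBus);
        announcer.start();
        stop.awaitBus();        // returns even if the announcer ran first
        announcer.join();
        System.out.println("Passenger got notified.");
    }
}
```

Because the flag is checked before waiting, the order in which the two threads run does not matter: if announceBus() has already run, awaitBus() returns immediately without calling wait().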

Thursday, 31 January 2013

Application Performance

It is impressive when a customer sees that the application stays fast even while thousands of users are browsing it. Performance is important, and it can be achieved by focusing on the topics below. That said, other factors may also matter, depending on the application architecture.

· Adhere to best coding practices
· Improve performance of database operations
· J2EE application server performance tuning

 Adhere to best coding practices:

1. Use proven frameworks such as Struts, Spring, Hibernate and JSF.

2. Apply appropriate, proven design patterns to improve performance and minimize network communication cost (for example, the session facade and value object patterns).

3. Handle and propagate exceptions properly. Managing and logging exception details helps in analyzing production issues. Avoid System.out for this; use Log4J for logging instead.

4. Implement caching wherever applicable. It is an excellent technique for avoiding repeated, expensive database calls. For example:
 Suppose an application uses a thousand organizations that are stored in the database. Displaying organization data (such as name, address and phone) or generating a report requires multiple database interactions to fetch the organization details. The situation gets worse when hundreds of users access the same organization, because the application hits the database again for each of them. In this scenario caching is very cost-effective.


5. Never hard-code values. Write once and reuse everywhere, using Java's object-oriented techniques. Class, method and variable names should be self-explanatory. Avoid creating unnecessary objects. Indent code properly to keep it readable. A method should not exceed 100 lines.

 6. Session persistence is costly; do not use it unless it is really required. Store as few objects in the session as possible, and remove them as soon as they are no longer needed; otherwise they put unnecessary load on the server.

 7. If you are using EJB, follow EJB best practices. An EJB call is expensive, so check whether you can consolidate several EJB methods into a single coarse-grained EJB method.
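
The caching idea from point 4 can be sketched as a small in-memory map placed in front of the database lookup. This is only an illustration: OrganizationCache, its loader function and the "Org-" names are hypothetical stand-ins for a real data-access layer, and a production cache would also need an expiry or invalidation policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache in front of an expensive organization lookup.
class OrganizationCache {
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();
    private final Function<Integer, String> loader; // stands in for the database query

    OrganizationCache(Function<Integer, String> loader) {
        this.loader = loader;
    }

    // Returns the cached name, calling the loader only on a cache miss.
    String getOrganizationName(int orgId) {
        return cache.computeIfAbsent(orgId, loader);
    }

    // Drops an entry, for example after the organization is updated.
    void evict(int orgId) {
        cache.remove(orgId);
    }
}

class OrganizationCacheDemo {
    public static void main(String[] args) {
        AtomicInteger dbCalls = new AtomicInteger();
        OrganizationCache cache = new OrganizationCache(id -> {
            dbCalls.incrementAndGet();   // count simulated database hits
            return "Org-" + id;
        });
        cache.getOrganizationName(42);
        cache.getOrganizationName(42);   // served from the cache
        System.out.println("Database calls: " + dbCalls.get());
    }
}
```

In the demo the second lookup for the same organization never reaches the loader, which is the saving described above when many users request the same record.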


Improve performance of database operation:


1. Release database connections as soon as they are no longer needed; otherwise resources may leak. Always put resource clean-up code (for database connections, statements, etc.) in a finally{} block.

2. Use JDBC prepared statements for repeated reads, and batching for repetitive inserts and updates.

3. Use parameterized HQL queries whenever possible rather than appending parameter values to the HQL string. The persistence provider can then cache and reuse the query, resulting in better query performance.


4. Avoid functions (such as UPPER, DISTINCT, MAX) in queries unless they are really needed. Consult your DBA for assistance.

5. For tables having more than 20 columns, do not use "select *" in the query. Select only the fields you need.
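
The clean-up rule from point 1 can be sketched as follows. To keep the example runnable without a database, FakeConnection is a hypothetical stand-in for java.sql.Connection; with real JDBC objects, which implement AutoCloseable, a try-with-resources statement (Java 7 and later) gives the same guarantee.

```java
// Hypothetical stand-in for a JDBC connection; it only tracks how many are open.
class FakeConnection {
    static int openCount = 0;

    FakeConnection() {
        openCount++;
    }

    String query(String sql) {
        if (sql.isEmpty()) {
            throw new IllegalArgumentException("empty query");
        }
        return "result of " + sql;
    }

    void close() {
        openCount--;
    }
}

class CleanupDemo {
    // Runs a query and always releases the connection, even if the query fails.
    static String runQuery(String sql) {
        FakeConnection conn = null;
        try {
            conn = new FakeConnection();
            return conn.query(sql);
        } finally {
            if (conn != null) {
                conn.close();   // executes on normal return and on exception
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(runQuery("select name from organization"));
        try {
            runQuery("");       // the query fails, but the connection is still closed
        } catch (IllegalArgumentException e) {
            System.out.println("Query failed: " + e.getMessage());
        }
        System.out.println("Open connections: " + FakeConnection.openCount);
    }
}
```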



 
J2EE application server performance tuning:


1. Set the number of Web container threads used to process incoming HTTP requests. Tune the minimum size to handle the container's average load and the maximum size to handle peak load. The maximum size should be less than or equal to the number of threads in your Web server.

2. Application servers maintain a pool of JDBC resources so that a new connection does not need to be created for each transaction. The application server can also cache your prepared statements to improve performance. Tune the minimum and maximum sizes of these pools.

3. Tune the initial heap size of the JVM so that the garbage collector runs at a suitable interval and does not cause unnecessary overhead.

4. Set the session manager settings appropriately, based on the following guidelines.
  • Set an appropriate value for the in-memory session count.
  • Store as few objects in the session as possible.
  • Don't enable session persistence unless your application requires it.
  • Invalidate sessions when you are finished with them, and set an appropriate session timeout.
5. Turn application server tracing off unless it is required for debugging.

6. Some application servers support lazy loading and dirty marker strategies with EJB to improve performance.


[ to be continued .................]