In Apache Pig, an Eval Function is a class that extends EvalFunc rather than a plain method, so Java's method-level polymorphism cannot be applied to it directly. The design of EvalFunc still leaves two back doors open:
1. The input of EvalFunc is a tuple, which makes Input Polymorphism possible.
2. EvalFunc is generic, which makes Output Polymorphism possible.
Input Polymorphism
Input Polymorphism refers to variance in the input. The input of an EvalFunc is a tuple, and each element of a tuple is an Object, so you can put any object into the tuple and pass it to the EvalFunc.
For example:
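import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class Add extends EvalFunc<Double> {
    @Override
    public Double exec(Tuple input) throws IOException {
        Object a = input.get(0);
        Object b = input.get(1);
        Double da, db;
        // Strings are parsed; any other field is assumed to be numeric.
        if (a instanceof String) {
            da = Double.parseDouble((String) a);
        } else {
            da = ((Number) a).doubleValue();
        }
        if (b instanceof String) {
            db = Double.parseDouble((String) b);
        } else {
            db = ((Number) b).doubleValue();
        }
        return da + db;
    }
}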
In the example above, the Add function parses any string argument into a double, so adding two strings, or a string and a double, works just as well as adding two doubles.
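The coercion step can also be tried outside of Pig. The following is only a minimal sketch: a plain List<Object> stands in for Pig's Tuple, and the names AddSketch, toDouble and add are made up for illustration, not part of any Pig API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Add UDF: a List<Object> plays the role of the Tuple.
final class AddSketch {

    // Coerce one field to a double: parse strings, unwrap any numeric type.
    static double toDouble(Object field) {
        if (field instanceof String) {
            return Double.parseDouble((String) field);
        }
        if (field instanceof Number) {
            return ((Number) field).doubleValue();
        }
        // Anything else (including null) is rejected explicitly.
        throw new IllegalArgumentException("Unsupported field: " + field);
    }

    // Mirrors Add.exec: read the first two fields and add them.
    static double add(List<Object> input) {
        return toDouble(input.get(0)) + toDouble(input.get(1));
    }

    public static void main(String[] args) {
        System.out.println(add(Arrays.<Object>asList("1.5", 2)));   // string + int
        System.out.println(add(Arrays.<Object>asList(1.5, "2")));   // double + string
        System.out.println(add(Arrays.<Object>asList("1", "2")));   // string + string
    }
}
```

Putting the type check in one helper keeps the two fields from needing duplicated if/else branches, and makes the unsupported case fail loudly instead of with a ClassCastException.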
Output Polymorphism
Output Polymorphism refers to variance in the output. Usually you have to declare the output type of an Eval Function; in the Add example above, the return type is Double. If you want the return type to vary, you can use Object as the return type instead.
For example:
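import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class AorB extends EvalFunc<Object> {
    @Override
    public Object exec(Tuple input) throws IOException {
        Object a = input.get(0);
        Object b = input.get(1);
        // Return the first field unless it is null; the result keeps its own runtime type.
        if (a != null) {
            return a;
        } else {
            return b;
        }
    }
}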
In the example above, AorB returns a if a is not null, and b otherwise.
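What an Object return type means for the caller can be seen in a small sketch that runs without Pig. As an assumption for illustration, a List<Object> stands in for the Tuple, and the names AorBSketch, aOrB and describe are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the AorB UDF: a List<Object> plays the role of the Tuple.
final class AorBSketch {

    // Mirrors AorB.exec: return the first field unless it is null.
    static Object aOrB(List<Object> input) {
        Object a = input.get(0);
        Object b = input.get(1);
        return a != null ? a : b;
    }

    // The static type is Object, so the caller inspects the runtime type.
    static String describe(Object value) {
        if (value == null) {
            return "null";
        }
        return value.getClass().getSimpleName() + ":" + value;
    }

    public static void main(String[] args) {
        System.out.println(describe(aOrB(Arrays.<Object>asList(7, "fallback"))));
        System.out.println(describe(aOrB(Arrays.<Object>asList(null, "fallback"))));
    }
}
```

The same call yields an Integer in one case and a String in the other, which is exactly the variance that a fixed type parameter such as Double would rule out.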
Of course, combining input and output polymorphism makes Eval Functions even more flexible and powerful.
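One way such a combination might look is sketched below, again without Pig: the function accepts strings or numbers and returns a Long when both operands are whole numbers, and a Double otherwise. The List<Object> stand-in for the Tuple and the names FlexibleAddSketch, normalize and add are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch combining both ideas: mixed input types, varying output type.
final class FlexibleAddSketch {

    // Input side: turn a String or numeric field into either a Long or a Double.
    static Number normalize(Object field) {
        if (field instanceof String) {
            String s = ((String) field).trim();
            try {
                return Long.parseLong(s);        // "3" becomes a Long
            } catch (NumberFormatException e) {
                return Double.parseDouble(s);    // "3.5" becomes a Double
            }
        }
        if (field instanceof Integer || field instanceof Long) {
            return ((Number) field).longValue();
        }
        if (field instanceof Number) {
            return ((Number) field).doubleValue();
        }
        throw new IllegalArgumentException("Unsupported field: " + field);
    }

    // Output side: the declared type is Object, the runtime type depends on the operands.
    static Object add(List<Object> input) {
        Number a = normalize(input.get(0));
        Number b = normalize(input.get(1));
        if (a instanceof Long && b instanceof Long) {
            return a.longValue() + b.longValue();    // boxed as Long
        }
        return a.doubleValue() + b.doubleValue();    // boxed as Double
    }

    public static void main(String[] args) {
        System.out.println(add(Arrays.<Object>asList("1", 2)));     // whole numbers
        System.out.println(add(Arrays.<Object>asList("1.5", 2)));   // mixed
    }
}
```

The two return statements are kept separate on purpose: a single conditional expression mixing long and double would be promoted to double, and the Long result would be lost.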
 