I am working on a project that is using a spreadsheet with a very complex mathematical model.
from Apache official document, it listed 302 supported functions as of July 2021, however, if we call WorkbookEvaluator.getSupportedFunctionNames(), there are only 202. why there is 100 difference? especially these statistic functions such as NORMDIST, NORMINV, etc?
if there is a lib that includes the missed 100 functions where can I find them?
CodePudding user response:
Apache POI's documentation is a little bit unclear in this case. So I will try to shed more light.
At first: All the following is state of 2022-01-26, Apache poi 5.2.0.
Excel functions which are implemented or only known by name
Developing Formula Evaluation -> What functions are supported? tells:
As of July 2021, POI supports 302 built-in functions, see Appendix A for the full list.
But that's not really true. It better should read: "POI knows 393 built-in functions by name.", because not all those functions are implemented yet.
// list of functions that POI can evaluate
java.util.Collection<String> supportedFuncs = org.apache.poi.ss.formula.WorkbookEvaluator.getSupportedFunctionNames();
System.out.println("Following functions are implemented by Apache POI:");
System.out.println("Count: " supportedFuncs.size());
System.out.println(supportedFuncs);
lists 202 functions which are implemented.
// list of functions that are known but not supported by POI
System.out.println("Following functions are known by name by Apache POI but not implemented yet:");
java.util.Collection<String> unsupportedFuncs = org.apache.poi.ss.formula.WorkbookEvaluator.getNotSupportedFunctionNames();
System.out.println("Count: " unsupportedFuncs.size());
System.out.println(unsupportedFuncs);
lists 191 functions that are known by name but not implemented yet.
These 191 functions lacks the Java implementation code. They are known by name but WorkbookEvaluator will not be able to evaluate them. So org.apache.poi.ss.formula.eval.NotImplementedFunctionException will be thrown.
How to implement known but not implemented functions?
The Java code can be implemented as described in Developing Formula Evaluation. But there is a further hurdle.
Developing Formula Evaluation -> Two base interfaces to start your implementation:
All Excel formula function classes implement either the
org.apache.poi.hssf.record.formula.functions.Functionor theorg.apache.poi.hssf.record.formula.functions.FreeRefFunctioninterface.Functionis a common interface for the functions defined in the Binary Excel File Format (BIFF8): these are "classic" Excel functions like SUM, COUNT, LOOKUP, etc.FreeRefFunctionis a common interface for the functions from the Excel Analysis ToolPak, for User Defined Functions that you create, and for Excel built-in functions that have been defined since BIFF8 was defined.
So for the implementation one needs to know which of these 191 functions needs implementing org.apache.poi.hssf.record.formula.functions.Function and which of them needs implementing org.apache.poi.hssf.record.formula.functions.FreeRefFunction .
// list of ATP functions that are known but not supported by POI
System.out.println("Following functions are ATP functions known by name by Apache POI but not implemented yet:");
java.util.Collection<String> unsupportedFuncsATP = org.apache.poi.ss.formula.atp.AnalysisToolPak.getNotSupportedFunctionNames();
System.out.println("Count: " unsupportedFuncsATP.size());
System.out.println(unsupportedFuncsATP);
lists 83 functions that are AnalysisToolPak functions known by name but not implemented yet. For those the implementation needs implementing org.apache.poi.hssf.record.formula.functions.FreeRefFunction.
// list of BIFF8 functions that are known but not supported by POI
System.out.println("Following functions are BIFF8 functions known by name by Apache POI but not implemented yet:");
java.util.List<String> unsupportedFuncsBIFF8 = unsupportedFuncs.stream().filter(n -> !unsupportedFuncsATP.contains(n)).collect(java.util.stream.Collectors.toList());
System.out.println("Count: " unsupportedFuncsBIFF8.size());
System.out.println(unsupportedFuncsBIFF8);
lists 108 functions that are "classic" Excel functions known by name but not implemented yet. For those the implementation needs implementing org.apache.poi.hssf.record.formula.functions.Function.
What about Excel functions outside those lists?
Excel functions which neither are implemented nor known by name must be implemented as user defined functions. This is described in User Defined Functions. This is basically to implement user defined functions, which are in VBA or in Add-Ins. But it also can be used to implement new Excel functions which not even known by name in Apache POI. I have shown an example for implementing NUMBERVALUE function.
Why the need to implement Java function code at all?
Apache POI's approach is handling Microsoft Office files without the need of an installed Microsoft Office application. So Apache POI cannot access the implemented Excel functions because there is no Excel application. So all needed functions need to be implemented by own code.
Main problem here is that the source code of the single Excel functions is closed source and so not public available. Only possibility is to implement the code according to the published function descriptions. That often leads to discrepancies between the results of Java-implemented function code and Excel function code. The problem increases in border-line use cases of the functions. For example if parameters are omitted or if parameters are set using not expected types. So don't be surprised if the result of the formula evaluation of Apache POI differs from the one of Microsoft Excel.
