In this article, I will discuss some of the issues in doing financial calculations in JAVA.
The issues are not related to JAVA only, but to any calculation done in a “computer language” using binary floating point arithmetic, including calculations done in spreadsheets like Excel (see References).
In JAVA, the issues can be solved by using the java.math.BigDecimal type, which has been vastly improved in JAVA 5. But compared to similar implementations using the native double type, the code is clumsy and performance suffers.
Floating point calculations
The most natural data type to use when doing calculations with non-integer numbers is the double data type.
This type (and the corresponding float type) is optimized for multiplication and division, making it an ideal choice for scientific, statistical and engineering applications, where multiplication, division, involution and factorials are predominant.
The data type is implemented in accordance with the IEEE 754 standard (double precision), enabling hardware implementations to help speed up the processing.
The IEEE 754 double precision data type is implemented in 64-bits internally, as follows:
- 1 bit sign
- 11 bit binary exponent
- 53 bit fraction (52 bits stored)
Numbers are normalized so they can be represented as (sign) · (1 + x) · 2^exp, so the leading “1” can be thrown away, thereby storing 53 bits in 52.
The number 1 is stored as 1.0 · 2^0 [0; 0; 0], the number 10 as 1.25 · 2^3 [0; 3; 0.25] and -5 as (-1) · 1.25 · 2^2 [-1; 2; 0.25].
The largest positive number representable is (2 - 2^-52) · 2^1023 or approximately 1.7976931348623157 · 10^308, and the smallest positive number greater than zero is 2^-1074 or approximately 4.9 · 10^-324.
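The three fields can be inspected directly from JAVA via Double.doubleToLongBits. A small sketch (the class name is mine) that decomposes the number 10 into the fields described above:

```java
// Decompose a double into the IEEE 754 sign, exponent and fraction fields.
public class Ieee754Fields {
    public static void main(String[] args) {
        long bits = Double.doubleToLongBits(10.0);                      // 10 = 1.25 * 2^3
        long sign = bits >>> 63;                                        // 1 bit sign
        int exp = (int) ((bits >>> 52) & 0x7FF) - 1023;                 // 11 bit exponent, unbiased
        double frac = (bits & 0xFFFFFFFFFFFFFL) / (double) (1L << 52); // the "x" in (1 + x)
        System.out.println(sign + " " + exp + " " + frac);              // prints: 0 3 0.25
    }
}
```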
Almost anything is possible with this data type. Or so it seems…
But, there are at least 2 major challenges with the double data type that make it unsuitable for financial calculations:
- The internal representation is binary. Many numbers that are “natural” in the decimal number system cannot be represented exactly in the binary number system.
- The data type is optimized for multiplication and division. Multiplication is done by checking signs, multiplying fractions and adding/adjusting exponents, all integer operations. Addition and subtraction are done by denormalizing the numerically smallest number (basically cutting off bits to the right) so that both operands have the same exponent, before the addition/subtraction is done.
This in turn means that there is significant loss of information when adding/subtracting large and small numbers.
A “nice” example of a natural number in the decimal number system that is not representable in the binary number system is 0.1₁₀ (one tenth), which is (approximately) 1.100110011…001101₂ · 2^-4, or about 0.1000000000000000055511151231257827021181583404541015625₁₀.
If one adds 0.1 to itself 10 times using double, printing the sum after each addition, the following output is produced:
0.1 0.2 0.30000000000000004 0.4 0.5 0.6 0.7 0.7999999999999999 0.8999999999999999 0.9999999999999999
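The output above can be reproduced with a loop like the following (a minimal sketch; the class name is mine):

```java
// Adding 0.1 to itself ten times using double accumulates rounding errors.
public class DoubleLoop {
    public static void main(String[] args) {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;                 // 0.1 has no exact binary representation
            System.out.println(sum);
        }
        System.out.println(sum == 1.0); // prints: false
    }
}
```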
Please note that the issue is not isolated to JAVA, but is a feature of the IEEE implementation of binary floating point numbers.
An example of lost information/precision is the following integer addition: 9007199254740992 + 3 (2^53 + 3). Anybody can do the math and conclude the answer must be 9007199254740995. That is the result when doing the addition using e.g. a 64-bit integer (long).
But, if this calculation is done using double, the result is 9007199254740996! The reason is that the number 3 is right-shifted to denormalize it to the same exponent as the other addend, and the bit that is “shifted out” is used to do rounding. So 3 · 2^n becomes 1.5 · 2^(n+1), which is rounded to 2 · 2^(n+1) or just 4 · 2^n, which added to 900…2 becomes 900…6.
Doing the same maths with similar numbers, just divided by 1024 (2^10), gives the following:
8796093022208 + 0.0029296875 = 8796093022208.0040000000…
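Both additions can be verified with a few lines of JAVA (the class name is mine):

```java
// The same addition in 64-bit integer and in double arithmetic.
public class LostBits {
    public static void main(String[] args) {
        long exact = 9007199254740992L + 3L;       // 2^53 + 3 as longs
        double rounded = 9007199254740992.0 + 3.0; // same sum as doubles
        System.out.println(exact);                 // prints: 9007199254740995
        System.out.println((long) rounded);        // prints: 9007199254740996
    }
}
```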
When doing bookkeeping/accounting, it is common to add large numbers to relatively small numbers; it is expected that we keep track down to hundredths of the local currency, and that multiple additions and a few multiplications add up exactly, even in the last digit. For that, the double data type might not be the right choice.
Integer, fixed decimal point
A popular and often-used solution to the issues with the double data type is to simply use an integer type instead, with a fixed decimal position. Most currencies have two decimals to the right of the decimal point. Implementing those currencies using an integer is basically the same as counting in cents, øre, pence, fen…
In JAVA it is natural to use a long for this kind of computation. The long data type can hold integers up to 9223372036854775807. If it is decided to keep 2 extra decimals for intermediate results, the largest representable amount is 922337203685477.5807. This should be sufficient for most applications…
Addition and subtraction are easy this way, performance is good and the results are exact to the last decimal. Excellent for business calculations.
Multiplication is an issue, as multiplying 2 numbers with 4 decimals each results in a number with up to 8 decimals, and rounding is necessary to bring the result back to 4 decimals. There is an equal problem at the binary level: multiplying 2 longs of 64 bits each results in a product with up to 128 bits, which is not easily manageable in JAVA…
Divisions are even worse…
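A minimal sketch of the fixed-decimal-point idea, assuming 4 implied decimals and positive amounts only (the class, constant and method names are mine). The point is the explicit rescaling with rounding that every multiplication needs:

```java
// Fixed-point amounts stored as long with 4 implied decimals.
public class FixedPoint {
    static final long SCALE = 10_000L; // 4 implied decimal places

    // Multiplying two 4-decimal values yields 8 implied decimals, so the
    // product must be divided by SCALE again, here with round-half-up.
    // Note: a * b may overflow long for large amounts; positive values only.
    static long multiply(long a, long b) {
        return (a * b + SCALE / 2) / SCALE;
    }

    public static void main(String[] args) {
        long price = 123456L; // 12.3456
        long qty = 20000L;    // 2.0000
        System.out.println(multiply(price, qty)); // prints: 246912, i.e. 24.6912
    }
}
```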
And I know of real-life systems where developers forgot to do the “translation” between amounts in øre and amounts in kr, when interfacing external systems. Those errors are usually found and fixed early, but can be avoided if the translation is unnecessary.
Using (64-bit) integer arithmetic for financial calculations is only reasonable when those calculations are mostly additions, with perhaps a final VAT calculation at the end.
And the calculations need to be encapsulated, and/or the developers must stay “sharp” and remember when to calculate in øre/cents and when to calculate in kr/euros/dollars…
And we all know… if the developers have to know/do something, they will eventually forget…
BigDecimal
With J2SE 5.0 the BigDecimal class was greatly enhanced, making it a good choice for precision calculations in general, not just simple financial computations. Refer to JSR-13.
The BigDecimal implementation now contains a few nice constants (ZERO, ONE, TEN), a valueOf factory method that creates a BigDecimal from a double without the issues the similar constructor has, and a lot of new methods where the rounding mode can be given as an argument.
BigDecimal works with an unscaled value, contained in a BigInteger, and a decimal scale (decimal position), contained in an int. There is almost no limit to how large the numbers can be; the BigInteger can represent integers limited only by the system's resources, and the scale allows numbers in the range 10^-2147483648 to 10^2147483647. Reasonably large numbers 🙂
Addition and subtraction between BigDecimals are done by “expanding” the number with the smaller scale before the operation is carried out, thereby preserving precision even in the last digit.
Multiplication expands the scale of the result to the sum of the operands' scales. For division, the preferred scale of the result is the difference between the dividend's and the divisor's scales; if the result cannot be represented exactly with that scale, a result scale and rounding mode must be specified, or an exception is thrown.
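For division this means the caller normally supplies the result scale and a rounding mode; a short illustration (the class name is mine):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// 1/3 has no exact decimal representation, so a scale and rounding mode
// must be given; a plain divide(three) would throw ArithmeticException.
public class DivideDemo {
    public static void main(String[] args) {
        BigDecimal third = BigDecimal.ONE.divide(BigDecimal.valueOf(3), 4, RoundingMode.HALF_UP);
        System.out.println(third); // prints: 0.3333
    }
}
```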
This data type is an ideal choice for financial calculations, as it is guaranteed that a sum or difference is exact to the last digit. Also, the many different rounding modes allow it to be used in business calculations where the simple grammar-school “round half up” is not sufficient.
Using BigDecimal to add 0.1 to itself 10 times yields the following result:
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
The addition 9007199254740992 + 3 (2^53 + 3) equals 9007199254740995.
The same addition with the numbers divided by 1024 and a scale of 10, yields:
8796093022208 + 0.0029296875 = 8796093022208.0029296875
The same calculation with scale=3 equals:
8796093022208 + 0.003 = 8796093022208.003.
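The BigDecimal version of the 0.1 loop, for comparison with the double loop earlier (the class name is mine):

```java
import java.math.BigDecimal;

// Adding 0.1 to itself ten times using BigDecimal is exact.
public class BigDecimalLoop {
    public static void main(String[] args) {
        BigDecimal tenth = new BigDecimal("0.1");
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < 10; i++) {
            sum = sum.add(tenth);    // exact decimal addition
            System.out.println(sum); // 0.1, 0.2, ..., 1.0 with no drift
        }
    }
}
```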
A lot better than the similar calculations with double.
A few notes regarding the use of BigDecimal:
- Don’t use the BigDecimal(double) constructor; use the valueOf(...) factory methods whenever possible.
- The equals(...) method returns false for numerically identical numbers if they have different scales. Use the compareTo(...) method if “identical” numbers can be represented with different scales.
- Testing for zero is most easily done with signum() == 0.
- Powers of ten are most efficiently handled using one of the scaleByPowerOfTen(...), movePointLeft(...) or movePointRight(...) methods.
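The notes above in one small demonstration (the class name is mine):

```java
import java.math.BigDecimal;

public class BigDecimalNotes {
    public static void main(String[] args) {
        // valueOf goes through the double's short String form...
        System.out.println(BigDecimal.valueOf(0.1)); // prints: 0.1
        // ...while the double constructor preserves the exact binary value.
        System.out.println(new BigDecimal(0.1));
        // prints: 0.1000000000000000055511151231257827021181583404541015625

        // equals(...) compares value AND scale; compareTo(...) compares value only.
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.equals(b));         // prints: false
        System.out.println(a.compareTo(b) == 0); // prints: true

        // Testing for zero with signum(), independent of scale.
        System.out.println(new BigDecimal("0.00").signum() == 0); // prints: true
    }
}
```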
The following shows 3 different ways of expressing the value “0.1” using BigDecimal:
BigDecimal t1 = BigDecimal.valueOf(0.1);
BigDecimal t2 = BigDecimal.valueOf(1, 1);
BigDecimal t3 = new BigDecimal("0.1");
The second (t2) is the most efficient.
To test the performance of calculations using BigDecimal compared to double, I made a small test-program. It adds 10 numbers, then adds 25% VAT to the sum, with equal implementations using double and BigDecimal. It then runs 1 million of the same calculations for each implementation, 3 times, and prints the timings (JAVA source for documentation).
On my old desktop running Windows 2000, the following result is given:
sumAddVatDoubles: 56300379483513.5900
sumAddVatBigDecimals: 56300379483513.6000
0.0 0.0000
double (1 million runs): 0.081656701 0.0
BigDecimal (1 million runs): 5.275915392 0.0000
double (1 million runs): 0.07946241400000001 0.0
BigDecimal (1 million runs): 5.7855323510000005 0.0000
double (1 million runs): 0.07896082800000001 0.0
BigDecimal (1 million runs): 5.279168778000001 0.0000
java -fullversion says:
java full version "1.6.0_17-b04"
On a recent developer PC running an IBM jre, I get the following results:
sumAddVatDoubles: 56300379483513.5900
sumAddVatBigDecimals: 56300379483513.6000
0.0 0.0000
double (1 million runs): 0.010842277 0.0
BigDecimal (1 million runs): 0.37696358 0.0000
double (1 million runs): 0.010783957 0.0
BigDecimal (1 million runs): 0.375765779 0.0000
double (1 million runs): 0.010842014 0.0
BigDecimal (1 million runs): 0.37761553800000003 0.0000
java -fullversion says:
java full version "JRE 1.6.0 IBM Windows 32 build pwi3260sr4ifx-20090228_01 (SR4)"
Conclusion: The performance of BigDecimal is 40-70 times slower than that of double.
References
- Wikipedia: Floating point
- Wikipedia: Computer numbering formats
- Wikipedia: IEEE 754
- Microsoft about Excel
- Sun: API reference, Double
- Sun: API reference, BigDecimal
- Sun: JSR-13
NOTE: This article has been updated 2010-02-08 17:15. There was a stupid error in my test-program giving about equal performance of double and BigDecimal…