As well as BigDecimal, decimals can have type Float or Double. Unlike BigDecimal, which has no size limit, Float and Double are fixed-size and thus more efficient in calculations. BigDecimal stores its value as base-10 digits, while Float and Double store theirs as binary digits. So although Float and Double are faster, their results are not always exact in base 10, eg, 3.1f + 0.4f computes to 3.499999910593033 instead of 3.5.
We can force a decimal to have a specific type other than BigDecimal by giving a suffix (F for Float, D for Double):
assert 1.200065d.class == Double
assert 1.234f.class == Float
assert (-1.23E23D).class == Double
assert (1.167g).class == BigDecimal
//although g suffix here is optional, it makes examples more readable
We can enquire the minimum and maximum values for Floats and Doubles:
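A minimal sketch of such an enquiry, using the standard MAX_VALUE and MIN_VALUE constants. The variable names are illustrative only. Note that MIN_VALUE is the smallest positive value, not the most negative one:

```groovy
// Largest finite values, shown via their string form
def largestFloat = Float.toString(Float.MAX_VALUE)
def largestDouble = Double.toString(Double.MAX_VALUE)
assert largestFloat == '3.4028235E38'
assert largestDouble == '1.7976931348623157E308'

// MIN_VALUE is the smallest positive (subnormal) value
def smallestFloat = Float.toString(Float.MIN_VALUE)
def smallestDouble = Double.toString(Double.MIN_VALUE)
assert smallestFloat == '1.4E-45'
assert smallestDouble == '4.9E-324'
assert Float.MIN_VALUE > 0f

// The most negative finite value is the negated maximum
assert -Float.MAX_VALUE < 0f
assert Float.MAX_VALUE < Double.MAX_VALUE
```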
We can represent infinities by using some predefined constants (prefixed by either Float or Double):
assert (1f / 0f) == Double.POSITIVE_INFINITY
assert (-1f / 0f) == Double.NEGATIVE_INFINITY
assert Double.POSITIVE_INFINITY == Float.POSITIVE_INFINITY
assert 0.0f != -(0.0f)
//positive and negative zeroes not equal, when negative is written -(0.0f)
assert 0.0f == -0.0f
//but when negative is written -0.0f, it's evaluated as positive
If a nonzero Double literal is too large or too small, it's represented by Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY or 0.0:
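The same overflow and underflow can be seen when parsing strings. This sketch uses Double.parseDouble() and Float.parseFloat() rather than literals, on the assumption that out-of-range literals are rounded the same way:

```groovy
// Magnitude too large for a double: rounds to an infinity
def tooBig = Double.parseDouble('1e400')
def tooBigNegative = Double.parseDouble('-1e400')
assert tooBig == Double.POSITIVE_INFINITY
assert tooBigNegative == Double.NEGATIVE_INFINITY

// Magnitude too small for a double: rounds to zero
def tooSmall = Double.parseDouble('1e-400')
assert tooSmall == 0.0d

// A float overflows much earlier than a double
def tooBigForFloat = Float.parseFloat('1e40')
assert tooBigForFloat == Float.POSITIVE_INFINITY
assert Double.parseDouble('1e40') < Double.POSITIVE_INFINITY
```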
There's a special variable called Double.NaN (and Float.NaN), meaning "Not a Number", which is sometimes returned from math calculations. Once introduced into a math calculation, the result will (usually) be NaN.
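A short sketch of how NaN arises and spreads. It tests with Double.isNaN() rather than ==, so it doesn't rely on how NaN compares with itself:

```groovy
def nan = Double.NaN
assert Double.isNaN(nan)

// Calculations with no defined numeric result return NaN
def zeroOverZero = 0d / 0d
def rootOfNegative = Math.sqrt(-1d)
assert Double.isNaN(zeroOverZero)
assert Double.isNaN(rootOfNegative)
assert Double.isNaN(Double.POSITIVE_INFINITY - Double.POSITIVE_INFINITY)

// Once present, NaN usually carries through later calculations
def rippled = (nan + 1d) * 0d
assert Double.isNaN(rippled)

// One of the exceptions: anything raised to the power zero is 1
def nanToZero = Math.pow(nan, 0d)
assert nanToZero == 1d
```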
Conversions
The Float and Double classes, along with BigDecimal, BigInteger, Integer, Long, Short, and Byte, can all be converted to one another.
Converting numbers to integers may involve rounding or truncation:
assert 45.76f as int == 45i //truncated
assert 45.76d as int == 45i
assert 45.76f.toInteger() == 45i //method name
assert 45.76f.toLong() == 45L
assert 200.8f as byte == -56 as byte//sign reversed after truncation
assert 45.76f.toBigInteger() == 45
Converting from integers to float or double (may involve rounding):
assert 789g as Float == 789f
assert 45i.toFloat() == 45f //method name
assert 789g.toFloat() == 789f
assert 789g.floatValue() == 789f //alternative method name
assert 45i as double == 45d
assert 6789g.toDouble() == 6789d //method name
assert 6789g.doubleValue() == 6789d //alternative method name
assert new BigInteger( '1' + '0'*40 ).floatValue() == Float.POSITIVE_INFINITY
//one with 40 zeroes after it
assert new BigInteger( '1234567890' * 3 ).floatValue() == 1.2345679e29f
//precision lost on conversion
Converting from BigDecimal to float or double (may involve rounding):
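A small sketch of these conversions, using the same floatValue(), doubleValue() and `as` forms shown for integers above. The variable names are illustrative only:

```groovy
// 0.5 has an exact binary representation, so nothing is lost
def exact = 0.5g
assert exact.doubleValue() == 0.5d
assert exact.floatValue() == 0.5f
assert (exact as double) == 0.5d

// 0.1 has no exact binary representation, so the double is only close
def inexact = 0.1g
def roundTripped = new BigDecimal(inexact.doubleValue())
assert roundTripped != inexact

// Values outside the double range become infinity or zero
def huge = new BigDecimal('1e400')
def tiny = new BigDecimal('1e-400')
assert huge.doubleValue() == Double.POSITIVE_INFINITY
assert tiny.doubleValue() == 0d
```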
We can create a Float or Double from a string representation of the number, either base-10 or hex:
[ '77', '1.23e-23', '4.56', '-1.7E1', '98.7e2', '-0.27e-30' ].each{
assert it.toFloat()
assert new Float(it)
assert it.toDouble()
assert new Double(it)
}
assert new Float( ' 1.23e-23 ' ) //leading and trailing whitespace removed
try{ new Float( null ); assert 0 }
catch(e){ assert e instanceof NullPointerException }
[ 'NaN', '-NaN', 'Infinity', '-Infinity', '+Infinity' ].each{
assert new Float(it)
}
assert new Float( ' -0Xabc.defP7' )
//we can have hexadecimal mantissa, with P indicating exponent
assert new Float( ' 0xABC.DEFp17 ' )
//part after P must be base-10, not more hex
assert new Float( '0X.defP-3f \n' )
//any whitespace OK (spaces, tabs, newlines, carriage returns, etc)
try{ new Float( ' @0X6azQ/3d' ); assert 0 }
catch(e){ assert e instanceof NumberFormatException }
//because the string doesn't contain a parsable number in the form of a Float
assert Float.valueOf( '0xABp17' )
//alternate means of constructing float from string representation
assert Float.parseFloat( '0xABp17' )
//another alternate means of constructing float from string
assert new Double( '0x12bc.89aP7d ' )
The string is first converted to a double, then if need be converted to a float.
Converting from double to BigDecimal is only exact when the double has an exact binary representation, eg, 0.5, 0.25. If a float is supplied, it's converted to a double first, then given to the BigDecimal constructor. The scale of the returned BigDecimal is the smallest value such that (10**scale * val) is an integer.
assert new BigDecimal(0.25d) == 0.25
//exact conversion because 0.25 has an exact binary representation
assert new BigDecimal(0.1d) ==
0.1000000000000000055511151231257827021181583404541015625
(0.1d).toBigDecimal() == new BigDecimal(0.1d) //alternative method name
assert new BigDecimal(0.1f) == 0.100000001490116119384765625
//inexact conversion as 0.1 has a recurring decimal part in binary
assert (0.1f as BigDecimal) == 0.100000001490116119384765625
assert new BigDecimal(0.1d, new java.math.MathContext(25) ) ==
0.1000000000000000055511151 //rounds to 25 places as specified
A more exact way to convert a double to a BigDecimal:
assert BigDecimal.valueOf( 0.25d ) == 0.25
assert BigDecimal.valueOf( 0.1d ) == 0.1
//always exact, because converts double to a string first
assert new BigDecimal( Double.toString( 0.1d ) ) == 0.1
//explicitly convert double to string, then to BigDecimal
assert BigDecimal.valueOf( -23.456e-17d ) == -2.3456E-16
assert BigDecimal.valueOf( -23.456e-17f ) == -2.3455999317674643E-16
//result inexact because float converted to double first
try{ BigDecimal.valueOf( Double.POSITIVE_INFINITY ); assert 0 }
catch(e){ assert e instanceof NumberFormatException }
try{ BigDecimal.valueOf( Double.NaN ); assert 0 }
catch(e){ assert e instanceof NumberFormatException }
//however, infinities and NaN won't convert that way
We can convert a float or double to a unique string representation in base-10. There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type float. (The returned string must be for the float value nearest to the exact mathematical value supplied; if two float representations are equally close to that value, then the string must be for one of them and the least significant bit of the mantissa must be 0.)
assert Float.toString( 3.0e6f ) == '3000000.0' //no leading zeros
assert Float.toString( 3.0e0f ) == '3.0' //at least one digit after the point
assert Float.toString( 3.0e-3f ) == '0.0030'
assert Float.toString( 3.0e7f ) == '3.0E7'
//exponent used if it would be > 6 or < -3
assert Float.toString( 3.0e-4f ) == '3.0E-4' //mantissa >= 1 and < 10
We can also convert a float or double to a hexadecimal string representation:
[ 0.0f: '0x0.0p0',
(-0.0f): '0x0.0p0', //no negative sign in hex string rep'n of -0.0f
1.0f: '0x1.0p0', //most returned strings begin with '0x1.' or '-0x1.'
2.0f: '0x1.0p1',
3.0f: '0x1.8p1',
5.0f: '0x1.4p2',
(-1.0f): '-0x1.0p0',
0.5f: '0x1.0p-1',
0.25f: '0x1.0p-2',
(Float.MAX_VALUE): '0x1.fffffep127',
(Float.MIN_VALUE): '0x0.000002p-126',
//low values beginning with '0x0.' are called 'subnormal'
(Float.NEGATIVE_INFINITY): '-Infinity',
(Float.NaN): 'NaN',
].each{ k, v->
assert Float.toHexString(k) == v
}
We can format integers and decimals using String.format():
//Integers ('d')
assert String.format('%d', 45) == '45'
assert String.format('%5d,%1$5o', 46L) == '   46,   56'
//octal format; each minimum 5 chars wide; use an argument twice
assert String.format('%-4d,%<-5x', 47g) == '47  ,2f   '
//hex format without leading '0x'; left-justified with '-';
//shortcut ('<') for using argument again
assert String.format('%2d,%<1X', 123) == '123,7B'
//hex in uppercase with capital 'X'
assert String.format('%04d', 34) == '0034' //zero-pad
assert String.format('%,5d', 12345) == '12,345' //use grouping-separators
assert String.format('%+3d,%2$ 3d', 123L, 456g) == '+123, 456'
//always use plus sign; always use a leading space
assert String.format('%(3d', -789 as short) == '(789)' //parens for negative
assert String.format('%(3o,%2$(3x,%3$(3X', 123g, 456g, -789g) == '173,1c8,(315)'
//neg octal/hex only for BigInteger
//Floating-Point ('f', 'a', 'e', 'g')
assert String.format('e = %f', Math.E) == 'e = 2.718282'
//default 'f' format is 7.6
assert String.format('e=%+6.4f', Math.E) == 'e=+2.7183'
//precision is digits after decimal point
assert String.format('$ %(,6.2f', -6217.58) == '$ (6,217.58)'
//'(' flag gives parens, ',' uses separators
assert String.format('%a, %A', 2.7182818f, Math.PI) ==
'0x1.5bf0a8p1, 0X1.921FB54442D18P1' //'a' for hex
assert String.format('%+010.4a', 23.25d) == '+0x001.7400p4'
//'+' flag always includes sign; '0' flag zero-fills
assert String.format('%e, %10.4e', Math.E, 12345.6789) ==
'2.718282e+00, 1.2346e+04' //'e' for scientific format
assert String.format('%(10.5E', -0.0000271) == '(2.71000E-05)'
assert String.format('%g, %10.4G', Math.E, 12345.6789) == '2.71828,  1.235E+04'
//'f' or 'e', depending on input
Floating-Point Arithmetic
We can perform the same basic operations that integers and BigDecimal can:
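A brief sketch of these operations, using values that are exact in binary so the comparisons hold. It also shows that Groovy arithmetic involving a Float or Double produces a Double:

```groovy
def sum = 1.5d + 2.25d
def difference = 1.5d - 2.25d
def product = 1.5d * 4d
def quotient = 7d / 2d
assert sum == 3.75d
assert difference == -0.75d
assert product == 6d
assert quotient == 3.5d

// Float arithmetic is carried out in double precision
def floatSum = 1.5f + 2.5f
assert floatSum instanceof Double
assert floatSum == 4d

// Mixing with an integer or a BigDecimal also gives a Double
assert (1.5d + 2i) instanceof Double
assert (1.5d + 2.5g) instanceof Double
```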
We can convert a float to the equivalent int bits, or a double to the equivalent long bits, using Float.floatToIntBits() and Double.doubleToLongBits(). For a float, bit 31 (mask 0x80000000) is the sign, bits 30-23 (mask 0x7f800000) are the exponent, and bits 22-0 (mask 0x007fffff) are the mantissa. For a double, bit 63 is the sign, bits 62-52 are the exponent, and bits 51-0 are the mantissa.
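As an illustration of that layout, this sketch splits the bits of -2.0f into its three fields with Float.floatToIntBits(). It uses shifts instead of the masks quoted above so that every literal fits in an int. The variable names are illustrative only:

```groovy
def bits = Float.floatToIntBits(-2.0f)   // bit pattern 0xc0000000

def sign = (bits >>> 31) & 1             // bit 31
def exponent = (bits >> 23) & 0xff       // bits 30-23, biased by 127
def mantissa = bits & 0x007fffff         // bits 22-0

assert sign == 1                         // negative
assert exponent == 128                   // 128 - 127 = 1, ie, 2**1
assert mantissa == 0                     // implicit leading 1, no fraction

// Some well-known patterns
assert Float.floatToIntBits(1.0f) == 0x3f800000
assert Double.doubleToLongBits(1.0d) == 0x3ff0000000000000L
```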
The methods floatToRawIntBits() and doubleToRawLongBits() act similarly, except that they preserve Not-a-Number (NaN) values. So if the argument is NaN, the result is the integer or long representing the actual NaN bit pattern supplied, not the canonical Float.NaN value to which all the bit patterns encoding a NaN can be collapsed (ie, 0x7f800001 through 0x7fffffff and 0xff800001 through 0xffffffff).
The intBitsToFloat() and longBitsToDouble() methods act oppositely. In all cases, giving the integer resulting from calling Float.floatToIntBits() or Float.floatToRawIntBits() to the intBitsToFloat(int) method will produce the original floating-point value, except for a few NaN values. Similarly with doubles. These methods are the only operations that can distinguish between two NaN values of the same type with different bit patterns.
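A sketch of the round trip, plus the collapsing of a non-canonical NaN by floatToIntBits(). The bit pattern 0x7fc00001 is just an arbitrary NaN encoding picked for illustration:

```groovy
// An ordinary value survives the round trip unchanged
def original = 3.14f
def restored = Float.intBitsToFloat(Float.floatToIntBits(original))
assert restored == original

def restoredDouble = Double.longBitsToDouble(Double.doubleToLongBits(2.5d))
assert restoredDouble == 2.5d

// Special bit patterns map to the special values
assert Float.intBitsToFloat(0x7f800000) == Float.POSITIVE_INFINITY

// Any NaN encoding is reported as the canonical NaN by floatToIntBits()
def oddNaN = Float.intBitsToFloat(0x7fc00001)
def canonicalBits = Float.floatToIntBits(oddNaN)
assert Float.isNaN(oddNaN)
assert canonicalBits == 0x7fc00000
assert canonicalBits == Float.floatToIntBits(Float.NaN)
```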
Accuracy of the Math methods is measured in ulps (units in the last place) for the worst-case scenario. An ulp of a floating-point value is the distance between that value and the next larger one in magnitude. If a method always has an error less than 0.5 ulps, the method always returns the floating-point number nearest the exact result, and so is always correctly rounded. However, doing this and maintaining floating-point calculation speed together is impractical. Instead, for the Math class, a larger error bound of 1 or 2 ulps is allowed for certain methods. But most methods with more than 0.5 ulp errors are still required to be semi-monotonic, ie, whenever the mathematical function is non-decreasing, so is the floating-point approximation, and vice versa. Not all approximations that have 1 ulp accuracy meet the monotonicity requirements. sin, cos, tan, asin, acos, atan, exp, log, and log10 give results within 1 ulp of the exact result that are semi-monotonic.
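To make the unit concrete: an ulp ("unit in the last place") is the gap between a floating-point value and the next larger one in magnitude. A small sketch with Math.ulp() and Math.nextUp():

```groovy
// A double has 52 mantissa bits, so the gap just above 1.0 is 2**-52
def ulpAtOne = Math.ulp(1.0d)
assert ulpAtOne == Math.pow(2d, -52d)
assert 1.0d + ulpAtOne == Math.nextUp(1.0d)

// The gap grows with the magnitude of the value
def ulpAtTenBillion = Math.ulp(1.0e10d)
assert ulpAtTenBillion > ulpAtOne

// A float has 23 mantissa bits, so its gaps are much wider
def floatUlpAtOne = Math.ulp(1.0f) as double
assert floatUlpAtOne == Math.pow(2d, -23d)
```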
Further Calculations
We can convert rectangular coordinates (x, y) to polar coordinates (r, theta): Math.atan2( y, x ) returns the angle theta. The result is within 2 ulps of the exact result, and is semi-monotonic.
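A short sketch with Math.atan2(), which takes the y coordinate first and returns the angle in radians. The general case is checked with a tolerance, since the result is only guaranteed to within 2 ulps:

```groovy
// The point (1, 1) lies at 45 degrees, ie, pi/4 radians
def theta = Math.atan2(1d, 1d)
assert Math.abs(theta - Math.PI / 4) < 1e-15

// Points on the axes are special cases with exactly specified results
def straightUp = Math.atan2(1d, 0d)     // the point (0, 1)
def straightLeft = Math.atan2(0d, -1d)  // the point (-1, 0)
assert straightUp == Math.PI / 2
assert straightLeft == Math.PI
```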
We can calculate e**x - 1 in one call with Math.expm1(). For values of x near 0, Math.expm1( x ) + 1d is much closer than Math.exp( x ) to the true result of e**x. The result will be semi-monotonic, and within 1 ulp of the exact result. Once the exact result of e**x - 1 is within 1/2 ulp of the limit value -1, -1d will be returned.
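A sketch of why the single call matters for small x. For x = 1.0e-10 the true value of e**x - 1 is about x + x*x/2, so it differs from x by roughly 5e-21. Subtracting 1 from Math.exp( x ) loses most of that detail. The tolerances are chosen for this particular x:

```groovy
def x = 1.0e-10d

def viaExpm1 = Math.expm1(x)
def viaExp = Math.exp(x) - 1d

// expm1 keeps the tiny difference from x
assert Math.abs(viaExpm1 - x) < 1e-19
// exp(x) is rounded near 1.0, so the subtraction is far less accurate
assert Math.abs(viaExp - x) > 1e-18

// Limiting cases
assert Math.expm1(0d) == 0d
assert Math.expm1(Double.NEGATIVE_INFINITY) == -1d
```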
We can also calculate ln(1 + x) in one call. For small values of x, Math.log1p( x ) is much closer than Math.log(1d + x) to the true result of ln(1 + x). The result will be semi-monotonic, and within 1 ulp of the exact result.
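A matching sketch for Math.log1p(). For x = 1.0e-10 the true value of ln(1 + x) is about x - x*x/2, again within roughly 5e-21 of x. Forming 1d + x first rounds away most of x. The tolerances are chosen for this particular x:

```groovy
def x = 1.0e-10d

def viaLog1p = Math.log1p(x)
def viaLog = Math.log(1d + x)

// log1p keeps the tiny difference from x
assert Math.abs(viaLog1p - x) < 1e-19
// 1d + x is rounded before the logarithm is taken, so detail is lost
assert Math.abs(viaLog - x) > 1e-18

// Limiting case
assert Math.log1p(0d) == 0d
```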
Scale binary scalb(x,y) calculates (x * 2**y) using a single operation, giving a more accurate result. If the exponent of the result would be larger than Float/Double.MAX_EXPONENT, an infinity is returned. If the result is subnormal, precision may be lost. When the result is non-NaN, the result has the same sign as x.
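A few sample calls to Math.scalb(), whose second argument is an integer power of two:

```groovy
def scaledUp = Math.scalb(3d, 4)       // 3 * 2**4
def scaledDown = Math.scalb(1d, -1)    // 1 * 2**-1
assert scaledUp == 48d
assert scaledDown == 0.5d

// The sign of the first argument is kept
assert Math.scalb(-3d, 4) == -48d

// Past the largest exponent the result is an infinity
def overflowed = Math.scalb(1d, Double.MAX_EXPONENT + 1)
assert overflowed == Double.POSITIVE_INFINITY
```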
We can round doubles to the nearest long (or floats to the nearest integer). The calculation is Math.floor(a + 0.5d) as Long, or Math.floor(a + 0.5f) as Integer
[ 7.45: 7,
7.5: 8,
(-3.95): -4,
(-3.5): -3,
(Double.NaN): 0,
(Double.NEGATIVE_INFINITY): Long.MIN_VALUE,
(Long.MIN_VALUE as Double): Long.MIN_VALUE,
(Double.POSITIVE_INFINITY): Long.MAX_VALUE,
(Long.MAX_VALUE as Double): Long.MAX_VALUE,
].each{ k, v -> assert Math.round( k ) == v }
Unlike the numerical comparison operators, max() and min() consider negative zero to be strictly smaller than positive zero. If one argument is positive zero and the other negative zero, max() returns positive zero and min() returns negative zero.
assert Math.max( 7i, 9i ) == 9i //returns the same class as its arguments
assert Math.min( 23L, 19L ) == 19L
assert Math.min( 1.7f, 0.3f ) == 0.3f
assert Math.min( -6.7d, 1.3d ) == -6.7d
assert Math.min( 7i, 9L ) == 7L //converts result to most precise type of argument
assert Math.min( 1L, 3.3f ) == 1f
assert Math.min( -6.7f, 1.3d ) == -6.699999809265137d
The pow() method returns the value of the first argument raised to the power of the second argument. If both arguments are integers, then the result is exactly equal to the mathematical result of raising the first argument to the power of the second argument if that result can in fact be represented exactly as a double value. Otherwise, special rules exist for processing zeros and infinities:
def Infinity= Double.POSITIVE_INFINITY, NaN= Double.NaN
[
[ 3d, 0d ]: 1d,
[ 3d, -(0d) ]: 1d,
[ 3d, 1d ]: 3d,
[ 3d, Infinity ]: Infinity,
[ -3d, Infinity ]: Infinity,
[ 0.3d, -Infinity ]: Infinity,
[ -0.3d, -Infinity ]: Infinity,
[ 3d, -Infinity ]: 0d,
[ -3d, -Infinity ]: 0d,
[ 0.3d, Infinity ]: 0d,
[ -0.3d, Infinity ]: 0d,
[ 1d, Infinity ]: Double.NaN,
[ 0d, 1d ]: 0d,
[ Infinity, -1d ]: 0d,
[ 0d, -1d ]: Infinity,
[ Infinity, 1d ]: Infinity,
[ -(0d), 2d ]: 0d, //exponent >0 but not finite odd integer
[ -Infinity, -2d ]: 0d, //exponent <0 but not finite odd integer
[ -(0d), 3d ]: -(0d), //exponent is positive finite odd integer
[ -Infinity, -3d ]: -(0d), //exponent is negative finite odd integer
[ -(0d), -2d ]: Infinity, //exponent <0 but not finite odd integer
[ -Infinity, 2d ]: Infinity, //exponent >0 but not finite odd integer
[ -(0d), -3d ]: -Infinity, //exponent is negative finite odd integer
[ -Infinity, 3d ]: -Infinity, //exponent is positive finite odd integer
[ -3d, 4i ]: {-> def a= Math.abs(-3d); a*a*a*a }(),
//exponent is finite even integer
[ -3d, 5i ]: {-> def a= Math.abs(-3d); -a*a*a*a*a }(),
//exponent is finite odd integer
[ -3d, 2.5 ]: NaN, //exponent is finite and not an integer
[ NaN, 0d ]: 1d //exception to the NaN ripple rule
].each{k, v->
assert Math.pow( k[0], k[1] ) == v
}
More methods:
assert Math.random() >= 0d //this method uses new Random() when first called
assert Math.random() < 1d
assert Math.signum( 17.75d ) == 1d
assert Math.signum( 17.75f ) == 1f
assert Math.signum( -19.5d ) == -1d
assert Math.signum( 0d ) == 0d
assert Math.signum( -(0d) ) == -(0d)
We can use copySign() to return a first argument with the sign of the second argument.
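A few sample calls. Only the sign of the second argument matters, not its magnitude:

```groovy
def madeNegative = Math.copySign(3.5d, -1d)
def madePositive = Math.copySign(-3.5d, 2d)
assert madeNegative == -3.5d
assert madePositive == 3.5d

// Works the same way for floats
def floatResult = Math.copySign(3.5f, -0.25f)
assert floatResult == -3.5f

// A sign that already matches leaves the value unchanged
assert Math.copySign(-7d, -100d) == -7d
```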
We can compute the hypotenuse, sqrt(x**2 + y**2), without risk of intermediate overflow (or underflow) using Math.hypot(). The computed result is within 1 ulp of the exact result. If one parameter is held constant, the results will be semi-monotonic in the other parameter.
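A sketch comparing Math.hypot() with the naive formula. With sides of 1e200 the squares overflow a double, although the hypotenuse itself is representable:

```groovy
def big = 1e200d

// Squaring overflows to infinity before the square root is taken
def naive = Math.sqrt(big * big + big * big)
assert naive == Double.POSITIVE_INFINITY

// hypot() avoids the oversized intermediate value
def safe = Math.hypot(big, big)
assert !Double.isInfinite(safe)
assert Math.abs(safe / big - Math.sqrt(2d)) < 1e-12

// An ordinary 3-4-5 triangle
def five = Math.hypot(3d, 4d)
assert Math.abs(five - 5d) < 1e-12
```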
The result is NaN if the argument is NaN for ulp, sin, cos, tan, asin, acos, atan, exp, log, log10, sqrt, cbrt, IEEEremainder, ceil, floor, rint, atan2, abs, max, min, signum, sinh, cosh, tanh, expm1, log1p, nextAfter, and nextUp.
But not so with pow, round, hypot, copySign, getExponent, and scalb.
There's another math library called StrictMath that's a mirror of Math, with exactly the same methods. However, some methods (eg, sin, cos, tan, asin, acos, atan, exp, log, log10, cbrt, atan2, pow, sinh, cosh, tanh, hypot, expm1, and log1p) follow stricter IEEE rules about what values must be returned. For example, whereas the Math.copySign method usually treats some NaN arguments as positive and others as negative to allow greater performance, the StrictMath.copySign method requires all NaN sign arguments to be treated as positive values.
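A small sketch calling both libraries side by side. The two sin() results may or may not be bit-identical on a given platform, so the check only assumes each is within 1 ulp of the exact value, and hence within 2 ulps of the other:

```groovy
def fast = Math.sin(1d)
def strict = StrictMath.sin(1d)
assert Math.abs(fast - strict) <= 2 * Math.ulp(strict)

// sqrt() is correctly rounded in both libraries, so the results agree exactly
def fastRoot = Math.sqrt(2d)
def strictRoot = StrictMath.sqrt(2d)
assert fastRoot == strictRoot

// For ordinary arguments the simpler methods behave identically
assert StrictMath.copySign(3d, -1d) == Math.copySign(3d, -1d)
```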