Class Float16
java.lang.Object
java.lang.Number
jdk.incubator.vector.Float16
- All Implemented Interfaces:
Serializable, Comparable<Float16>
The Float16 class holds 16-bit data in IEEE 754 binary16 format.
Binary16 Format:
S EEEEE MMMMMMMMMM
Sign - 1 bit
Exponent - 5 bits
Significand - 10 bits (does not include the implicit bit
inferred from the exponent, see PRECISION)
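The layout above can be made concrete with a small decoder. The following is a minimal sketch in plain Java that does not use the Float16 API; the class and method names are hypothetical, and the exponent bias of 15 comes from the IEEE 754 binary16 definition rather than from this page.

```java
/** Hypothetical helper: splits a binary16 bit pattern into the fields listed above. */
final class Binary16Fields {
    // S: 1 bit, E: 5 bits, M: 10 bits, from most to least significant.
    static int sign(short bits)           { return (bits >>> 15) & 0x1; }
    static int biasedExponent(short bits) { return (bits >>> 10) & 0x1F; }
    static int significand(short bits)    { return bits & 0x3FF; }

    /** Numeric value of the bit pattern, widened exactly to double. */
    static double toDouble(short bits) {
        int e = biasedExponent(bits);
        int m = significand(bits);
        double magnitude;
        if (e == 0x1F) {
            // All-ones exponent: infinity when the significand is zero, NaN otherwise.
            magnitude = (m == 0) ? Double.POSITIVE_INFINITY : Double.NaN;
        } else if (e == 0) {
            // Zero or subnormal: no implicit bit, value is m * 2^-24.
            magnitude = Math.scalb((double) m, -24);
        } else {
            // Normal: restore the implicit leading bit, then remove bias (15) and fraction width (10).
            magnitude = Math.scalb((double) (m | 0x400), e - 25);
        }
        return sign(bits) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        System.out.println(toDouble((short) 0x3C00)); // 1.0
        System.out.println(toDouble((short) 0x7BFF)); // 65504.0
        System.out.println(toDouble((short) 0xC000)); // -2.0
    }
}
```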
Unless otherwise specified, the methods in this class use a rounding policy (JLS 15.4) of round to nearest.
This is a value-based class; programmers should treat instances that are equal as interchangeable and should not use instances for synchronization, or unpredictable behavior may occur. For example, in a future release, synchronization may fail.
Floating-point Equality, Equivalence, and Comparison
The class java.lang.Double has a discussion of equality,
equivalence, and comparison of floating-point values that is
equally applicable to Float16 values.
Decimal ↔ Binary Conversion Issues
The discussion of binary to decimal conversion issues in java.lang.Double is also
applicable to Float16 values.
- API Note:
- The methods in this class generally have analogous methods in either Float/Double or Math/StrictMath. Unless otherwise specified, the handling of special floating-point values such as NaN values, infinities, and signed zeros by methods in this class is wholly analogous to the handling of equivalent cases by methods in Float, Double, Math, etc.
- Since:
- 24
-
Field Summary
Fields
Modifier and Type | Field | Description
static final int | BYTES | The number of bytes used to represent a Float16 value, 2.
static final int | MAX_EXPONENT | Maximum exponent a finite Float16 variable may have, 15.
static final Float16 | MAX_VALUE | A constant holding the largest positive finite value of type Float16, (2-2^-10)·2^15, numerically equal to 65504.0.
static final int | MIN_EXPONENT | Minimum exponent a normalized Float16 variable may have, -14.
static final Float16 | MIN_NORMAL | A constant holding the smallest positive normal value of type Float16, 2^-14.
static final Float16 | MIN_VALUE | A constant holding the smallest positive nonzero value of type Float16, 2^-24.
static final Float16 | NaN | A constant holding a Not-a-Number (NaN) value of type Float16.
static final Float16 | NEGATIVE_INFINITY | A constant holding the negative infinity of type Float16.
static final Float16 | POSITIVE_INFINITY | A constant holding the positive infinity of type Float16.
static final int | PRECISION | The number of bits in the significand of a Float16 value, 11.
static final int | SIZE | The number of bits used to represent a Float16 value, 16.
-
Method Summary
Modifier and Type | Method | Description
static Float16 | abs(Float16 f16) | Returns the absolute value of the argument.
static Float16 | add(Float16 addend, Float16 augend) | Adds two Float16 values together as per the + operator semantics using the round to nearest rounding policy.
byte | byteValue() | Returns the value of this Float16 as a byte after a narrowing primitive conversion.
static int | compare(Float16 f1, Float16 f2) | Compares the two specified Float16 values.
int | compareTo(Float16 anotherFloat16) | Compares two Float16 objects numerically.
static Float16 | copySign(Float16 magnitude, Float16 sign) | Returns the first floating-point argument with the sign of the second floating-point argument.
static Float16 | divide(Float16 dividend, Float16 divisor) | Divides two Float16 values as per the / operator semantics using the round to nearest rounding policy.
double | doubleValue() | Returns the value of this Float16 as a double after a widening primitive conversion.
boolean | equals(Object obj) | Compares this object against the specified object.
static short | float16ToRawShortBits(Float16 f16) | Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout.
static short | float16ToShortBits(Float16 f16) | Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout.
float | floatValue() | Returns the value of this Float16 as a float after a widening primitive conversion.
static Float16 | fma(Float16 a, Float16 b, Float16 c) | Returns the fused multiply add of the three arguments; that is, returns the exact product of the first two arguments summed with the third argument and then rounded once to the nearest Float16.
static int | getExponent(Float16 f16) | Returns the unbiased exponent used in the representation of a Float16.
int | hashCode() | Returns a hash code for this Float16 object.
static int | hashCode(Float16 value) | Returns a hash code for a Float16 value; compatible with Float16.hashCode().
int | intValue() | Returns the value of this Float16 as an int after a narrowing primitive conversion.
static boolean | isFinite(Float16 f16) | Returns true if the argument is a finite floating-point value; returns false otherwise (for NaN and infinity arguments).
static boolean | isInfinite(Float16 f16) | Returns true if the specified number is infinitely large in magnitude, false otherwise.
static boolean | isNaN(Float16 f16) | Returns true if the specified number is a Not-a-Number (NaN) value, false otherwise.
long | longValue() | Returns the value of this Float16 as a long after a narrowing primitive conversion.
static Float16 | max(Float16 a, Float16 b) | Returns the larger of two Float16 values.
static Float16 | min(Float16 a, Float16 b) | Returns the smaller of two Float16 values.
static Float16 | multiply(Float16 multiplier, Float16 multiplicand) | Multiplies two Float16 values as per the * operator semantics using the round to nearest rounding policy.
static Float16 | negate(Float16 f16) | Returns the negation of the argument.
static Float16 | nextDown(Float16 v) | Returns the floating-point value adjacent to v in the direction of negative infinity.
static Float16 | nextUp(Float16 v) | Returns the floating-point value adjacent to v in the direction of positive infinity.
static Float16 | scalb(Float16 v, int scaleFactor) | Returns v × 2^scaleFactor rounded as if performed by a single correctly rounded floating-point multiply.
static Float16 | shortBitsToFloat16(short bits) | Returns the Float16 value corresponding to a given bit representation.
short | shortValue() | Returns the value of this Float16 as a short after a narrowing primitive conversion.
static Float16 | signum(Float16 f) | Returns the signum function of the argument; zero if the argument is zero, 1.0 if the argument is greater than zero, -1.0 if the argument is less than zero.
static Float16 | sqrt(Float16 radicand) | Returns the square root of the operand.
static Float16 | subtract(Float16 minuend, Float16 subtrahend) | Subtracts two Float16 values as per the - operator semantics using the round to nearest rounding policy.
static String | toHexString(Float16 f16) | Returns a hexadecimal string representation of the Float16 argument.
String | toString() | Returns a string representation of this Float16.
static String | toString(Float16 f16) | Returns a string representation of the Float16 argument.
static Float16 | ulp(Float16 f16) | Returns the size of an ulp of the argument.
static Float16 | valueOf(double d) | Returns a Float16 value rounded from the double argument using the round to nearest rounding policy.
static Float16 | valueOf(float f) | Returns a Float16 value rounded from the float argument using the round to nearest rounding policy.
static Float16 | valueOf(int value) | Returns the value of an int converted to Float16.
static Float16 | valueOf(long value) | Returns the value of a long converted to Float16.
static Float16 | valueOf(String s) | Returns a Float16 holding the floating-point value represented by the argument string.
static Float16 | valueOf(BigDecimal v) | Returns a Float16 value rounded from the BigDecimal argument using the round to nearest rounding policy.
-
Field Details
-
POSITIVE_INFINITY
public static final Float16 POSITIVE_INFINITY
A constant holding the positive infinity of type Float16.
-
NEGATIVE_INFINITY
public static final Float16 NEGATIVE_INFINITY
A constant holding the negative infinity of type Float16.
-
NaN
public static final Float16 NaN
A constant holding a Not-a-Number (NaN) value of type Float16.
-
MAX_VALUE
public static final Float16 MAX_VALUE
A constant holding the largest positive finite value of type Float16, (2-2^-10)·2^15, numerically equal to 65504.0.
-
MIN_NORMAL
public static final Float16 MIN_NORMAL
A constant holding the smallest positive normal value of type Float16, 2^-14.
-
MIN_VALUE
public static final Float16 MIN_VALUE
A constant holding the smallest positive nonzero value of type Float16, 2^-24.
-
SIZE
public static final int SIZE
The number of bits used to represent a Float16 value, 16.
-
PRECISION
public static final int PRECISION
The number of bits in the significand of a Float16 value, 11. This corresponds to parameter N in section 4.2.3 of The Java Language Specification.
-
MAX_EXPONENT
public static final int MAX_EXPONENT
Maximum exponent a finite Float16 variable may have, 15. It is equal to the value returned by Float16.getExponent(Float16.MAX_VALUE).
-
MIN_EXPONENT
public static final int MIN_EXPONENT
Minimum exponent a normalized Float16 variable may have, -14. It is equal to the value returned by Float16.getExponent(Float16.MIN_NORMAL).
-
BYTES
public static final int BYTES
The number of bytes used to represent a Float16 value, 2.
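The numeric constants follow directly from PRECISION, MAX_EXPONENT, and MIN_EXPONENT. The sketch below recomputes them in plain double arithmetic, where every value involved is exactly representable. The class name and its members are hypothetical and only mirror the documented constants; they are not the Float16 fields themselves.

```java
/** Hypothetical recomputation of the binary16 limits from the three integer parameters. */
final class Binary16Constants {
    static final int PRECISION = 11;     // significand bits, including the implicit bit
    static final int MAX_EXPONENT = 15;
    static final int MIN_EXPONENT = -14;

    /** (2 - 2^-10) * 2^15: all significand bits set at the largest finite exponent. */
    static double maxValue() {
        return (2.0 - Math.scalb(1.0, -(PRECISION - 1))) * Math.scalb(1.0, MAX_EXPONENT);
    }

    /** 2^-14: smallest value that still carries the implicit leading bit. */
    static double minNormal() {
        return Math.scalb(1.0, MIN_EXPONENT);
    }

    /** 2^-24: one unit in the last of the 10 stored bits at the minimum exponent. */
    static double minValue() {
        return Math.scalb(1.0, MIN_EXPONENT - (PRECISION - 1));
    }

    public static void main(String[] args) {
        System.out.println(maxValue()); // 65504.0
        System.out.println(minNormal() == 0.00006103515625);
        System.out.println(minValue() == minNormal() / 1024.0);
    }
}
```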
-
-
Method Details
-
toString
public static String toString(Float16 f16)
Returns a string representation of the Float16 argument. The behavior of this method is analogous to Float.toString(float) in the handling of special values (signed zeros, infinities, and NaN) and the generation of a decimal string that will convert back to the argument value.
- Parameters:
f16 - the Float16 to be converted.
- Returns:
- a string representation of the argument.
-
toHexString
public static String toHexString(Float16 f16)
Returns a hexadecimal string representation of the Float16 argument. The behavior of this method is analogous to Float.toHexString(float) except that an exponent value of "p-14" is used for subnormal Float16 values.
- API Note:
- This method corresponds to the convertToHexCharacter operation defined in IEEE 754.
- Parameters:
f16 - the Float16 to be converted.
- Returns:
- a hex string representation of the argument.
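To illustrate the "p-14" rule for subnormals, here is a hedged sketch in plain Java that formats a double assumed to hold an exactly representable binary16 value. It rescales a subnormal binary16 value into the subnormal range of double and then rewrites the exponent text, by analogy with how Double.toHexString prints its own subnormals with "p-1022". The class name is hypothetical, and whether Float16.toHexString produces exactly these strings is an assumption.

```java
/** Hypothetical formatter; the input is assumed to be exactly representable in binary16. */
final class Binary16Hex {
    static String toHexString(double v) {
        boolean subnormal = v != 0.0 && Math.abs(v) < Math.scalb(1.0, -14);
        if (!subnormal) {
            // Zero, normal, infinite, and NaN values print the same way as the double.
            return Double.toHexString(v);
        }
        // Shift the exponent by (-1022) - (-14) so the value becomes a subnormal double,
        // which Double.toHexString prints as 0x0.<digits>p-1022.
        String s = Double.toHexString(Math.scalb(v, Double.MIN_EXPONENT + 14));
        return s.replaceFirst("p-1022$", "p-14");
    }

    public static void main(String[] args) {
        System.out.println(toHexString(1.0));                    // 0x1.0p0
        System.out.println(toHexString(Math.scalb(1.0, -14)));   // 0x1.0p-14
        System.out.println(toHexString(Math.scalb(1.0, -24)));   // 0x0.004p-14
    }
}
```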
-
valueOf
public static Float16 valueOf(int value)
Returns the value of an int converted to Float16.
- API Note:
- This method corresponds to the convertFromInt operation defined in IEEE 754.
- Parameters:
value - an int value.
- Returns:
- the value of an int converted to Float16
-
valueOf
public static Float16 valueOf(long value)
Returns the value of a long converted to Float16.
- API Note:
- This method corresponds to the convertFromInt operation defined in IEEE 754.
- Parameters:
value - a long value.
- Returns:
- the value of a long converted to Float16
-
valueOf
public static Float16 valueOf(float f)
Returns a Float16 value rounded from the float argument using the round to nearest rounding policy.
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Parameters:
f - a float
- Returns:
- a Float16 value rounded from the float argument using the round to nearest rounding policy
-
valueOf
public static Float16 valueOf(double d)
Returns a Float16 value rounded from the double argument using the round to nearest rounding policy.
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Parameters:
d - a double
- Returns:
- a Float16 value rounded from the double argument using the round to nearest rounding policy
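The effect of rounding a double to the nearest binary16 value can be modeled without the Float16 API. The sketch below is a plain-Java model, not the library implementation: it snaps a double to the binary16 grid using round-half-to-even and reports overflow as infinity. The class and method names are hypothetical.

```java
/** Hypothetical model of round-to-nearest conversion from double to binary16. */
final class Binary16Rounding {
    /** Returns the binary16 value nearest to x (ties to even), widened back to double. */
    static double roundToBinary16(double x) {
        // Below 2^-14 the grid spacing stays at the subnormal spacing 2^-24.
        int e = Math.max(Math.getExponent(x), -14);
        // Spacing of binary16 values in the binade of x: 10 fraction bits below the exponent.
        double quantum = Math.scalb(1.0, e - 10);
        // Scaling by a power of two is exact; Math.rint rounds ties to the even integer.
        double r = Math.rint(x / quantum) * quantum;
        // Anything that rounds past the largest finite value becomes an infinity.
        return Math.abs(r) > 65504.0 ? Math.copySign(Double.POSITIVE_INFINITY, r) : r;
    }

    public static void main(String[] args) {
        System.out.println(roundToBinary16(0.1));     // 0.0999755859375
        System.out.println(roundToBinary16(2049.0));  // 2048.0 (tie, rounds to even)
        System.out.println(roundToBinary16(2051.0));  // 2052.0 (tie, rounds to even)
        System.out.println(roundToBinary16(65520.0)); // Infinity
    }
}
```

The tie cases show why 2049 and 2051 move in opposite directions: at that magnitude the representable values are the even integers, and a tie goes to the neighbor whose significand is even.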
-
valueOf
public static Float16 valueOf(String s)
Returns a Float16 holding the floating-point value represented by the argument string. The grammar of strings accepted by this method is the same as that accepted by Double.valueOf(String). The rounding policy is also analogous to the one used by that method: a valid input is regarded as an exact numerical value that is rounded once to the nearest representable Float16 value.
- API Note:
- This method corresponds to the convertFromDecimalCharacter and convertFromHexCharacter operations defined in IEEE 754.
- Parameters:
s - the string to be parsed.
- Returns:
- the Float16 value represented by the string argument.
- Throws:
NullPointerException - if the string is null
NumberFormatException - if the string does not contain a parsable Float16.
-
valueOf
public static Float16 valueOf(BigDecimal v)
Returns a Float16 value rounded from the BigDecimal argument using the round to nearest rounding policy.
- Parameters:
v - a BigDecimal
- Returns:
- a Float16 value rounded from the BigDecimal argument using the round to nearest rounding policy
-
isNaN
public static boolean isNaN(Float16 f16)
Returns true if the specified number is a Not-a-Number (NaN) value, false otherwise.
- API Note:
- This method corresponds to the isNaN operation defined in IEEE 754.
- Parameters:
f16 - the value to be tested.
- Returns:
true if the argument is NaN; false otherwise.
-
isInfinite
public static boolean isInfinite(Float16 f16)
Returns true if the specified number is infinitely large in magnitude, false otherwise.
- API Note:
- This method corresponds to the isInfinite operation defined in IEEE 754.
- Parameters:
f16 - the value to be tested.
- Returns:
true if the argument is positive infinity or negative infinity; false otherwise.
-
isFinite
public static boolean isFinite(Float16 f16)
Returns true if the argument is a finite floating-point value; returns false otherwise (for NaN and infinity arguments).
- API Note:
- This method corresponds to the isFinite operation defined in IEEE 754.
- Parameters:
f16 - the Float16 value to be tested
- Returns:
true if the argument is a finite floating-point value, false otherwise.
-
byteValue
public byte byteValue()
Returns the value of this Float16 as a byte after a narrowing primitive conversion.
- Overrides:
byteValue in class Number
- Returns:
- the value of this Float16 as a byte after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
toString
public String toString()
Returns a string representation of this Float16.
-
shortValue
public short shortValue()
Returns the value of this Float16 as a short after a narrowing primitive conversion.
- Overrides:
shortValue in class Number
- Returns:
- the value of this Float16 as a short after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
intValue
public int intValue()
Returns the value of this Float16 as an int after a narrowing primitive conversion.
- Specified by:
intValue in class Number
- API Note:
- This method corresponds to the convertToIntegerTowardZero operation defined in IEEE 754.
- Returns:
- the value of this Float16 as an int after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
-
longValue
public long longValue()
Returns the value of this Float16 as a long after a narrowing primitive conversion.
- Specified by:
longValue in class Number
- API Note:
- This method corresponds to the convertToIntegerTowardZero operation defined in IEEE 754.
- Returns:
- the value of this Float16 as a long after a narrowing primitive conversion
- See Java Language Specification:
- 5.1.3 Narrowing Primitive Conversion
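The narrowing rules of JLS 5.1.3 have some consequences that are easy to miss for 16-bit values. The sketch below uses ordinary casts on float values that are exactly representable in binary16, on the assumption that the Float16 narrowing methods behave like the corresponding cast applied to the widened value; the class and method names are hypothetical.

```java
/** Hypothetical demo of JLS 5.1.3 narrowing, using float stand-ins for Float16 values. */
final class NarrowingSketch {
    static byte toByte(float f)   { return (byte) f; }   // float -> int -> byte
    static short toShort(float f) { return (short) f; }  // float -> int -> short
    static int toInt(float f)     { return (int) f; }    // rounds toward zero
    static long toLong(float f)   { return (long) f; }   // rounds toward zero

    public static void main(String[] args) {
        System.out.println(toInt(3.75f));                    // 3
        System.out.println(toInt(-3.75f));                   // -3
        System.out.println(toInt(Float.NaN));                // 0
        System.out.println(toLong(Float.POSITIVE_INFINITY) == Long.MAX_VALUE); // true
        // 65504 does not fit in a short: the int result keeps only its low 16 bits.
        System.out.println(toShort(65504.0f));               // -32
        // 300 does not fit in a byte: the int result keeps only its low 8 bits.
        System.out.println(toByte(300.0f));                  // 44
    }
}
```

Note that the largest finite binary16 value, 65504, is outside the range of short, so shortValue() and byteValue() can wrap for large arguments while intValue() and longValue() cannot.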
-
floatValue
public float floatValue()
Returns the value of this Float16 as a float after a widening primitive conversion.
- Specified by:
floatValue in class Number
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Returns:
- the value of this Float16 as a float after a widening primitive conversion
- See Java Language Specification:
- 5.1.2 Widening Primitive Conversion
-
doubleValue
public double doubleValue()
Returns the value of this Float16 as a double after a widening primitive conversion.
- Specified by:
doubleValue in class Number
- API Note:
- This method corresponds to the convertFormat operation defined in IEEE 754.
- Returns:
- the value of this Float16 as a double after a widening primitive conversion
- See Java Language Specification:
- 5.1.2 Widening Primitive Conversion
-
hashCode
public int hashCode()
Returns a hash code for this Float16 object. The general contract of Object.hashCode() is satisfied. All NaN values have the same hash code. Additionally, all distinct numerical values have unique hash codes; in particular, negative zero and positive zero have different hash codes from each other.
-
hashCode
public static int hashCode(Float16 value)
Returns a hash code for a Float16 value; compatible with Float16.hashCode().
- Parameters:
value - the value to hash
- Returns:
- a hash code value for a Float16 value.
-
equals
public boolean equals(Object obj)
Compares this object against the specified object. The result is true if and only if the argument is not null and is a Float16 object that represents a Float16 that has the same value as the Float16 represented by this object.
- Overrides:
equals in class Object
- Parameters:
obj - the reference object with which to compare.
- Returns:
true if this object is the same as the obj argument; false otherwise.
- See Java Language Specification:
- 15.21.1 Numerical Equality Operators == and !=
-
float16ToRawShortBits
public static short float16ToRawShortBits(Float16 f16)
Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout.
- Parameters:
f16 - a Float16 floating-point number.
- Returns:
- the bits that represent the floating-point number.
-
float16ToShortBits
public static short float16ToShortBits(Float16 f16)
Returns a representation of the specified floating-point value according to the IEEE 754 floating-point binary16 bit layout. All NaN values return the same bit pattern as NaN.
- Parameters:
f16 - a Float16 floating-point number.
- Returns:
- the bits that represent the floating-point number.
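The difference between the two bit-extraction methods is how NaN is treated: float16ToShortBits maps every NaN to one bit pattern. The sketch below shows that collapsing step on raw bit patterns in plain Java. The class name is hypothetical, the canonical pattern 0x7E00 is an assumption (this page does not state the bits of Float16.NaN), and the idea that the raw variant leaves NaN patterns untouched is inferred by analogy with Float.floatToRawIntBits.

```java
/** Hypothetical model of NaN canonicalization on binary16 bit patterns. */
final class Binary16NaNBits {
    /** Assumed canonical NaN pattern: exponent all ones, top fraction bit set. */
    static final short CANONICAL_NAN = (short) 0x7E00;

    /** NaN means an all-ones exponent with a nonzero significand, regardless of sign. */
    static boolean isNaNBits(short bits) {
        return (bits & 0x7FFF) > 0x7C00;
    }

    /** Collapses every NaN pattern to one value; all other patterns pass through. */
    static short collapseNaN(short rawBits) {
        return isNaNBits(rawBits) ? CANONICAL_NAN : rawBits;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(collapseNaN((short) 0x7C01) & 0xFFFF)); // 7e00
        System.out.println(Integer.toHexString(collapseNaN((short) 0xFE00) & 0xFFFF)); // 7e00
        System.out.println(Integer.toHexString(collapseNaN((short) 0x7C00) & 0xFFFF)); // 7c00
    }
}
```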
-
shortBitsToFloat16
public static Float16 shortBitsToFloat16(short bits)
Returns the Float16 value corresponding to a given bit representation.
- Parameters:
bits - any short integer.
- Returns:
- the Float16 floating-point value with the same bit pattern.
-
compareTo
public int compareTo(Float16 anotherFloat16)
Compares two Float16 objects numerically. This method imposes a total order on Float16 objects with two differences compared to the incomplete order defined by the Java language numerical comparison operators (<, <=, ==, >=, >) on float and double values.
- A NaN is unordered with respect to other values and unequal to itself under the comparison operators. This method chooses to define Float16.NaN to be equal to itself and greater than all other Float16 values (including Float16.POSITIVE_INFINITY).
- Positive zero and negative zero compare equal numerically, but are distinct and distinguishable values. This method chooses to define positive zero to be greater than negative zero.
- Specified by:
compareTo in interface Comparable<Float16>
- Parameters:
anotherFloat16 - the Float16 to be compared.
- Returns:
- the value 0 if anotherFloat16 is numerically equal to this Float16; a value less than 0 if this Float16 is numerically less than anotherFloat16; and a value greater than 0 if this Float16 is numerically greater than anotherFloat16.
- See Java Language Specification:
- 15.20.1 Numerical Comparison Operators <, <=, >, and >=
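The total order described for compareTo can be modeled on raw binary16 bit patterns. The sketch below is a plain-Java illustration, not the library implementation: it maps each pattern to an int key so that ordinary integer comparison yields the documented order, with NaN equal to itself and above positive infinity, and positive zero above negative zero. The class and method names are hypothetical.

```java
/** Hypothetical model of the compareTo total order on binary16 bit patterns. */
final class Binary16TotalOrder {
    /** Maps a bit pattern to a key whose integer order matches the documented total order. */
    static int key(short bits) {
        int b = bits & 0xFFFF;
        int magnitude = b & 0x7FFF;
        if (magnitude > 0x7C00) {
            return 0x7C01;              // every NaN shares one key, just above +infinity (0x7C00)
        }
        // Negative values sort in reverse magnitude; -0.0 gets key -1, just below +0.0 (key 0).
        return (b & 0x8000) != 0 ? -magnitude - 1 : magnitude;
    }

    static int compareBits(short a, short b) {
        return Integer.compare(key(a), key(b));
    }

    public static void main(String[] args) {
        short nan = (short) 0x7E00, posInf = (short) 0x7C00;
        short posZero = (short) 0x0000, negZero = (short) 0x8000;
        System.out.println(compareBits(nan, nan));            // 0
        System.out.println(compareBits(nan, posInf) > 0);     // true
        System.out.println(compareBits(posZero, negZero) > 0); // true
    }
}
```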
-
compare
public static int compare(Float16 f1, Float16 f2)
Compares the two specified Float16 values.
- Parameters:
f1 - the first Float16 to compare
f2 - the second Float16 to compare
- Returns:
- the value 0 if f1 is numerically equal to f2; a value less than 0 if f1 is numerically less than f2; and a value greater than 0 if f1 is numerically greater than f2.
-
max
public static Float16 max(Float16 a, Float16 b)
Returns the larger of two Float16 values. The handling of signed zeros, NaNs, infinities, and other special cases by this method is analogous to the handling of those cases by the Math.max(double, double) method.
- API Note:
- This method corresponds to the maximum operation defined in IEEE 754.
- Parameters:
a - the first operand
b - the second operand
- Returns:
- the greater of a and b
-
min
public static Float16 min(Float16 a, Float16 b)
Returns the smaller of two Float16 values. The handling of signed zeros, NaNs, infinities, and other special cases by this method is analogous to the handling of those cases by the Math.min(double, double) method.
- API Note:
- This method corresponds to the minimum operation defined in IEEE 754.
- Parameters:
a - the first operand
b - the second operand
- Returns:
- the smaller of a and b
-
add
public static Float16 add(Float16 addend, Float16 augend)
Adds two Float16 values together as per the + operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in + operator for floating-point addition (JLS 15.18.2).
- API Note:
- This method corresponds to the addition operation defined in IEEE 754.
- Parameters:
addend - the first operand
augend - the second operand
- Returns:
- the sum of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.18.2 Additive Operators (+ and -) for Numeric Types
-
subtract
public static Float16 subtract(Float16 minuend, Float16 subtrahend)
Subtracts two Float16 values as per the - operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in - operator for floating-point subtraction (JLS 15.18.2).
- API Note:
- This method corresponds to the subtraction operation defined in IEEE 754.
- Parameters:
minuend - the first operand
subtrahend - the second operand
- Returns:
- the difference of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.18.2 Additive Operators (+ and -) for Numeric Types
-
multiply
public static Float16 multiply(Float16 multiplier, Float16 multiplicand)
Multiplies two Float16 values as per the * operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in * operator for floating-point multiplication (JLS 15.17.1).
- API Note:
- This method corresponds to the multiplication operation defined in IEEE 754.
- Parameters:
multiplier - the first operand
multiplicand - the second operand
- Returns:
- the product of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.17.1 Multiplication Operator *
-
divide
public static Float16 divide(Float16 dividend, Float16 divisor)
Divides two Float16 values as per the / operator semantics using the round to nearest rounding policy. The handling of signed zeros, NaNs, infinities, and other special cases by this method is the same as for the handling of those cases by the built-in / operator for floating-point division (JLS 15.17.2).
- API Note:
- This method corresponds to the division operation defined in IEEE 754.
- Parameters:
dividend - the first operand
divisor - the second operand
- Returns:
- the quotient of the operands
- See Java Language Specification:
- 15.4 Floating-point Expressions
- 15.17.2 Division Operator /
-
sqrt
public static Float16 sqrt(Float16 radicand)
Returns the square root of the operand. The square root is computed using the round to nearest rounding policy. The handling of zeros, NaN, infinities, and negative arguments by this method is analogous to the handling of those cases by Math.sqrt(double).
- API Note:
- This method corresponds to the squareRoot operation defined in IEEE 754.
- Parameters:
radicand - the argument to have its square root taken
- Returns:
- the square root of the operand
-
fma
public static Float16 fma(Float16 a, Float16 b, Float16 c)
Returns the fused multiply add of the three arguments; that is, returns the exact product of the first two arguments summed with the third argument and then rounded once to the nearest Float16. The handling of zeros, NaN, infinities, and other special cases by this method is analogous to the handling of those cases by Math.fma(float, float, float).
- API Note:
- This method corresponds to the fusedMultiplyAdd operation defined in IEEE 754.
- Parameters:
a - a value
b - a value
c - a value
- Returns:
- (a × b + c) computed, as if with unlimited range and precision, and rounded once to the nearest Float16 value
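The practical difference between one rounding and two can be seen with a worked case. The sketch below models binary16 rounding in plain double arithmetic and compares a fused evaluation with a multiply followed by an add. It is an illustration under stated assumptions, not the library implementation: the class and method names are hypothetical, and the double arithmetic is exact only because the chosen operands keep every intermediate value within 53 bits.

```java
/** Hypothetical model contrasting fused and unfused multiply-add in binary16. */
final class Binary16FmaSketch {
    /** Nearest binary16 value to x (ties to even), widened back to double. */
    static double roundToBinary16(double x) {
        int e = Math.max(Math.getExponent(x), -14);
        double quantum = Math.scalb(1.0, e - 10);
        double r = Math.rint(x / quantum) * quantum;
        return Math.abs(r) > 65504.0 ? Math.copySign(Double.POSITIVE_INFINITY, r) : r;
    }

    /** One rounding: exact a*b + c, then round (exact in double for the operands used here). */
    static double fused(double a, double b, double c) {
        return roundToBinary16(a * b + c);
    }

    /** Two roundings: round the product, add, round again. */
    static double unfused(double a, double b, double c) {
        return roundToBinary16(roundToBinary16(a * b) + c);
    }

    public static void main(String[] args) {
        double a = 1.0 + Math.scalb(1.0, -10);      // 1 + 2^-10, representable in binary16
        double c = -(1.0 + Math.scalb(1.0, -9));    // -(1 + 2^-9), representable in binary16
        // Exact product is 1 + 2^-9 + 2^-20; rounding it first discards the 2^-20 term.
        System.out.println(fused(a, a, c));   // 9.5367431640625E-7, which is 2^-20
        System.out.println(unfused(a, a, c)); // 0.0
    }
}
```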
-
negate
public static Float16 negate(Float16 f16)
Returns the negation of the argument. Special cases:
- If the argument is zero, the result is a zero with the opposite sign as the argument.
- If the argument is infinite, the result is an infinity with the opposite sign as the argument.
- If the argument is a NaN, the result is a NaN.
- API Note:
- This method corresponds to the negate operation defined in IEEE 754.
- Parameters:
f16 - the value to be negated
- Returns:
- the negation of the argument
- See Java Language Specification:
- 15.15.4 Unary Minus Operator -
-
-
abs
public static Float16 abs(Float16 f16)
Returns the absolute value of the argument. The handling of zeros, NaN, and infinities by this method is analogous to the handling of those cases by Math.abs(float).
- Parameters:
f16 - the argument whose absolute value is to be determined
- Returns:
- the absolute value of the argument
-
getExponent
public static int getExponent(Float16 f16)
Returns the unbiased exponent used in the representation of a Float16.
- If the argument is NaN or infinite, then the result is MAX_EXPONENT + 1.
- If the argument is zero or subnormal, then the result is MIN_EXPONENT - 1.
- API Note:
- This method is analogous to the logB operation defined in IEEE 754, but returns a different value on subnormal arguments.
- Parameters:
f16 - a Float16 value
- Returns:
- the unbiased exponent of the argument
-
ulp
public static Float16 ulp(Float16 f16)
Returns the size of an ulp of the argument. An ulp, unit in the last place, of a Float16 value is the positive distance between this floating-point value and the Float16 value next larger in magnitude. Note that for non-NaN x, ulp(-x) == ulp(x).
Special Cases:
- If the argument is NaN, then the result is NaN.
- If the argument is positive or negative infinity, then the result is positive infinity.
- If the argument is positive or negative zero, then the result is Float16.MIN_VALUE.
- If the argument is ±Float16.MAX_VALUE, then the result is equal to 2^5, 32.0.
- Parameters:
f16 - the floating-point value whose ulp is to be returned
- Returns:
- the size of an ulp of the argument
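The special cases above all follow from one rule: within a binade the spacing of binary16 values is 2^(exponent - 10), and below the normal range the spacing stays at 2^-24. The sketch below applies that rule in plain Java to a double that is assumed to hold an exactly representable binary16 value; the class and method names are hypothetical.

```java
/** Hypothetical model of ulp for values assumed to be exactly representable in binary16. */
final class Binary16Ulp {
    static double ulp(double x) {
        if (Double.isNaN(x)) {
            return Double.NaN;
        }
        if (Double.isInfinite(x)) {
            return Double.POSITIVE_INFINITY;
        }
        // Zero and subnormal inputs use the minimum normal exponent, -14.
        int e = Math.max(Math.getExponent(x), -14);
        // Ten fraction bits sit below the leading bit, so one ulp is 2^(e - 10).
        return Math.scalb(1.0, e - 10);
    }

    public static void main(String[] args) {
        System.out.println(ulp(65504.0));                         // 32.0
        System.out.println(ulp(-65504.0));                        // 32.0
        System.out.println(ulp(1.0));                             // 9.765625E-4
        System.out.println(ulp(0.0) == Math.scalb(1.0, -24));     // true
    }
}
```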
-
nextUp
public static Float16 nextUp(Float16 v)
Returns the floating-point value adjacent to v in the direction of positive infinity.
Special Cases:
- If the argument is NaN, the result is NaN.
- If the argument is positive infinity, the result is positive infinity.
- If the argument is zero, the result is MIN_VALUE
- API Note:
- This method corresponds to the nextUp operation defined in IEEE 754.
- Parameters:
v - starting floating-point value
- Returns:
- The adjacent floating-point value closer to positive infinity.
-
nextDown
public static Float16 nextDown(Float16 v)
Returns the floating-point value adjacent to v in the direction of negative infinity.
Special Cases:
- If the argument is NaN, the result is NaN.
- If the argument is negative infinity, the result is negative infinity.
- If the argument is zero, the result is -MIN_VALUE
- API Note:
- This method corresponds to the nextDown operation defined in IEEE 754.
- Parameters:
v - starting floating-point value
- Returns:
- The adjacent floating-point value closer to negative infinity.
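Because adjacent binary16 values of the same sign have adjacent bit patterns, nextUp and nextDown can be modeled as integer steps on the raw bits. The following plain-Java sketch shows that idea together with the special cases listed for both methods; it is an illustration, not the library implementation, and the class and method names are hypothetical.

```java
/** Hypothetical model of nextUp and nextDown on binary16 bit patterns. */
final class Binary16Neighbors {
    static short nextUp(short bits) {
        int b = bits & 0xFFFF;
        int magnitude = b & 0x7FFF;
        if (magnitude > 0x7C00 || b == 0x7C00) {
            return bits;                 // NaN stays NaN, +infinity stays +infinity
        }
        if (magnitude == 0) {
            return (short) 0x0001;       // either zero steps up to MIN_VALUE
        }
        // Positive values grow by incrementing the pattern; negative values move
        // toward zero by decrementing it (0x8001 becomes 0x8000, negative zero).
        return (short) ((b & 0x8000) == 0 ? b + 1 : b - 1);
    }

    /** nextDown(v) is the negation of nextUp(-v): flip the sign bit before and after. */
    static short nextDown(short bits) {
        return (short) (nextUp((short) (bits ^ 0x8000)) ^ 0x8000);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(nextUp((short) 0x7BFF) & 0xFFFF));   // 7c00 (MAX_VALUE -> +infinity)
        System.out.println(Integer.toHexString(nextDown((short) 0x0000) & 0xFFFF)); // 8001 (zero -> -MIN_VALUE)
        System.out.println(Integer.toHexString(nextDown((short) 0x7C00) & 0xFFFF)); // 7bff (+infinity -> MAX_VALUE)
    }
}
```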
-
scalb
public static Float16 scalb(Float16 v, int scaleFactor)
Returns v × 2^scaleFactor rounded as if performed by a single correctly rounded floating-point multiply. If the exponent of the result is between MIN_EXPONENT and MAX_EXPONENT, the answer is calculated exactly. If the exponent of the result would be larger than Float16.MAX_EXPONENT, an infinity is returned. Note that if the result is subnormal, precision may be lost; that is, when scalb(x, n) is subnormal, scalb(scalb(x, n), -n) may not equal x. When the result is non-NaN, the result has the same sign as v.
Special cases:
- If the first argument is NaN, NaN is returned.
- If the first argument is infinite, then an infinity of the same sign is returned.
- If the first argument is zero, then a zero of the same sign is returned.
- API Note:
- This method corresponds to the scaleB operation defined in IEEE 754.
- Parameters:
v - number to be scaled by a power of two.
scaleFactor - power of 2 used to scale v
- Returns:
v × 2^scaleFactor
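The note about precision loss in the subnormal range is easiest to see with concrete numbers. The sketch below models scalb in plain double arithmetic: it scales exactly with Math.scalb and then rounds once to the binary16 grid. It is a model under stated assumptions rather than the library implementation, the class and method names are hypothetical, and the first argument is assumed to be exactly representable in binary16.

```java
/** Hypothetical model of scalb for binary16 values held in doubles. */
final class Binary16ScalbSketch {
    /** Nearest binary16 value to x (ties to even), widened back to double. */
    static double roundToBinary16(double x) {
        int e = Math.max(Math.getExponent(x), -14);
        double quantum = Math.scalb(1.0, e - 10);
        double r = Math.rint(x / quantum) * quantum;
        return Math.abs(r) > 65504.0 ? Math.copySign(Double.POSITIVE_INFINITY, r) : r;
    }

    /** v * 2^scaleFactor, computed exactly in double and rounded once to binary16. */
    static double scalb(double v, int scaleFactor) {
        return roundToBinary16(Math.scalb(v, scaleFactor));
    }

    public static void main(String[] args) {
        double x = 1.0 + Math.scalb(1.0, -10);            // 1 + 2^-10
        double tiny = scalb(x, -24);                      // subnormal: the 2^-10 part is rounded away
        System.out.println(tiny == Math.scalb(1.0, -24)); // true
        System.out.println(scalb(tiny, 24));              // 1.0, not the original x
        System.out.println(scalb(3.0, 4));                // 48.0 (exact)
        System.out.println(scalb(1024.0, 6));             // Infinity (exponent 16 exceeds MAX_EXPONENT)
    }
}
```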
-
copySign
public static Float16 copySign(Float16 magnitude, Float16 sign)
Returns the first floating-point argument with the sign of the second floating-point argument. This method does not require NaN sign arguments to be treated as positive values; implementations are permitted to treat some NaN arguments as positive and other NaN arguments as negative to allow greater performance.
- API Note:
- This method corresponds to the copySign operation defined in IEEE 754.
- Parameters:
magnitude - the parameter providing the magnitude of the result
sign - the parameter providing the sign of the result
- Returns:
- a value with the magnitude of magnitude and the sign of sign.
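On the bit level, copySign only touches the sign bit, as do negate and abs. The sketch below shows the three operations on raw binary16 bit patterns in plain Java. The class and method names are hypothetical. The sketch takes the sign bit of a NaN sign argument as is, which is one of the behaviors the description above permits and not necessarily what a given implementation does.

```java
/** Hypothetical model of sign-bit operations on binary16 bit patterns. */
final class Binary16Sign {
    static final int SIGN_BIT = 0x8000;

    /** Flips the sign bit; zeros, infinities, and NaNs keep their other bits. */
    static short negate(short bits) {
        return (short) (bits ^ SIGN_BIT);
    }

    /** Clears the sign bit. */
    static short abs(short bits) {
        return (short) (bits & ~SIGN_BIT);
    }

    /** Magnitude bits from the first argument, sign bit from the second. */
    static short copySign(short magnitude, short sign) {
        return (short) ((magnitude & ~SIGN_BIT) | (sign & SIGN_BIT));
    }

    public static void main(String[] args) {
        short one = (short) 0x3C00, negZero = (short) 0x8000;
        System.out.println(Integer.toHexString(copySign(one, negZero) & 0xFFFF)); // bc00, that is -1.0
        System.out.println(Integer.toHexString(negate((short) 0x0000) & 0xFFFF)); // 8000, negative zero
        System.out.println(Integer.toHexString(abs((short) 0xFC00) & 0xFFFF));    // 7c00, +infinity
    }
}
```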
-
signum
public static Float16 signum(Float16 f)
Returns the signum function of the argument; zero if the argument is zero, 1.0 if the argument is greater than zero, -1.0 if the argument is less than zero.
Special Cases:
- If the argument is NaN, then the result is NaN.
- If the argument is positive zero or negative zero, then the result is the same as the argument.
- Parameters:
f - the floating-point value whose signum is to be returned
- Returns:
- the signum function of the argument
-