1

In the strict mode, the predefined operations
of a floating point type shall satisfy the accuracy requirements specified
here and shall avoid or signal overflow in the situations described.
This behavior is presented in terms of a model of floating point arithmetic
that builds on the concept of the canonical form (see A.5.3).

2

Associated with each floating point type is an
infinite set of model numbers. The model numbers of a type are used to
define the accuracy requirements that have to be satisfied by certain
predefined operations of the type; through certain attributes of the
model numbers, they are also used to explain the meaning of a user-declared
floating point type. The model numbers of a derived type
are those of the parent type; the model numbers of a subtype are those
of its type.

3

The *model numbers* of
a floating point type T are zero and all the values expressible in the
canonical form (for the type T), in which *mantissa* has T'Model_Mantissa
digits and *exponent* has a value greater than or equal to T'Model_Emin.
(These attributes are defined in G.2.2.)
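The definition above can be checked mechanically with exact rational arithmetic. The following is a minimal sketch in Python, not Ada, and not part of the standard. It assumes the canonical form is ±mantissa · radix^exponent with 1/radix ≤ mantissa < 1; the function names and the parameters `radix`, `mantissa_digits`, and `emin` are hypothetical stand-ins for T'Machine_Radix, T'Model_Mantissa, and T'Model_Emin.

```python
from fractions import Fraction


def canonical_exponent(v, radix):
    """Return e such that radix**(e-1) <= |v| < radix**e, for nonzero v.

    This is the exponent of the canonical form, assuming the mantissa is
    a fraction in the range 1/radix <= mantissa < 1.
    """
    a = abs(Fraction(v))
    e = 0
    while a >= 1:
        a /= radix
        e += 1
    while a < Fraction(1, radix):
        a *= radix
        e -= 1
    return e


def is_model_number(v, radix, mantissa_digits, emin):
    """True if v is zero, or has a canonical form whose mantissa fits in
    mantissa_digits radix-digits and whose exponent is >= emin.

    The parameters are hypothetical stand-ins for T'Machine_Radix,
    T'Model_Mantissa and T'Model_Emin.
    """
    v = Fraction(v)
    if v == 0:
        return True
    e = canonical_exponent(v, radix)
    if e < emin:
        return False
    # The mantissa has mantissa_digits digits exactly when
    # |v| * radix**(mantissa_digits - e) is an integer.
    scaled = abs(v) * Fraction(radix) ** (mantissa_digits - e)
    return scaled.denominator == 1
```

With a toy type of radix 2, 4 mantissa digits, and a minimum exponent of -3, `is_model_number(Fraction(3, 4), 2, 4, -3)` holds, while `Fraction(17, 16)` needs five mantissa digits and `Fraction(1, 32)` has an exponent below the minimum, so neither is a model number. There is no upper bound on the exponent, which is why the set is infinite.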

4

A *model interval* of
a floating point type is any interval whose bounds are model numbers
of the type. The *model interval* of a type
T *associated with a value* *v* is the smallest model interval
of T that includes *v*. (The model interval associated with a model
number of a type consists of that number only.)
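As an illustration only, the model interval associated with a value can be computed by rounding the value down and up to the nearest model numbers. The sketch below is in Python rather than Ada and uses exact rationals; `model_interval` and its parameters are hypothetical names, and the canonical form is assumed to have a mantissa in the range 1/radix ≤ mantissa < 1.

```python
from fractions import Fraction
import math


def model_interval(v, radix, mantissa_digits, emin):
    """Return (lower, upper): the smallest interval with model-number
    bounds that contains v.

    radix, mantissa_digits and emin are hypothetical stand-ins for
    T'Machine_Radix, T'Model_Mantissa and T'Model_Emin.
    """
    v = Fraction(v)
    if v == 0:
        return (Fraction(0), Fraction(0))
    a = abs(v)
    smallest = Fraction(radix) ** (emin - 1)  # smallest positive model number
    if a < smallest:
        # Below the smallest positive model number the neighbours are
        # zero and that smallest number.
        lo, hi = Fraction(0), smallest
    else:
        # Find e with radix**(e-1) <= a < radix**e.
        e = 0
        while Fraction(radix) ** e <= a:
            e += 1
        while Fraction(radix) ** (e - 1) > a:
            e -= 1
        # Spacing of model numbers that have exponent e.
        step = Fraction(radix) ** (e - mantissa_digits)
        lo = math.floor(a / step) * step
        hi = math.ceil(a / step) * step
    return (lo, hi) if v > 0 else (-hi, -lo)
```

For a toy type with radix 2, 4 mantissa digits, and a minimum exponent of -3, the value 17/16 is not a model number and its model interval runs from 1 to 9/8, whereas 3/4 is a model number and its interval is the single point 3/4.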

5

The accuracy requirements for the evaluation of
certain predefined operations of floating point types are as follows.

6

An *operand interval*
is the model interval, of the type specified for the operand of an operation,
associated with the value of the operand.

7

For any predefined
arithmetic operation that yields a result of a floating point type T,
the required bounds on the result are given by a model interval of T
(called the *result interval*) defined in terms of the operand values
as follows:

8

- The result interval is the smallest model interval of T that includes the minimum and the maximum of all the values obtained by applying the (exact) mathematical operation to values arbitrarily selected from the respective operand intervals.
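The rule above can be sketched for addition, subtraction, and multiplication, where the extreme values over the operand intervals are always reached at the interval endpoints. This is an illustrative Python model, not Ada; the constants `RADIX`, `MANTISSA`, and `EMIN` describe a hypothetical toy type, and `model_interval` and `result_interval` are hypothetical helper names. Division is deliberately left out because an operand interval containing zero needs separate treatment.

```python
from fractions import Fraction
from itertools import product
import math
import operator

# Hypothetical toy type: radix 2, 4 mantissa digits, minimum exponent -3.
RADIX, MANTISSA, EMIN = 2, 4, -3


def model_interval(v):
    """Smallest interval with model-number bounds that contains v."""
    v = Fraction(v)
    if v == 0:
        return (Fraction(0), Fraction(0))
    a = abs(v)
    smallest = Fraction(RADIX) ** (EMIN - 1)
    if a < smallest:
        lo, hi = Fraction(0), smallest
    else:
        e = 0
        while Fraction(RADIX) ** e <= a:
            e += 1
        while Fraction(RADIX) ** (e - 1) > a:
            e -= 1
        step = Fraction(RADIX) ** (e - MANTISSA)
        lo = math.floor(a / step) * step
        hi = math.ceil(a / step) * step
    return (lo, hi) if v > 0 else (-hi, -lo)


def result_interval(op, x, y):
    """Result interval for op in {add, sub, mul} applied to values x, y.

    For these three operations the minimum and maximum over the operand
    intervals occur at endpoint combinations, so it is enough to
    evaluate the exact operation at the four corners.
    """
    corners = [op(a, b)
               for a, b in product(model_interval(x), model_interval(y))]
    lo, hi = min(corners), max(corners)
    # Widen outward to the nearest model numbers.
    return (model_interval(lo)[0], model_interval(hi)[1])
```

For example, `result_interval(operator.add, Fraction(1), Fraction(9, 8))` gives the interval from 2 to 9/4: both operands are model numbers, but the exact sum 17/8 is not, so any value in that interval would satisfy the accuracy requirement for this toy type.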

9

The result interval of an exponentiation is obtained
by applying the above rule to the sequence of multiplications defined
by the exponent, assuming arbitrary association of the factors, and to
the final division in the case of a negative exponent.

10

The result interval of a conversion of a numeric
value to a floating point type T is the model interval of T associated
with the operand value, except when the source expression is of a fixed
point type with a *small* that is not a power of T'Machine_Radix
or is a fixed point multiplication or division either of whose operands
has a *small* that is not a power of T'Machine_Radix; in these cases,
the result interval is implementation defined.

11

For any
of the foregoing operations, the implementation shall deliver a value
that belongs to the result interval when both bounds of the result interval
are in the safe range of the result type T, as determined by the values
of T'Safe_First and T'Safe_Last; otherwise,

12

- if T'Machine_Overflows is True, the implementation shall either deliver a value that belongs to the result interval or raise Constraint_Error;

13

- if T'Machine_Overflows is False, the result is implementation defined.
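The three cases above amount to a small decision procedure. The following Python sketch restates them for illustration; the enumeration `Requirement` and the function `requirement` are hypothetical names, and the arguments stand in for the result interval bounds, T'Safe_First, T'Safe_Last, and T'Machine_Overflows.

```python
from enum import Enum


class Requirement(Enum):
    """What the implementation is required to do (hypothetical labels)."""
    IN_RESULT_INTERVAL = 1
    IN_RESULT_INTERVAL_OR_CONSTRAINT_ERROR = 2
    IMPLEMENTATION_DEFINED = 3


def requirement(lo, hi, safe_first, safe_last, machine_overflows):
    """Classify an operation whose result interval is [lo, hi].

    safe_first, safe_last and machine_overflows stand in for
    T'Safe_First, T'Safe_Last and T'Machine_Overflows.
    """
    # Both bounds of the result interval lie in the safe range.
    if safe_first <= lo and hi <= safe_last:
        return Requirement.IN_RESULT_INTERVAL
    # At least one bound is outside the safe range.
    if machine_overflows:
        return Requirement.IN_RESULT_INTERVAL_OR_CONSTRAINT_ERROR
    return Requirement.IMPLEMENTATION_DEFINED
```

With a hypothetical safe range of -100.0 to 100.0, a result interval of 2.0 to 2.25 must be honored exactly, while an interval of 96.0 to 104.0 may either be honored or lead to Constraint_Error when overflows are detected, and is implementation defined otherwise.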

14

For any predefined relation on operands of a
floating point type T, the implementation may deliver any value (i.e.,
either True or False) obtained by applying the (exact) mathematical comparison
to values arbitrarily chosen from the respective operand intervals.
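One consequence is that a comparison of operands that are not model numbers may legitimately yield either result. The Python sketch below, offered only as an illustration, computes the set of permitted outcomes for "<" and "=" from two operand intervals given as (lower, upper) pairs; the function names are hypothetical.

```python
from fractions import Fraction


def permitted_less_than(x, y):
    """Set of results permitted for X < Y, given operand intervals
    x = (x_lo, x_hi) and y = (y_lo, y_hi)."""
    (x_lo, x_hi), (y_lo, y_hi) = x, y
    outcomes = set()
    if x_lo < y_hi:    # some a in x and b in y satisfy a < b
        outcomes.add(True)
    if x_hi >= y_lo:   # some a in x and b in y satisfy a >= b
        outcomes.add(False)
    return outcomes


def permitted_equal(x, y):
    """Set of results permitted for X = Y, given operand intervals."""
    (x_lo, x_hi), (y_lo, y_hi) = x, y
    outcomes = set()
    if x_lo <= y_hi and y_lo <= x_hi:       # the intervals share a point
        outcomes.add(True)
    if not (x_lo == x_hi == y_lo == y_hi):  # some pair of values differs
        outcomes.add(False)
    return outcomes
```

For instance, `permitted_less_than((Fraction(1), Fraction(9, 8)), (Fraction(9, 8), Fraction(5, 4)))` returns both True and False because the operand intervals touch at 9/8, whereas two distinct model numbers such as `(1, 1)` and `(2, 2)` permit only True.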

15

The result of a membership test is defined in
terms of comparisons of the operand value with the lower and upper bounds
of the given range or type mark (the usual rules apply to these comparisons).

16

If the underlying floating point hardware implements
division as multiplication by a reciprocal, the result interval for division
(and exponentiation by a negative exponent) is implementation defined.