We will now turn to the maps and sets packages. We
will start by considering maps which are more exciting than sets and
begin with ordered maps which are a little simpler and then consider
hashed maps.

Remember that a map
is just a means of getting from a value of one type (the key) to another
type (the element). This is not a one-one relationship. Given a key there
is a unique element (if any), but several keys may correspond to the
same element. A simple example is an array. This is a map from the index
type to the component type. Thus if we have

S: String := "animal";

then this provides a map from integers in the range
1 to 6 to some values of the type Character.
Given an integer such as 3 there is a unique
character 'i' but given a character such as
'a' there might be several corresponding integers
(in this case both 1 and 5).

More interesting examples are where the set of used
key values is quite sparse. For example we might have a store where various
spare parts are held. The parts have a five-digit part number and there
are perhaps twenty racks where they are held identified by a letter.
However, only a handful of the five digit numbers are in use so it would
be very wasteful to use an array with the part number as index. What
we want instead is a container which holds just the pairs that matter
such as (34618, 'F'), (27134, 'C') and so on. We can do this using a
map. We usually refer to the pairs of values as nodes of the map.

There are two maps
packages with much in common. One keeps the keys in order and the other
uses a hash function. Here is the specification of the ordered maps package
generally showing just those facilities common to both.

The generic parameters include the ordering relationship
"<" on the keys and equality
for the elements.

It is assumed that the ordering relationship is well
behaved in the sense that if x < y is true
then y < x is false. We say that two keys
x and y are equivalent
if both x < y and y
< x are false. In other words this defines an equivalence class
on keys. The relationship must also be transitive, that is, if x
< y and y < z are both true then
x < z must also be true.

This concept of an equivalence relationship occurs
throughout the various maps and sets. Sometimes, as here, it is defined
in terms of an order but in other cases, as we shall see, it is defined
by an equivalence function.

It is absolutely vital that the equivalence relations
are defined properly and meet the above requirements. It is not possible
for the container packages to check this and if the operations are wrong
then peculiar behaviour is almost inevitable.

For the convenience
of the user the function Equivalent_Keys is
declared explicitly. It is equivalent to

The equality operation on elements is not so demanding.
It must be symmetric so that x = y and y
= x are the same but transitivity is not required (although cases
where it would not automatically be transitive are likely to be rare).
The operation is only used for the function "="
on the containers as a whole.

Note that Find and similar
operations for maps and sets work in terms of the equivalence relationship
rather than equality as was the case with lists and vectors.

Empty_Map:

No_Element:

The types Map and Cursor
and constants Empty_Map and No_Element
are similar to the corresponding entities in the lists and vectors containers.

These are again similar to the corresponding entities
for lists. Note that two maps are said to be equal if they have the same
number of nodes with equivalent keys (as defined by "<")
whose corresponding elements are equal (as defined by "=").

Container:

Position:

New_Item:

Position:

Process:

Container:

Process:

In this case there is a function Key
as well as a function Element. But there is
no procedure Replace_Key since it would not
make sense to change a key without changing the element as well and this
really comes down to deleting the whole node and then inserting a new
one.

The procedures Query_Element
and Update_Element are slightly different
in that the procedure Process also takes the
key as parameter as well as the element to be read or updated. Note again
that the key cannot be changed. Nevertheless the value of the key is
given since it might be useful in deciding how the update should be performed.
Remember that we cannot get uniquely from an element to a key but only
from a key to an element.

This moves the map from the source to the target
after first clearing the target. It does not make copies of the nodes
so that after the operation the source is empty and Length(Source)
is zero.

Container:

Key:

New_Item:

Position:

Inserted:

Container:

Key:

Position:

Inserted:

Container:

Key:

New_Item:

These insert a new node into the map unless a node
with an equivalent key already exists. If it does exist then the first
two return with Inserted set to False
and Position indicating the node whereas the
third raises Constraint_Error (the element
value is not changed). If a node with equivalent key is not found then
a new node is created with the given key, the element value is set to
New_Item when that is given and otherwise
it takes its default value (if any), and Position
is set when given.

Unlike vectors and lists, we do not have to say where
the new node is to be inserted because of course this is an ordered map
and it just goes in the correct place according to the order given by
the generic parameter "<".

Container:

Key:

New_Item:

This is somewhat like the last Insert
except that if an existing node with an equivalent key is found then
it is replaced (rather than raising Constraint_Error).
Note that both the key and the element are updated. This is because equivalent
keys might not be totally equal.

For example the key
part might be a record with part number and year of introduction, thus

Part_Number: Integer;

Year: Integer;

and we might define
the ordering relationship to be used as the generic parameter simply
in terms of the part number

In this situation, the keys could match without the
year component being the same and so it would need to be updated. In
other words with this definition of the ordering relation, two keys are
equivalent provided just the part numbers are the same.

Container:

Key:

New_Item:

In this case, Constraint_Error
is raised if the node does not already exist. On replacement both the
key and the element are updated as for Include.

Perhaps a better example of equivalent keys not being
totally equal is if the key were a string. We might decide that the case
of letter did not need to match in the test for equivalence but nevertheless
we would probably want to update with the string as used in the parameter
of Replace.

If there is a node with an equivalent key then it
is deleted. If there is not then nothing happens.

These delete a node. In the first case if there is
no such equivalent key then Constraint_Error
is raised (by contrast to Exclude which remains
silent in this case). In the second case if the cursor is No_Element
then again Constraint_Error is raised –
there is also a check to ensure that the cursor otherwise does designate
a node in the correct map (remember that cursors designate both an entity
and the container); if this check fails then Program_Error
is raised.

Perhaps it is worth observing that Insert,
Include, Replace,
Exclude and Delete
form a sort of progression from an operation that will insert something,
through operations that might insert, will neither insert nor delete,
might delete, to the final operation that will delete something. Note
also that Include, Replace
and Exclude do not apply to lists and vectors.

These should be self-evident. Unlike the operations
on vectors and lists, Find logically searches
the whole map and not just starting at some point (and since it searches
the whole map there is no point in having Reverse_Find).
(In implementation terms it won't actually search the whole map because
it will be structured in a way that makes this unnecessary – as
a balanced tree perhaps.) Moreover, Find uses
the equivalence relation based on the "<"
parameter so in the example it only has to match the part number and
not the year. The function call Element(My_Map, My_Key)
is equivalent to Element(Find(My_Map, My_Key)).

Container:

Process:

These are also as for other containers.

And at last we have

... --

We have omitted to mention quite a few operations
that have no equivalent in hashed maps – we will come back to these
in a moment.

As an example we can
make a container to hold the information concerning spare parts. We can
use the type Part_Key and the function "<"
as above. We can suppose that the element type is

Shelf: Character

Stock: Integer;

This gives both the shelf letter and the number in
stock.

We can then declare
the container thus

Key_Type => Part_Key,

Element_Type => Stock_Info,

"<" => "<");

The_Store: Store_Maps.Map;

The last parameter could be omitted because the formal
has a <> default.

We can now add items
to our store by calling

The_Store.Insert((34618, 1998), ('F', 25));

The_Store.Insert((27134, 2004), ('C', 45));

...

The_Store.Insert((27134, 2004), ('C', 45));

...

We might now have a procedure which, given a part
number, checks to see if it exists and that the stock is not zero, and
if so returns the shelf letter and year number and decrements the stock
count.

Part:

Year:

C: Cursor;

K: Part_Key;

E: Stock_Info;

C := The_Store.Find((Part, 0));

OK := False;

E := Element(C); K := Key(C);

Year := K.Year; Shelf := E.Shelf;

OK := False;

Replace_Element(C, (Shelf, E.Stock–1));

OK := True;

Note that we had to
put a dummy year number in the call of Find.
We could of course use the new <> notation
for this

C := The_Store.Find((Part, **others** => <>));

The reader can improve this example at leisure –
by using Update_Element for example.

As another example
suppose we wish to check all through the stock looking for parts whose
stock is low, perhaps less than some given parameter. We can use Iterate
for this as follows

--

Put("Low stock of part ");

Put_Line(Key(C).Part_Number);

The_Store.Iterate(Check_It'Access);

Note that this uses a so-called downward closure.
The procedure Check_It has to be declared
locally to Check_Stock in order to access
the parameter Low. (Well you could declare
it outside and copy the parameter Low to a
global variable but that is just the sort of wicked thing one has to
do in lesser languages (such as even Ada 95). It is not task safe for
one thing.)

Another approach is
to use First and Next
and so on thus

C: Cursor := The_Store.First;

--

Put("Low stock of part ");

Put_Line(Key(C).Part_Number);

C := The_Store.Next(C);

We will now consider hashed maps. The trouble with
ordered maps in general is that searching can be slow when the map has
many entries. Techniques such as a binary tree can be used but even so
the search time will increase at least as the logarithm of the number
of entries. A better approach is to use a hash function. This will be
familiar to many readers (especially those who have written compilers).
The general idea is as follows.

We define a function which takes a key and returns
some value in a given range. In the case of the Ada containers it has
to return a value of the modular type Hash_Type
which is declared in the root package Ada.Containers.
We could then convert this value onto a range representing an index into
an array whose size corresponds to the capacity of the map. This index
value is the preferred place to store the entry. If there already is
an entry at this place (because some other key has hashed to the same
value) then a number of approaches are possible. One way is to create
a list of entries with the same index value (often called buckets); another
way is simply to put it in the next available slot. The details don't
matter. But the overall effect is that provided the map is not too full
and the hash function is good then we can find an entry almost immediately
more or less irrespective of the size of the map.

So as users all we have to do is to define a suitable
hash function. It should give a good spread of values across the range
of Hash_Type for the population of keys, it
should avoid clustering and above all for a given key it must *always*
return the same hash value. A good discussion on hash functions by Knuth
will be found in [10].

Defining good hash
functions needs care. In the case of the part numbers we might multiply
the part number by some obscure prime number and then truncate the result
down to the modular type Hash_Type. The author
hesitates to give an example but perhaps

M31:

On reflection that's probably a very bad prime to
use because it is so close to half of 2**32
a typical value of Hash_Type'Last+1. Of course
it doesn't have to be prime but simply relatively prime to it such as
5**13. Knuth suggests dividing the range by
the golden number τ = (√5+1)/2 = 1.618... and then taking
the nearest number relatively prime which is in fact simply the nearest
odd number (in this case it is 2654435769).

Here is a historic interlude. Marin Mersenne (1588-1648)
was a Franciscan monk who lived in Paris. He studied numbers of the form
*M*_{p}
= 2^{p}
– 1 where *p* is prime. A lot of these are themselves prime.
Mersenne gave a list of those up to 257 which he said were prime (namely
2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257). It was not until 1947 that
it was finally settled that he got some of them wrong (61, 89, and 107
are also prime but 67 and 257 are not). At the time of writing there
are 44 known Mersenne primes and the largest which is also the largest
known prime number is *M*_{32582657}
– see www.mersenne.org.

The differences from the ordered maps package are
that there is an extra generic parameter Hash
giving the hash function and the ordering parameter "<"
has been replaced by the function Equivalent_Keys.
It is this function that defines the equivalence relationship for hashed
maps; it is important that Equivalent_Keys(X, Y)
is always the same as Equivalent_Keys(Y, X).
Moreover if X and Y
are equivalent and Y and Z
are equivalent then X and Z
must also be equivalent.

Note that the function Equivalent_Keys
in the ordered maps package discussed above corresponds to the formal
generic parameter of the same name in this hashed maps package. This
should make it easier to convert between the two forms of packages.

Returning to our example,
if we now write

then we can instantiate
the hashed maps package as follows

Key_Type => Part_Key,

Element_Type => Stock_Info,

Hash => Part_Hash,

Equivalent_Keys => Equivalent_Parts);

The_Store: Store_Maps.Map;

and then the rest of our example will be exactly
as before. It is thus easy to convert from an ordered map to a hashed
map and vice versa provided of course that we only use the facilities
common to both.

We will finish this discussion of maps by briefly
considering the additional facilities in the two packages.

The ordered maps package
has the following additional subprograms

Container:

Process:

These are again largely self-evident. The functions
Floor and Ceiling
are interesting. Floor searches for the last
node whose key is not greater than Key and
similarly Ceiling searches for the first node
whose key is not less than Key – they
return No_Element if there is no such element.
The subprograms Previous are of course the
opposite of Next and Reverse_Iterate
is like Iterate only backwards.

The functions "<"
and ">" are mostly for convenience.
Thus the first is equivalent to

Clearly these additional operations must be avoided
if we wish to retain the option of converting to a hashed map later.

Hashed maps have a
very important facility not in ordered maps which is the ability to specify
a capacity as for the vectors package. (Underneath their skin the hashed
maps are a bit like vectors whereas the ordered maps are a bit like lists.)
Thus we have

The behaviour is much as for vectors. We don't have
to set the capacity ourselves since it will be automatically extended
as necessary but it might significantly improve performance to do so.
In the case of maps, increasing the capacity requires the hashing to
be redone which could be quite time consuming, so if we know that our
map is going to be a big one, it is a good idea to set an appropriate
capacity right from the beginning. Note again that Length(M)
cannot exceed Capacity(M) but might be much
less.

The other additional
subprograms for hashed maps are

These (like the additional
"<" and ">"
for ordered maps) are again mostly for convenience. The first is equivalent
to

Before moving on to
sets it should be noticed that there are also some useful functions in
the string packages. The main one is

There is a similar function Ada.Strings.Unbounded.Hash
where the parameter Key has type Unbounded_String.
It simply converts the parameter to the type String
and then calls Ada.Strings.Hash. There is
also a generic function for bounded strings which again calls the basic
function Ada.Strings.Hash. For completeness
the function Ada.Strings.Fixed.Hash is a renaming
of Ada.Strings.Hash.

These are provided because it is often the case that
the key is a string and they save the user from devising good hash functions
for strings which might cause a nasty headache.

We could for example
save ourselves the worry of defining a good hash function in the above
example by making the part number into a 5-character string. So we might
write

and if this doesn't work well then we can blame the
vendor.

© 2005, 2006, 2007 John Barnes Informatics.

Sponsored in part by:

The Ada Resource Association and its member companies: |
and Ada-Europe: |