2.1. Data Types and Structured Data Types
Let
me begin the course by giving definitions for the terms data type and structured data type.
A data type consists of
- a domain (= a set of values)
- a set of operations.
Example 1: boolean or logical data type provided by
most programming languages.
- two values: true, false.
- many operations, including: AND, OR, NOT, etc.
Example 2: As a second example, consider the datatype fraction. How can we specify the domain and operations that define
fractions? It seems straightforward to name the operations; fractions are
numbers so all the normal arithmetic operations apply, such as addition,
multiplication, comparison. In addition there might be some fraction-specific
operations such as normalizing a fraction by removing common terms from its numerator and
denominator - for example, if we normalized 6/9 we'd get 2/3.
But how do we specify
the domain for fractions, i.e. the set of possible values for a fraction?
2.2. Structural and Behavioural Definitions
There
are two different approaches to specifying a domain: we
can give a structural definition, or we can give a behavioural definition. Let us see what these
two are like.
Structural Definition of the domain for `Fraction'
The value of of a
fraction is made of three parts (or components):
- a sign, which is either + or -
- a numerator, which may be any non-negative integer
- a denominator, which may be any positive integer (not zero, not
negative).
This
is called a structural definition because it defines the values of type
`fraction' by imposing an internal structure on them (they have 3 parts...).
The parts themselves have specific types, and there may be further constraints.
For example, we could have insisted that a fraction's numerator and denominator
have no common divisor (in that case we wouldn't need the normalize operation -
6/9 would not be a fraction by this definition).
Behavioural Definition of the domain for `Fraction'
The alternative approach
to defining the set of values for fractions does not impose any internal
structure on them. Instead it just adds an operation that creates fractions out
of other things, such as
CREATE_FRACTION(N,D)
where N is any integer, D is any non-zero integer.
The
values of type fraction are defined to be the values that are produced by this
function for any valid combination of inputs.
The parameter names were
chosen to suggest its intended behaviour: CREATE_FRACTION(N,D) should return a value
representing the fraction N/D (N for numerator, Dfor denominator).
You are probably
thinking, this is crazy, CREATE_FRACTION could be any old random function, how do we guarantee that CREATE_FRACTION(N,D) actually returns the fraction N/D?
The answer is that we
have to constrain the behaviour of this function, by relating it to the other
operations on fractions. For example, one of the key properties of
multiplication is that:
NORMALIZE
((N/D) * (D/N)) = 1/1
This
turns into a constraint on CREATE_FRACTION:
NORMALIZE (CREATE_FRACTION(N,D)
* CREATE_FRACTION(D,N)) = CREATE_FRACTION(1,1)
So
you see CREATE_FRACTION cannot be any old function, its behaviour is
highly constrained, because we can write down lots and lots of constraints like
this.
And that's the reason I
call this sort of definition behavioural, because the definition is strictly in terms of
a set of operations and constraints or axioms relating the behaviour of the
operations to one another.
In this style of
definition, the domain of a data type - the set of permissible values - plays
an almost negligible role. Any set of values will do, as long as we have an
appropriate set of operations to go along with it.
Which Kind of Definition Should We Use?
In
this course, we will mainly use behavioural definitions. Their main advantage
is that they are more abstract, imposing absolutely no unnecessary or arbitrary constraints on the data type. This makes
them ideal as abstract definitions, which are central to this course.
Behavioural definitions are also the best ones to use if you ever need to
reason abstractly about a data type: we won't stress this in this course, but
in general it is an important reason for using abstract definitions.
We will occasionally use
structural definitions. They are somewhat more intuitive, until you get used to
the behavioural kind, and they are very common in computer science (e.g. many
textbooks use them). Also, because they impose structure on the values they are
quite useful in giving ideas about how to implement a particular data type in a
computer program.
No comments:
Post a Comment