Data Types and Data Items

2


This chapter describes the data types and data structures in Sun FORTRAN 77: Types, Constants, Variables, Arrays, Substrings, Structures, and Pointers.


Types

Except for specifically typeless constants, any constant, constant expression, variable, array, array element, substring, or function usually represents typed data.

On the other hand, data types are not associated with the names of programs or subroutines, block data routines, common blocks, namelist groups, or structured records.

Rules for Data Typing

The name determines the type; that is, the name of a datum or function determines its data type, explicitly or implicitly, according to the following rules of data typing:

Array Elements

An array element has the same type as the array name.

Functions

Each intrinsic function has a specified type. An intrinsic function does not require an explicit type statement, but that is allowed. A generic function does not have a predetermined type; the type is determined by the type of the arguments, as shown in Chapter 6, "Intrinsic Functions."

An external function can have its type specified in any of the following ways:

Example: Explicitly by putting its name in a type statement:

	FUNCTION F ( X ) 
	INTEGER F, X 
	F = X + 1 
	RETURN 
	END

Example: Explicitly in its FUNCTION statement:

	INTEGER FUNCTION F ( X ) 
	INTEGER X 
	F = X + 1 
	RETURN 
	END 

Example: Implicitly by its name, as with variables:

	FUNCTION NXT ( X ) 
	INTEGER X 
	NXT = X + 1 
	RETURN 
	END 

Implicit typing can affect the type of a function, either by default implicit typing or by an IMPLICIT statement. You must make the data type of the function be the same within the function subprogram as it is in the calling program unit. FORTRAN does no type checking between program units.
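Because no type checking is done between program units, the caller has to declare (or imply) the same type that the function uses. The following is a minimal sketch of that agreement; the names DEMO, HALF, and R are hypothetical and chosen only for illustration.

```fortran
C     Sketch: the function type must agree in both program units.
C     DEMO, HALF, and R are illustrative names, not from the manual.
      PROGRAM DEMO
C     Same type as in the function subprogram below.
      REAL HALF, R
      R = HALF ( 5.0 )
      WRITE (*, '(1X, F6.2)') R
      END

      REAL FUNCTION HALF ( X )
      REAL X
      HALF = X / 2.0
      RETURN
      END
```

If the caller declared INTEGER HALF instead, the program would most likely still compile, but the returned bits would be interpreted as the wrong type at run time.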

Properties of Data Types

This section describes the data types, what each is for, the way storage is allocated for each of them, and the alignment of the different types. Storage and alignment are always given in bytes. Values that can fit into a single byte are byte-aligned.

Default data alignment and sizes may be changed by compiling with special options, such as -f, -dalign, -dbl_align_all, -dbl, -r8, -i2, and
-xtypemap. The descriptions in this manual in general assume that these options are not in force.

Default data declarations, those that do not explicitly declare a data size, such as REAL A, INTEGER B, COMPLEX C, LOGICAL D, DOUBLE PRECISION E, have their meanings changed by these options, along with data not explicitly declared. Data alignment is also platform dependent.

Refer to the Fortran User's Guide for details.

BYTE

The BYTE data type provides a data type that uses only one byte of storage. It is a logical data type, and has the synonym, LOGICAL*1.

A variable of type BYTE can hold any of the following: one character, an integer between -128 and 127, or the logical values .TRUE. or .FALSE.

If it is interpreted as a logical value, a value of 0 represents .FALSE., and any other value is interpreted as .TRUE.

f77 allows the BYTE type as an array index, just as it allows the REAL type, but it does not allow BYTE as a DO loop index (where it allows only INTEGER, REAL, and DOUBLE PRECISION). Wherever FORTRAN makes an explicit check for INTEGER, it does not allow BYTE.

Examples:

	BYTE 	Bit3 / 8 /, C1 / 'W' /, 
&	 	Counter / 0 /, Switch / .FALSE. / 

A BYTE item occupies 1 byte of storage, and is aligned on 1-byte boundaries.

CHARACTER

The character data type, CHARACTER, which has the synonym, CHARACTER*1, holds one character.

The character is enclosed in apostrophes (') or quotes ("). Allowing quotes (") is nonstandard; if you compile with the -xl option, quotes mean something else, and you must use apostrophes to enclose a string.

The data of type CHARACTER is always unsigned.

A CHARACTER item occupies 1 byte (8 bits) of storage.

A CHARACTER item is aligned on 1-byte boundaries.

CHARACTER*n

The character string data type, CHARACTER*n, where n > 0, holds a string of n characters.

A CHARACTER*n data type occupies n bytes of storage.

A CHARACTER*n variable is aligned on 1-byte boundaries.

Every character string constant is aligned on 2-byte boundaries. If it does not appear in a DATA statement, it is followed by a null character to ease communication with C routines.

COMPLEX

A complex datum is an approximation of a complex number. The complex data type, COMPLEX, which defaults to a synonym for COMPLEX*8, is a pair of REAL*4 values that represent a complex number. The first element represents the real part and the second represents the imaginary part.

The default size for a COMPLEX item (no size specified) is 8. The default alignment is on 4-byte boundaries. However, these defaults can be changed by compiling with certain special options.

COMPLEX*8

The complex data type COMPLEX*8 is a synonym for COMPLEX, except that it always has a size of 8 bytes, independent of any compiler options.

COMPLEX*16 (Double Complex)

The complex data type COMPLEX*16 is a synonym for DOUBLE COMPLEX, except that it always has a size of 16 bytes, independent of any compiler options.

COMPLEX*32 (Quad Complex)

(SPARC, PowerPC only) The complex data type COMPLEX*32 is a quadruple-precision complex. It is a pair of REAL*16 elements, where each has a sign bit, a 15-bit exponent, and a 112-bit fraction. These REAL*16 elements in f77 conform to the IEEE standard.

The size for COMPLEX*32 is 32 bytes.

COMPLEX*32 is aligned on 4-byte boundaries, except if compiled on a Sun-4 or SPARC computer with the -f option, in which case it is aligned on 8-byte boundaries.

DOUBLE COMPLEX

The complex data type, DOUBLE COMPLEX, which usually has the synonym, COMPLEX*16, is a pair of DOUBLE PRECISION (REAL*8) values that represents a complex number. The first element represents the real part; the second represents the imaginary part.

The default size for DOUBLE COMPLEX with no size specified is 16. COMPLEX*16 is aligned on 4-byte boundaries. However, these defaults can be changed by compiling with certain special options.

DOUBLE PRECISION

A double-precision datum is an approximation of a real number. The double-precision data type, DOUBLE PRECISION, which has the synonym, REAL*8, holds one double-precision datum.

The default size for DOUBLE PRECISION with no size specified is 8. DOUBLE PRECISION is aligned on 4-byte boundaries. However, these defaults can be changed by compiling with certain special options.

A DOUBLE PRECISION element has a sign bit, an 11-bit exponent, and a 52-bit fraction. These DOUBLE PRECISION elements in f77 conform to the IEEE standard for double-precision floating-point data. The layout is shown in Appendix C, "Data Representations."

INTEGER

The integer data type, INTEGER, holds a signed integer.

The default size for INTEGER with no size specified is 4 bytes, and INTEGER is aligned on 4-byte boundaries. However, these defaults can be changed by compiling with certain special options.

INTEGER*2

The short integer data type, INTEGER*2, holds a signed integer. An expression involving only objects of type INTEGER*2 is of that type. Using this feature may have adverse performance implications, and we do not recommend it.

Generic functions return short or long integers depending on the default integer type. If a procedure is compiled with the -i2 flag, all integer constants that fit and all variables of type INTEGER (no explicit size) are of type INTEGER*2. If the precision of an integer-valued intrinsic function is not determined by the generic function rules, one is chosen that returns the prevailing length (INTEGER*2) when the -i2 compilation option is in effect. With -i2, the default length of LOGICAL quantities is 2 bytes.

Ordinary integers follow the FORTRAN rules about occupying the same space as a REAL variable. They are assumed to be equivalent to the C type long int, and 2-byte integers are of C type short int. These short integer and logical quantities do not obey the standard rules for storage association.

An INTEGER*2 occupies 2 bytes.

INTEGER*2 is aligned on 2-byte boundaries.

INTEGER*4

The integer data type, INTEGER*4, holds a signed integer.

An INTEGER*4 occupies 4 bytes.

INTEGER*4 is aligned on 4-byte boundaries.

INTEGER*8

The integer data type, INTEGER*8, holds a signed 64-bit integer.

An INTEGER*8 occupies 8 bytes.

INTEGER*8 is aligned on 8-byte boundaries.

LOGICAL

The logical data type, LOGICAL, holds a logical value .TRUE. or .FALSE. The value 0 represents .FALSE.; any other value represents .TRUE.

The usual default size for a LOGICAL item with no size specified is 4 bytes, and the item is aligned on 4-byte boundaries. However, these defaults can be changed by compiling with certain special options.

LOGICAL*1

The one-byte logical data type, LOGICAL*1, which has the synonym, BYTE, can hold any of the following: one character, an integer between -128 and 127, or the logical values .TRUE. or .FALSE.

The value is as defined for LOGICAL, but it can hold a character or small integer. An example:

	LOGICAL*1 		Bit3 / 8 /, C1 / 'W' /, 
& 			Counter / 0 /, Switch / .FALSE. / 

A LOGICAL*1 item occupies one byte of storage.

LOGICAL*1 is aligned on one-byte boundaries.

LOGICAL*2

The logical data type, LOGICAL*2, holds a logical value .TRUE. or .FALSE. The value is defined as for LOGICAL.

A LOGICAL*2 occupies 2 bytes.

LOGICAL*2 is aligned on 2-byte boundaries.

LOGICAL*4

The logical data type, LOGICAL*4, holds a logical value .TRUE. or .FALSE. The value is defined as for LOGICAL.

A LOGICAL*4 occupies 4 bytes.

LOGICAL*4 is aligned on 4-byte boundaries.

LOGICAL*8

The logical data type, LOGICAL*8, holds the logical value .TRUE. or .FALSE. This data type is allowed only if the -dbl option is set. The value is defined the same way as for the LOGICAL data type.

A LOGICAL*8 occupies 8 bytes.

LOGICAL*8 is aligned on 8-byte boundaries.

REAL

A real datum is an approximation of a real number. The real data type, REAL, which usually has the synonym, REAL*4, holds one real datum.

The usual default size for a REAL item with no size specified is 4 bytes, and the item is aligned on 4-byte boundaries. However, these defaults can be changed by compiling with certain special options.

A REAL element has a sign bit, an 8-bit exponent, and a 23-bit fraction. These REAL elements in f77 conform to the IEEE standard.

REAL*4

The REAL*4 data type is a synonym for REAL, except that it always has a size of 4 bytes, independent of any compiler options.

REAL*8 (Double-Precision Real)

The REAL*8 data type is a synonym for DOUBLE PRECISION, except that it always has a size of 8 bytes, independent of any compiler options.

REAL*16 (Quad Real)

(SPARC, PowerPC only) The REAL*16 data type is a quadruple-precision real. The size for a REAL*16 item is 16 bytes. A REAL*16 element has a sign bit, a 15-bit exponent, and a 112-bit fraction. These REAL*16 elements in f77 conform to the IEEE standard for extended precision.

Size and Alignment Summary

The size and alignment of types depend on various compiler options and platforms. Table 2-1 summarizes the default size and alignment, ignoring other aspects of types and options.

Table 2-1 Default Data Sizes and Alignments

                                               Alignment (Bytes)
Fortran 77 Data Type           Size (Bytes)    SPARC & PowerPC    Intel

BYTE X                         1               1                  1
CHARACTER X                    1               1                  1
CHARACTER*n X                  n               1                  1

COMPLEX X =COMPLEX*8           8               4                  4
COMPLEX*8 X                    8               4                  4
DOUBLE COMPLEX X =COMPLEX*16   16              8                  4
COMPLEX*16 X                   16              8                  4
COMPLEX*32 X "Quad Complex"    32              8                  --

DOUBLE PRECISION X =REAL*8     8               8                  4
REAL X =REAL*4                 4               4                  4
REAL*4 X                       4               4                  4
REAL*8 X                       8               8                  4
REAL*16 X "Quad Real"          16              8                  --

INTEGER X =INTEGER*4           4               4                  4
INTEGER*2 X                    2               2                  2
INTEGER*4 X                    4               4                  4
INTEGER*8 X                    8               8                  4

LOGICAL X =LOGICAL*4           4               4                  4
LOGICAL*1 X =BYTE              1               1                  1
LOGICAL*2 X                    2               2                  2
LOGICAL*4 X                    4               4                  4
LOGICAL*8 X                    8               8                  4

Note the following:

Compiling with options -i2, -r8, or -dbl changes the defaults for certain data declarations that appear without an explicit size:

Table 2-2 Data Defaults Changed by -i2, -r8, -dbl

Default Type        With -i2      With -r8 or -dbl
INTEGER             INTEGER*2     INTEGER*8
LOGICAL             LOGICAL*2     LOGICAL*8
REAL                REAL*4        REAL*8
DOUBLE PRECISION    REAL*8        REAL*16
COMPLEX             COMPLEX*8     COMPLEX*16
DOUBLE COMPLEX      COMPLEX*16    COMPLEX*32

Do not combine -i2 with -r8 as this can produce unexpected results. REAL*16 and COMPLEX*32 are SPARC and PowerPC only.

With -dbl or -r8, INTEGER and LOGICAL are allocated the larger space indicated above. This is done to maintain the FORTRAN requirement that an integer item and a real item have the same amount of storage. However, with -r8, 8 bytes are allocated but only 4-byte arithmetic is done. With -dbl, 8 bytes are allocated and full 8-byte arithmetic is done. In all other ways, -dbl and -r8 produce the same results.

Options -f or -dalign (SPARC only) force alignment of all 8, 16, or 32-byte data onto 8-byte boundaries. Option -dbl_align_all causes all data to be aligned on 8-byte boundaries. Programs that depend on the use of these options may not be portable.

See the Fortran User's Guide for details.


Constants

A constant is a datum whose value cannot change throughout the program unit. The form of the string representing a constant determines the value and data type of the constant.

There are three general kinds of constants: arithmetic, logical, and character.

Blank characters within an arithmetic or logical constant do not affect the value of the constant. Within character constants, they do affect the value.

Here are the different kinds of arithmetic constants:

Typed Constants      Typeless Constants
Complex              Binary
Double complex       Octal
Double precision     Hexadecimal
Integer              Hollerith
Real

A signed constant is an arithmetic constant with a leading plus or minus sign. An unsigned constant is an arithmetic constant without a leading sign.

For integer, real, and double-precision data, zero is neither positive nor negative. The value of a signed zero is the same as that of an unsigned zero.

Compiling with any of the options -i2, -dbl, -r8, or -xtypemap alters the default size of integer, real, complex, and double precision constants. These options are described in Chapter 2, and in the Fortran User's Guide.

Character Constants

A character-string constant is a string of characters enclosed in apostrophes or quotes. The apostrophes are standard; the quotes are not.

If you compile with the -xl option, then the quotes mean something else, and you must use apostrophes to enclose a string.

To include an apostrophe in an apostrophe-delimited string, repeat it. To include a quote in a quote-delimited string, repeat it. Examples:

'abc'			"abc" 
'ain''t'			"in vi type ""h9Y"
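The doubling rule can be checked with the standard LEN intrinsic. This is a small sketch using only apostrophe delimiters, so it does not depend on the nonstandard quote extension; the program name QUOTES is hypothetical.

```fortran
C     Sketch: a doubled apostrophe inside an apostrophe-delimited
C     constant stands for a single apostrophe character.
      PROGRAM QUOTES
C     The constant holds the five characters a, i, n, ', t,
C     so LEN reports 5 even though six characters are typed.
      WRITE (*, '(1X, A, I2)') 'ain''t', LEN ( 'ain''t' )
      END
```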

If a string begins with one kind of delimiter, the other kind can be embedded within it without using the repeated quote or backslash escapes. See Table 2-3.

Example: Character constants:

"abc" 		"abc"
"ain't"		'in vi type "h9Y'

Null Characters

Each character string constant appearing outside a DATA statement is followed by a null character to ease communication with C routines. You can make character string constants consisting of no characters, but only as arguments being passed to a subprogram. Such zero length character string constants are not FORTRAN standard.

Example: Null character string:

demo% cat NulChr.f 
	write(*,*) 'a', '', 'b'
	stop
	end
demo% f77 NulChr.f
NulChr.f:
 MAIN:
demo% a.out
ab
demo%

However, if you put such a null character constant into a character variable, the variable will contain a blank, and have a length of at least 1 byte.

Example: Length of null character string:

demo% cat NulVar.f 
	character*1 x / 'a' /, y / '' /, z / 'c' / 
	write(*,*) x, y, z 
	write(*,*) len( y ) 
	end 
demo% f77 NulVar.f 
NulVar.f: 
 MAIN: 
demo% a.out 
a c 
  1 
demo% 

Escape Sequences

For compatibility with C usage, the following backslash escapes are recognized. If you include the escape sequence in a character string, then you get the indicated character.

Table 2-3 Backslash Escape Sequences

Escape Sequence   Character
\n                Newline
\r                Carriage return
\t                Tab
\b                Backspace
\f                Form feed
\v                Vertical tab
\0                Null
\'                Apostrophe, which does not terminate a string
\"                Quotation mark, which does not terminate a string
\\                Backslash (\)
\x                x, where x is any other character

If you compile with the -xl option, then the backslash character (\) is treated as an ordinary character. That is, with the -xl option, you cannot use these escape sequences to get special characters.

Technically, the escape sequences are not nonstandard, but are implementation-defined.

Complex Constants

A complex constant is an ordered pair of real or integer constants. The constants are separated by a comma, and the pair is enclosed in parentheses. The first constant is the real part, and the second is the imaginary part. A complex constant, COMPLEX*8, uses 8 bytes of storage.

Example: Complex constants:

( 9.01, .603 ) 
( +1.0, -2.0 ) 
( +1.0, -2 ) 
( 1, 2 ) 
( 4.51, )    Invalid --need second part
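As a brief illustration of the ordered-pair form, the sketch below assigns one of the constants above to a COMPLEX variable and extracts the two parts with the standard intrinsics REAL and AIMAG. The names CPLX and C are hypothetical.

```fortran
C     Sketch: the first constant of the pair is the real part and
C     the second is the imaginary part.
      PROGRAM CPLX
      COMPLEX C
      C = ( +1.0, -2.0 )
C     REAL(C) gives the real part; AIMAG(C) gives the imaginary part.
      WRITE (*, '(1X, 2F5.1)') REAL ( C ), AIMAG ( C )
      END
```

The program writes the real part 1.0 followed by the imaginary part -2.0.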

COMPLEX*16 Constants

A double-complex constant, COMPLEX*16, is an ordered pair of real or integer constants, where one of the constants is REAL*8, and the other is INTEGER, REAL*4, or REAL*8.

The constants are separated by a comma, and the pair is enclosed in parentheses. The first constant is the real part, and the second is the imaginary part. A double-complex constant, COMPLEX*16, uses 16 bytes of storage.

Example: Double-complex constants:

( 9.01D6, .603 ) 
( +1.0, -2.0D0 ) 
( 1D0, 2 ) 
( 4.51D6, )      Invalid--need second part 
( +1.0, -2.0 )   Not DOUBLE COMPLEX--need a REAL*8

COMPLEX*32 (Quad Complex) Constants

(SPARC, PowerPC only) A quad complex constant is an ordered pair of real or integer constants, where one of the constants is REAL*16, and the other is INTEGER, REAL*4, REAL*8, or REAL*16.

The constants are separated by a comma, and the pair is enclosed in parentheses. The first constant is the real part, and the second is the imaginary part. A quad complex constant, COMPLEX*32 , uses 32 bytes of storage.

Example: Quad complex constants (SPARC, PowerPC only):

( 9.01Q6, .603 ) 
( +1.0, -2.0Q0 ) 
( 1Q0, 2 ) 
( 3.3Q-4932, 9 ) 
( 1, 1.1Q+4932 ) 
( 4.51Q6, )      Invalid--need second part 
( +1.0, -2.0 )   Not quad complex --need a REAL*16

Integer Constants

An integer constant consists of an optional plus or minus sign, followed by a string of decimal digits.

Restrictions

No other characters are allowed except, of course, a space.

If no sign is present, the constant is assumed to be nonnegative.

The value must be in the range (-2147483648, 2147483647).

Compiling with the -dbl or -r8 option alters the range to (-9223372036854775808, 9223372036854775807).

Example: Integer constants:

-2147483648 
-2147483649    Invalid--too small, error message 
-10 
0 
+199
29002 
2.71828        Not INTEGER--decimal point not allowed 
1E6            Not INTEGER--E not allowed 
29,002         Invalid--comma not allowed, error message 
2147483647 
2147483648     Invalid-- too large, error message 

Alternate Octal Notation

You can also specify integer constants with the following alternate octal notation. Precede an integer string with a double quote (") and compile with the -xl option. These are octal constants of type INTEGER.

Example: The following two statements are equivalent:

	JCOUNT = ICOUNT + "703 
	JCOUNT = ICOUNT + 451
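The equivalence of octal 703 and decimal 451 follows from positional notation in base 8. The sketch below evaluates the three digits explicitly using only standard integer arithmetic, so it does not require the -xl option; the program name OCTAL is hypothetical.

```fortran
C     Sketch: octal 703 evaluated digit by digit in base 8.
      PROGRAM OCTAL
      INTEGER N
C     7*8**2 + 0*8**1 + 3*8**0 = 448 + 0 + 3 = 451
      N = 7*8**2 + 0*8 + 3
      WRITE (*, '(1X, I4)') N
      END
```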

You can also specify typeless constants as binary, octal, hexadecimal, or Hollerith. See "Typeless Constants (Binary, Octal, Hexadecimal)."

Long Integers

Compiling with the -dbl or -r8 option alters the range from (-2147483648, 2147483647) to (-9223372036854775808, 9223372036854775807). The integer constant is stored or passed as an 8-byte integer, data type INTEGER*8.

Short Integers

If a constant argument is in the range (-32768, 32767), it is usually widened to a 4-byte integer, data type INTEGER*4; but compiling with the -i2 option will cause it to be stored or passed as a 2-byte integer, data type INTEGER*2.

Logical Constants

A logical constant is either the logical value true or false. The only logical constants are .TRUE. and .FALSE.; no others are possible. The period delimiters are necessary.

A logical constant takes 4 bytes of storage. If it is an actual argument, it is passed as 4 bytes, unless compiled with the -i2 option, in which case it is passed as 2.

Real Constants

A real constant is an approximation of a real number. It can be positive, negative, or zero. It has a decimal point or an exponent. If no sign is present, the constant is assumed to be nonnegative.

Real constants, REAL*4, use 4 bytes of storage.

Basic Real Constant

A basic real constant consists of an optional plus or minus sign, followed by an integer part, followed by a decimal point, followed by a fractional part.

The integer part and the fractional part are each strings of digits, and you can omit either of these parts, but not both.

Example: Basic real constants:

+82.
-32.
90. 
98.5

Real Exponent

A real exponent consists of the letter E, followed by an optional plus or minus sign, followed by an integer.

Example: Real exponents:

E+12 
E-3 
E6

Real Constant

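A real constant has one of these forms: a basic real constant, a basic real constant followed by a real exponent, or an integer constant followed by a real exponent.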

A real exponent denotes a power of ten. The value of a real constant is the product of that power of ten and the constant that precedes the E.

Example: Real constants:

-32. 
-32.18 
1.6E-9 
7E3 
1.6E12 
$1.0E2.0        Invalid-- $ not allowed, error message 
82              Not REAL--need decimal point or exponent 
29,002.0        Invalid --comma not allowed, error message 
1.6E39          Invalid--too large, machine infinity is used 
1.6E-39         Invalid --too small, some precision is lost 

The restrictions are:

REAL*8 (Double-Precision Real) Constants

A double-precision constant is an approximation of a real number. It can be positive, negative, or zero. If no sign is present, the constant is assumed to be nonnegative. A double-precision constant has a double-precision exponent and an optional decimal point. Double-precision constants, REAL*8, use 8 bytes of storage. The REAL*8 notation is nonstandard.

Double-Precision Exponent

A double-precision exponent consists of the letter D, followed by an optional plus or minus sign, followed by an integer.

A double-precision exponent denotes a power of 10. The value of a double-precision constant is the product of that power of 10 and the constant that precedes the D. The form and interpretation are the same as for a real exponent, except that a D is used instead of an E.

Examples of double-precision constants are:

1.6D-9 
7D3 
$1.0D2.0        Invalid --$ not allowed, error message
82              Not DOUBLE PRECISION--need decimal point or exponent
29,002.0D0      Invalid--comma not allowed, error message
1.8D308         Invalid--too large, machine infinity is used
1.0D-324        Invalid--too small, some precision is lost

The restrictions are:

REAL*16 (Quad Real) Constants

(SPARC, PowerPC only) A quadruple-precision constant is a basic real constant (see the start of the section "Real Constants"), or an integer constant, followed by a quadruple-precision exponent.

A quadruple-precision exponent consists of the letter Q, followed by an optional plus or minus sign, followed by an integer.

A quadruple-precision constant can be positive, negative, or zero. If no sign is present, the constant is assumed to be nonnegative.

Example: Quadruple-precision constants:

1.6Q-9 
7Q3 
3.3Q-4932 
1.1Q+4932 
$1.0Q2.0        Invalid--$ not allowed, error message
82              Not quad--need exponent
29,002.0Q0      Invalid--comma not allowed, error message
1.6Q5000        Invalid--too large, machine infinity is used
1.6Q-5000       Invalid--too small, some precision is lost

The form and interpretation are the same as for a real constant, except that a Q is used instead of an E.

The restrictions are:

Typeless Constants (Binary, Octal, Hexadecimal)

Typeless numeric constants are so named because their expressions assume data types based on how they are used.

These constants are not converted before use. However, in f77, they must be distinguished from character strings.

The general form is to enclose a string of appropriate digits in apostrophes and prefix it with the letter B, O, X, or Z. The B is for binary, the O is for octal, and the X or Z are for hexadecimal.

Example: Binary, octal, and hexadecimal constants, DATA and PARAMETER:

	PARAMETER ( P1 = Z'1F' )
	INTEGER*2 N1, N2, N3, N4
	DATA N1 /B'0011111'/, N2/O'37'/, N3/X'1f'/, N4/Z'1f'/
	WRITE ( *, 1 ) N1, N2, N3, N4, P1
1	FORMAT ( 1X, O4, O4, Z4, Z4, Z4 )
	END

Note the edit descriptors in FORMAT statements: O for octal, and Z for hexadecimal. Each of the above integer constants has the value 31 decimal.

Example: Binary, octal, and hexadecimal, other than in DATA and PARAMETER:

	INTEGER*4  M, ICOUNT/1/, JCOUNT
	REAL*4  TEMP
	M = ICOUNT + B'0001000'
	JCOUNT = ICOUNT + O'777' 
	TEMP = X'FFF99A' 
	WRITE(*,*) M, JCOUNT, TEMP
	END

In the above example, the context defines B'0001000' and O'777' as INTEGER*4 and X'FFF99A' as REAL*4. For a real number, using IEEE floating-point, a given bit pattern yields the same value on different architectures.

The above statements are treated as the following:

	M = ICOUNT + 8
	JCOUNT = ICOUNT + 511 
	TEMP = 2.35076E-38 
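The REAL*4 value can be checked by hand from the IEEE single-precision layout described earlier (sign bit, 8-bit exponent, 23-bit fraction). X'FFF99A' is the 32-bit pattern 00FFF99A, or 16775578 in decimal: the sign bit is 0, the exponent field is 1, and the fraction field is 8386970. The sketch below repeats that decoding in standard FORTRAN 77 arithmetic. It assumes a nonnegative, normalized pattern, and the names DECODE, BITS, IEXP, IFRAC, and V are hypothetical.

```fortran
C     Sketch: decode the bit pattern X'FFF99A' as an IEEE single.
C     Assumes sign bit 0 and a normalized value (exponent 1 to 254).
      PROGRAM DECODE
      INTEGER BITS, IEXP, IFRAC
      DOUBLE PRECISION V
C     16775578 is the decimal value of hexadecimal FFF99A.
      BITS = 16775578
C     The fraction occupies the low 23 bits (2**23 = 8388608).
      IEXP = BITS / 8388608
      IFRAC = MOD ( BITS, 8388608 )
C     Value = 2**(exponent - 127) * (1 + fraction / 2**23)
      V = 2.0D0**(IEXP - 127) * (1.0D0 + DBLE(IFRAC)/8388608.0D0)
      WRITE (*, '(1X, 1PE13.5)') V
      END
```

The result is approximately 2.35076E-38, which agrees with the value shown for TEMP.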

Control Characters

You can enter control characters with typeless constants, although the CHAR function is standard, and this way is not.

Example: Control characters with typeless constants:

	CHARACTER BELL, ETX / X'03' /
	PARAMETER ( BELL = X'07' )

Alternate Notation for Typeless Constants

For compatibility with other versions of FORTRAN, the following alternate notation is allowed for octal and hexadecimal notation. This alternate does not work for binary, nor does it work in DATA or PARAMETER statements.

For an octal notation, enclose a string of octal digits in apostrophes and append the letter O.

Example: Octal alternate notation for typeless constants:

	'37'O
	37'O        Invalid--missing initial apostrophe
	'37'        Not numeric-- missing letter O
	'397'O      Invalid--invalid digit

For hexadecimals, enclose a string of hex digits in apostrophes and append the letter X.

Example: Hex alternate notation for typeless constants:

	'ab'X 
	'3fff'X 
	'1f'X 
	'1fX        Invalid--missing trailing apostrophe 
	'3f'        Not numeric-- missing X
	'3g7'X      Invalid--invalid digit g

Here are the rules and restrictions for binary, octal, and hexadecimal constants:

Hollerith Constants

A Hollerith constant consists of an unsigned, nonzero, integer constant, followed by the letter H, followed by a string of printable characters where the integer constant designates the number of characters in the string, including any spaces and tabs.

A Hollerith constant occupies 1 byte of storage for each character.

A Hollerith constant is aligned on 2-byte boundaries.

The FORTRAN standard does not have this old Hollerith notation, although the standard recommends implementing the Hollerith feature to improve compatibility with old programs.

Hollerith data can be used in place of character-string constants. They can also be used in IF tests, and to initialize noncharacter variables in DATA statements and assignment statements, though none of these are recommended, and none are standard. These are typeless constants.

Example: Typeless constants:

	CHARACTER C*1, CODE*2 
	INTEGER TAG*2 
	DATA TAG / 2Hok / 
	CODE = 2Hno 
	IF ( C .EQ. 1HZ ) CALL PUNT

The rules and restrictions on Hollerith constants are:

If the length of a Hollerith constant or variable is greater than the length of the data type of the variable, then characters are truncated on the right.

Variables

A variable is a symbolic name paired with a storage location. A variable has a name, a value, and a type. Whatever datum is stored in the location is the value of the variable. This does not include arrays, array elements, records, or record fields, so this definition is more restrictive than the usual usage of the word "variable."

You can specify the type of a variable in a type statement. If the type is not explicitly specified in a type statement, it is implied by the first letter of the variable name: either by the usual default implied typing, or by any implied typing of IMPLICIT statements. See "Types" for more details on the rules for data typing.

At any given time during the execution of a program, a variable is either defined or undefined. If a variable has a predictable value, it is defined; otherwise, it is undefined. A previously defined variable may become undefined, as when a subprogram is exited.

You can define a variable with an assignment statement, an input statement, or a DATA statement. If a variable is assigned a value in a DATA statement, then it is initially defined.

Two variables are associated if each is associated with the same storage location. You can associate variables by use of EQUIVALENCE, COMMON, or MAP statements. Actual and dummy arguments can also associate variables.


Arrays

An array is a named collection of elements of the same type. It is a nonempty sequence of data and occupies a group of contiguous storage locations. An array has a name, a set of elements, and a type.

An array name is a symbolic name for the whole sequence of data.

An array element is one member of the sequence of data. Each storage location holds one element of the array.

An array element name is an array name qualified by a subscript. See "Array Subscripts" for details.

You can declare an array in any of the following statements:

Array Declarators

An array declarator specifies the name and properties of an array.

The syntax of an array declarator is:

a (  d [, d ] ...  )

where a is the name of the array, and d is a dimension declarator.

A dimension declarator has the form:

[ dl:]  du

where dl is the lower bound of the dimension, and du is the upper bound of the dimension.

The number of dimensions in an array is the number of dimension declarators. The minimum number of dimensions is one; the maximum is seven. For an assumed-size array, the last dimension can be an asterisk.

The lower bound indicates the first element of the dimension, and the upper bound indicates the last element of the dimension. In a one-dimensional array, these are the first and last elements of the array.

Example: Array declarator, lower and upper bounds:

	REAL V(-5:5)

In the above example, V is an array of real numbers, with 1 dimension and 11 elements. The first element is V(-5); the last element is V(5).
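The element count follows from the bounds: a dimension declared dl:du has du - dl + 1 elements, here 5 - (-5) + 1 = 11. A minimal sketch that visits every element of V and counts them; the names BOUNDS, I, and N are hypothetical.

```fortran
C     Sketch: a dimension declared dl:du has du - dl + 1 elements.
      PROGRAM BOUNDS
      REAL V(-5:5)
      INTEGER I, N
      N = 0
C     Visit every valid subscript from the lower to the upper bound.
      DO 10 I = -5, 5
         V(I) = 0.0
         N = N + 1
   10 CONTINUE
C     N is now 11.
      WRITE (*, '(1X, I3)') N
      END
```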

Example: Default lower bound of 1:

	REAL V(1000)

In the above example, V is an array of real numbers, with 1 dimension and 1000 elements. The first element is V(1); the last element is V(1000).

Example: Arrays can have as many as 7 dimensions:

	REAL TAO(2,2,3,4,5,6,10)

Example: Lower bounds other than one:

	REAL A(3:5, 7, 3:5), B(0:2)

Example: Character arrays:

	CHARACTER M(3,4)*7, V(9)*4

The array M has 12 elements, each of which consists of 7 characters.

The array V has 9 elements, each of which consists of 4 characters.

The following restrictions on bounds apply:

Adjustable Arrays

An adjustable array is an array which is a dummy argument, and which has one or more of its dimensions or bounds as integer variables that are either themselves dummy arguments, or are in a common block.

You can declare adjustable arrays in the usual DIMENSION, COMMON, or type statements. In f77, you can also declare adjustable arrays in a RECORD statement, if that RECORD statement is not inside a structure declaration block.

Example: Adjustable array bounds with arguments, and variables in common:

	SUBROUTINE POPUP ( A, B, N ) 
	COMMON / DEFS / M, L, K 
	REAL A(3:5, 7, M:N), B(N+1:2*N) 
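The fragment above shows only the declarations. A complete, simplified sketch of the same idea follows, in which the single bound of the dummy array comes from a dummy argument. The names ADJ, FILL, A, B, and N here are hypothetical and unrelated to POPUP.

```fortran
C     Sketch: an adjustable array whose upper bound is a dummy
C     argument.  ADJ and FILL are illustrative names.
      PROGRAM ADJ
      REAL B(4)
      CALL FILL ( B, 4 )
      WRITE (*, '(1X, 4F5.1)') B
      END

      SUBROUTINE FILL ( A, N )
      INTEGER N, I
C     The bound of A is taken from the dummy argument N.
      REAL A(N)
      DO 10 I = 1, N
         A(I) = REAL ( I )
   10 CONTINUE
      RETURN
      END
```

The same subroutine can be called with actual arrays of different sizes, provided the caller passes the matching value of N.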

The restrictions are:

Assumed-Size Arrays

An assumed-size array is an array that is a dummy argument, and which has an asterisk as the upper bound of the last dimension.

You can declare assumed-size arrays in the usual DIMENSION, COMMON, or type statements.

In f77, the following extensions are allowed:

Example: Assumed-size with the upper bound of the last dimension an asterisk:

	SUBROUTINE PULLDOWN ( A, B, C ) 
	  INTEGER A(5, *), B(*), C(0:1, 2:*)

An assumed-size array cannot be used in an I/O list.
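
Because the called routine has no way to learn the extent of an assumed-size array, the caller usually passes the element count separately. The following is a sketch of that convention. The names ASZ and ISUM are hypothetical, and the DO ... END DO form is an f77 extension (a labeled DO loop would serve equally well).

	PROGRAM ASZ
	INTEGER V(4)
	DATA V / 10, 20, 30, 40 /
	PRINT *, ISUM ( V, 4 )              ! sum of the four elements: 100
	END

	INTEGER FUNCTION ISUM ( B, N )
	INTEGER B(*), N, I                  ! B is assumed-size; N gives its extent
	ISUM = 0
	DO I = 1, N
		ISUM = ISUM + B(I)
	END DO
	RETURN
	END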

Array Names with No Subscripts

An array name with no subscripts indicates the entire array. It can appear in any of the following statements:

In an EQUIVALENCE statement, the array name without subscripts indicates the first element of the array.

Array Subscripts

An array element name is an array name qualified by a subscript.

Form of a Subscript

A subscript is a parenthesized list of subscript expressions. There must be one subscript expression for each dimension of the array.

The form of a subscript is:

( s [, s ] ... )

where s is a subscript expression. The parentheses are part of the subscript.

Example: Declare a two-by-three array with the declarator:

	REAL M(2,3)

With the above declaration, you can assign a value to a particular element, as follows:

	M(1,2) = 0.0

The above code assigns 0.0 to the element in row 1, column 2, of array M.

Subscript Expressions

Subscript expressions have the following properties and restrictions:

In the above example, the fourth element of V is set to zero.

Subscript expressions cannot exceed the range of INTEGER*4. This range is not checked; if a subscript expression is outside the range (-2147483648, 2147483647), then the results are unpredictable.

Array Ordering

Array elements are usually considered as being arranged with the first subscript as the row number and the second subscript as the column number. This corresponds to traditional mathematical n x m matrix notation:

a(1,1)  a(1,2)  a(1,3)  ...  a(1,m)
a(2,1)  a(2,2)  ...          a(2,m)
...     ...     a(i,j)  ...  a(i,m)
a(n,1)  a(n,2)  ...          a(n,m)

Element a(i,j) is located in row i, column j.

For example:

	INTEGER*4 A(3,2) 

The elements of A are conceptually arranged in 3 rows and 2 columns:

A(1,1) A(1,2)
A(2,1) A(2,2)
A(3,1) A(3,2)

Array elements are stored in column-major order.

Example: For the array A, they are located in memory as follows:

A(1,1)
A(2,1)
A(3,1)
A(1,2)
A(2,2)
A(3,2) 

The inner (leftmost) subscript changes more rapidly.
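
One way to observe column-major order is to overlay the two-dimensional array with a one-dimensional array. The following sketch does this with EQUIVALENCE; the name LIN is illustrative only. Under column-major order, element A(I,J) of a 3-by-2 array occupies position I + 3*(J-1) in storage.

	INTEGER*4 A(3,2), LIN(6)
	EQUIVALENCE ( A, LIN )              ! LIN views the storage of A in order
	DATA LIN / 1, 2, 3, 4, 5, 6 /
	PRINT *, A(1,1), A(2,1), A(3,1)     ! first column: values 1, 2, 3
	PRINT *, A(1,2), A(2,2), A(3,2)     ! second column: values 4, 5, 6
	END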


Substrings

A character datum is a sequence of one or more characters. A character substring is a contiguous portion of a character variable or of a character array element or of a character field of a structured record.

A substring name can be in either of the following two forms:

v( [ e1 ] : [ e2 ] )
a( s [, s ] ... )( [ e1 ] : [ e2 ] )

where:

v Character variable name
a(s [, s] ... ) Character array element name
e1 Leftmost character position of the substring
e2 Rightmost character position of the substring

Both e1 and e2 are integer expressions. They cannot exceed the range of INTEGER*4. If the expression is not in the range (-2147483648, 2147483647), then the results are unpredictable.

Example: The substring of S that starts at the Ith character and ends at the Lth character:

	S(I:L)

In the above example, there are L-I+1 characters in the substring.

Example: The substring of the array element A(J,K) that starts at the Mth character and ends at the Nth character of that element:

	A(J,K)(M:N)

In the above example, there are N-M+1 characters in the substring.
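
The length rule can be checked with the LEN intrinsic. The following is a small sketch; the string contents and the values of I and L are arbitrary choices for illustration.

	CHARACTER S*10
	INTEGER I, L
	S = 'ABCDEFGHIJ'
	I = 3
	L = 6
	PRINT *, S(I:L)                     ! characters 3 through 6: CDEF
	PRINT *, LEN ( S(I:L) )             ! L - I + 1 = 4
	END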

Here are the rules and restrictions for substrings:

Example: Substrings--the value of the element in row 2, column 3 is e23:

demo% cat sub.f 
	character 		v*8 / 'abcdefgh' /, 
& 			m(2,3)*3 / 'e11', 'e21', 
& 			'e12', 'e22', 
& 			'e13', 'e23' / 
	print *, v(3:5) 
	print *, v(1:) 
	print *, v(:8) 
	print *, v(:) 
	print *, m(1,1) 
	print *, m(2,1) 
	print *, m(1,2) 
	print *, m(2,2) 
	print *, m(1,3) 
	print *, m(2,3) 
	print *, m(1,3)(2:3) 
	end
demo% f77 sub.f 
sub.f: 
 MAIN: 
demo% a.out 
 cde 
 abcdefgh 
 abcdefgh 
 abcdefgh 
 e11 
 e21 
 e12 
 e22 
 e13 
 e23 
 13 
demo% 

Structures

A structure is a generalization of an array.

Just as an array is a collection of elements of the same type, a structure is a collection of elements that are not necessarily of the same type.

As elements of arrays are referenced by using numeric subscripts, so elements of structures are referenced by using element (or field) names.

The structure declaration defines the form of a record by specifying the name, type, size, and order of the fields that constitute the record. Once a structure is defined and named, it can be used in RECORD statements, as explained in the following subsections.

Syntax

The structure declaration has the following syntax:

	STRUCTURE [/structure-name/] [field-list]
		 field-declaration 
		[field-declaration]
		. . . 
		[field-declaration]
	END STRUCTURE

structure-name Name of the structure
field-list List of fields of the specified structure
field-declaration Defines a field of the record.

field-declaration is defined in the next section.

Field Declaration

Each field declaration can be one of the following:

Example: A STRUCTURE declaration:

	STRUCTURE /PRODUCT/ 
		INTEGER*4 ID 
		CHARACTER*16 NAME 
		CHARACTER*8 MODEL 
		REAL*4 COST 
		REAL*4 PRICE 
	END STRUCTURE 

In the above example, a structure named PRODUCT is defined to consist of the five fields ID, NAME, MODEL, COST, and PRICE. For an example with a field-list, see "Structure within a Structure."

Rules and Restrictions for Structures

Note the following:

Rules and Restrictions for Fields

Fields that are type declarations use the identical syntax of normal FORTRAN type statements. All f77 types are allowed, subject to the following rules and restrictions:

In a structure declaration, the offset of field n is the offset of the preceding field, plus the length of the preceding field, possibly corrected for any adjustments made to maintain alignment. See Appendix C, "Data Representations," for a summary of storage allocation.

Record Declaration

The RECORD statement declares variables to be records with a specified structure, or declares arrays to be arrays of such records.

The syntax of a RECORD statement is:

	RECORD /structure-name/ record-list 
		[,/structure-name/ record-list] 
		... 
		[,/structure-name/ record-list]

structure-name Name of a previously declared structure
record-list List of variables, arrays, or arrays with dimensioning and index ranges, separated by commas.

Example: A RECORD that uses the previous STRUCTURE example:

	RECORD /PRODUCT/ CURRENT, PRIOR, NEXT, LINE(10)

Each of the three variables, CURRENT, PRIOR, and NEXT, is a record which has the PRODUCT structure; LINE is an array of 10 such records.

Note the following rules and restrictions for records:

Record and Field Reference

You can refer to a whole record, or to an individual field in a record. Because structures can be nested, a field can itself be a structure, so you can also refer to fields within fields, to any depth of nesting.

The syntax of record and field reference is:

	record-name[.field-name] ... [.field-name]
record-name Name of a previously defined record variable
field-name Name of a field in the record immediately to the left.

Example: References that are based on structure and records of the above two examples:

	...
	RECORD /PRODUCT/ CURRENT, PRIOR, NEXT, LINE(10) 
	... 
	CURRENT = NEXT 
	LINE(1) = CURRENT
	WRITE ( 9 ) CURRENT 
	NEXT.ID = 82

In the above example:

Example: Structure and record declarations, record and field assignments:

demo% cat str1.f
* str1.f Simple structure 
	STRUCTURE / S / 
		INTEGER*4 I 
		REAL*4 R 
	END STRUCTURE 
	RECORD / S / R1, R2 
	R1.I = 82 
	R1.R = 2.7182818 
	R2 = R1 
	WRITE ( *, * ) R2.I, R2.R 
	STOP 
	END 
demo% f77 -silent str1.f 
demo% a.out 
82 2.718280 
demo%

Substructure Declaration

A structure can have a field that is also a structure. Such a field is called a substructure. You can declare a substructure in one of two ways:

Record within a Structure

A nested structure declaration is one that is contained within either a structure declaration or a union declaration. You can use a previously defined record within a structure declaration.

Example: Define structure SALE using the previously defined structure PRODUCT:

	STRUCTURE /SALE/ 
		CHARACTER*32  BUYER 
		INTEGER*2  QUANTITY 
		RECORD 	/PRODUCT/  ITEM 
	END STRUCTURE

In the above example, the structure SALE contains three fields: BUYER, QUANTITY, and ITEM, where ITEM is a record with the structure /PRODUCT/.

Structure within a Structure

You can nest a declaration within a declaration.

Example: If /PRODUCT/ is not declared previously, then you can declare it within the declaration of SALE:

	STRUCTURE /SALE/ 
		CHARACTER*32  BUYER 
		INTEGER*2  QUANTITY 
		STRUCTURE /PRODUCT/ ITEM 
			INTEGER*4  ID 
			CHARACTER*16  NAME 
			CHARACTER*8  MODEL 
			REAL*4  COST 
			REAL*4  PRICE 
		END STRUCTURE 
	END STRUCTURE 

Here, the structure SALE still contains the same three fields as in the prior example: BUYER, QUANTITY, and ITEM. The field ITEM is an example of a field-list (in this case, a single-element list), as defined under "Structure Declaration."

The size and complexity of the various structures determine which style of substructure declaration is best to use in a given situation.

Field Reference in Substructures

You can refer to fields within substructures.

Example: Refer to fields of substructures (PRODUCT and SALE, from the previous examples, are defined in the current program unit):

	... 
	RECORD /SALE/ JAPAN 
	... 
	N = JAPAN.QUANTITY 
	I = JAPAN.ITEM.ID 
	... 
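
For reference, a complete program along the same lines might look like the following sketch. It shortens the PRODUCT structure to two fields for brevity, and the assigned values are arbitrary.

	STRUCTURE /PRODUCT/
		INTEGER*4  ID
		CHARACTER*16  NAME
	END STRUCTURE
	STRUCTURE /SALE/
		CHARACTER*32  BUYER
		INTEGER*2  QUANTITY
		RECORD /PRODUCT/  ITEM          ! substructure field
	END STRUCTURE
	RECORD /SALE/ JAPAN
	JAPAN.QUANTITY = 5
	JAPAN.ITEM.ID = 82                  ! field within a field
	JAPAN.ITEM.NAME = 'WIDGET'
	N = JAPAN.QUANTITY
	I = JAPAN.ITEM.ID
	PRINT *, N, I, JAPAN.ITEM.NAME
	END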

Rules and Restrictions for Substructures

Note the following:

Unions and Maps

A union declaration defines groups of fields that share memory at runtime.

Syntaxes

The syntax of a union declaration is:

	UNION 
		 map-declaration 
		 map-declaration 
		[map-declaration] 
		... 
		[map-declaration] 
	END UNION

The syntax of a map declaration is as follows.

	MAP 
		 field-declaration 
		[field-declaration] 
		... 
		[field-declaration] 
	END MAP

Fields in a Map

Each field-declaration in a map declaration can be one of the following:

A map declaration defines alternate groups of fields in a union. During execution, one map at a time is associated with a shared storage location. When you reference a field in a map, the fields in any previous map become undefined and are succeeded by the fields in the map of the newly referenced field. The amount of memory used by a union is that of its biggest map.

Example: Declare the structure /STUDENT/ to contain either NAME, CLASS, and MAJOR--or NAME, CLASS, CREDITS, and GRAD_DATE:

	STRUCTURE /STUDENT/ 
		CHARACTER*32  NAME 
		INTEGER*2  CLASS 
		UNION 
			MAP 
				CHARACTER*16 MAJOR 
			END MAP 
			MAP 
				INTEGER*2  CREDITS 
				CHARACTER*8  GRAD_DATE 
			END MAP 
		END UNION 
	END STRUCTURE

If you define the variable PERSON to have the structure /STUDENT/ from the above example, then PERSON.MAJOR references a field from the first map, and PERSON.CREDITS references a field from the second map. If the variables of the second map field are initialized, and then the program references the variable PERSON.MAJOR, the first map becomes active, and the variables of the second map become undefined.
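
The following sketch shows the switch from one map to the other in a complete program. The assigned values are arbitrary. Note that the union occupies the storage of its larger map, which here is the 16-byte MAJOR field.

	STRUCTURE /STUDENT/
		CHARACTER*32  NAME
		INTEGER*2  CLASS
		UNION
			MAP
				CHARACTER*16 MAJOR
			END MAP
			MAP
				INTEGER*2  CREDITS
				CHARACTER*8  GRAD_DATE
			END MAP
		END UNION
	END STRUCTURE
	RECORD /STUDENT/ PERSON
	PERSON.NAME = 'PAT'
	PERSON.CLASS = 2
	PERSON.MAJOR = 'PHYSICS'            ! first map is now active
	PRINT *, PERSON.MAJOR
	PERSON.CREDITS = 96                 ! second map becomes active;
	PERSON.GRAD_DATE = '06/01/95'       ! MAJOR is now undefined
	PRINT *, PERSON.CREDITS, PERSON.GRAD_DATE
	END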


Pointers

The POINTER statement establishes pairs of variables and pointers. Each pointer contains the address of its paired variable.

Syntax Rules



The POINTER statement has the following syntax:

	POINTER ( p1, v1 ) [, ( p2, v2 ) ... ] 

where:

v1, v2 Pointer-based variables
p1, p2 Corresponding pointers

A pointer-based variable is a variable paired with a pointer in a POINTER statement. A pointer-based variable is usually just called a based variable. The pointer is the integer variable that contains the address.

Example: A simple POINTER statement:

	POINTER ( P, V )

Here, V is a pointer-based variable, and P is its associated pointer.

Usage of Pointers

Normal use of pointer-based variables involves the following steps. The first two steps can be in either order.

1. Define the pairing of the pointer-based variable and the pointer in a POINTER statement.
2. Define the type of the pointer-based variable.
The pointer itself is of integer type, but in general, it is safer not to list it in an INTEGER statement.
3. Set the pointer to the address of an area of memory that has the appropriate size and type.
You do not normally do anything else explicitly with the pointer.
4. Reference the pointer-based variable.
Just use the pointer-based variable in normal FORTRAN statements--the address of that variable is always from its associated pointer.
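
The four steps above can be sketched as follows. The variable names are illustrative only, and the sketch takes the address of an existing variable with LOC() instead of allocating new memory.

	POINTER ( P, V )                    ! step 1: pair V with pointer P
	REAL V, X                           ! step 2: type V (no storage for V)
	X = 1.5
	P = LOC( X )                        ! step 3: point P at X
	V = V + 1.0                         ! step 4: use V; this updates X
	PRINT *, X                          ! X is now 2.5
	END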

Address and Memory

No storage for the variable is allocated when a pointer-based variable is defined, so you must provide an address of a variable of the appropriate type and size, and assign the address to a pointer, usually with the normal assignment statement or data statement.

The loc(), malloc(), and free() routines associate and deassociate memory addresses with pointers. (These routines are described in Chapter 6.)

Address by LOC() Function

You can obtain the address from the intrinsic function LOC().

Example: Use the LOC() function to get an address:

* ptr1.f: Assign an address via LOC() 
	POINTER ( P, V ) 
	CHARACTER A*12, V*12 
	DATA A / 'ABCDEFGHIJKL' / 
	P = LOC( A ) 
	PRINT *, V(5:5) 
	END

In the above example, the CHARACTER statement allocates 12 bytes of storage for A, but no storage for V; it merely specifies the type of V, because V is a pointer-based variable. The assignment statement then stores the address of A in P, so any subsequent use of V refers to A through the pointer P. The program prints an E.

Memory and Address by MALLOC() Function

The function MALLOC() allocates an area of memory and returns the address of the start of that area. The argument to the function is an integer specifying the amount of memory to be allocated, in bytes. If successful, it returns a pointer to the first item of the region; otherwise, it returns an integer 0. The region of memory is not initialized in any way.

Example: Memory allocation for pointers, by MALLOC:

	COMPLEX Z
	REAL X, Y
	POINTER ( P1, X ), ( P2, Y ), ( P3, Z ) 
	... 
	P1 = MALLOC ( 10000 ) 
	... 

In the above example, MALLOC() allocates 10,000 bytes of memory and associates the address of that block of memory with the pointer P1.

Deallocation of Memory by FREE() Subroutine

The subroutine FREE() deallocates a region of memory previously allocated by MALLOC(). The argument given to FREE() must be a pointer previously returned by MALLOC(), but not already given to FREE(). The memory is returned to the memory manager, making it unavailable to the programmer.

Example: Deallocate via FREE:

	POINTER ( P1, X ), ( P2, Y ), ( P3, Z ) 
	... 
	P1 = MALLOC ( 10000 ) 
	... 
	CALL FREE ( P1 ) 
	... 

In the above example, MALLOC() allocates 10,000 bytes of memory, which are associated with pointer P1. FREE() later returns those same 10,000 bytes to the memory manager.
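
A fuller sketch of the allocate, check, use, and release sequence follows. It assumes the default 4-byte REAL, so that 100 elements need 400 bytes, and it uses MALLOC() without a declaration, as in the examples above. The array name and size are illustrative only.

	REAL X(100)
	POINTER ( P, X )
	P = MALLOC ( 400 )                  ! 100 elements * 4 bytes each
	IF ( P .EQ. 0 ) THEN                ! MALLOC returns 0 on failure
		PRINT *, 'MALLOC failed'
		STOP
	END IF
	X(1) = 1.0                          ! the region is not initialized,
	X(100) = 100.0                      ! so define elements before use
	PRINT *, X(1), X(100)
	CALL FREE ( P )                     ! do not reference X after this
	END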

Restrictions

The pointers are of type integer, and are automatically typed that way by the compiler. You must not type them yourself.

A pointer-based variable cannot itself be a pointer.

The pointer-based variables can be of any type, including structures.

No storage is allocated when such a pointer-based variable is declared, even if there is a size specification in the type statement.

You cannot use a pointer-based variable as a dummy argument or in COMMON, EQUIVALENCE, DATA, or NAMELIST statements.

The dimension expressions for pointer-based variables must be constant expressions in main programs. In subroutines and functions, the same rules apply for pointer-based array variables as for dummy arguments--the expression can contain dummy arguments and variables in common. Any variables in the expressions must be defined with an integer value at the time the subroutine or function is called.

Address expressions cannot exceed the range of INTEGER*4. If the expression is not in the range (-2147483648, 2147483647), then the results are unpredictable.

Optimization and Pointers

Pointers have the annoying side effect of reducing the assumptions that the global optimizer can make. For one thing, compare the following:

Therefore, the optimizer must assume that a variable passed as an argument in a subroutine or function call can be changed by any other call. Such an unrestricted use of pointers would degrade optimization for the vast majority of programs that do not use pointers.

General Guidelines

There are two alternatives for optimization with pointers.

The second choice also has a suboption: localize pointers to one routine and do not optimize it, but do optimize the routines that do the calculations. If you put the calling routine and the calculation routines in different files, you can optimize one and not optimize the other.

Example: A relatively "safe" kind of coding with -O3 or -O4:

	REAL A, B, V(100,100) 	! This programming unit does nothing else
	POINTER ( P, V )      	! with P other than getting the address
	P = MALLOC(40000)     	! and passing it; V needs 100*100*4 bytes.
	... 
	CALL CALC ( P, A )
	...
	END

    
	SUBROUTINE CALC ( ARRAY, X )
	...
	RETURN
	END

If you want to optimize only CALC at level -O4, then avoid using pointers in CALC.

Some Problematic Code Practices

Any of the following coding practices, and many others, could cause problems with an optimization level of -O3 or -O4:

Example: One kind of code that could cause trouble with -O3 or -O4:

	COMMON A, B, C 
	POINTER ( P, V ) 
	P = LOC(A) + 4 			  ! Possible problems if optimized
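	... 

In this example, with default typing, A, B, and C are 4-byte reals stored consecutively in blank common, so LOC(A) + 4 is the address of B. The compiler assumes that a reference through P may change A, but not B; this assumption could produce incorrect code.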