INTRODUCTION

This document sets out to describe the C programming language, as defined by the ANSI standard, in order to enable a computer programmer with no previous knowledge of C to program in it. It is assumed that the reader has access to a C compiler, and to the documentation which accompanies it regarding library functions.

History:

The C programming language was invented by Dennis Ritchie for use on a DEC PDP-11 during the 1970s. Since then it has become one of the most widely used and respected programming languages available for computers. The reason for this success is twofold. Firstly, the C programming language is portable between different computers, so a program may be developed on an IBM PC, recompiled on a DEC VAX, and still work without any changes to the code. Secondly, C gives the programmer full access to the computer's operating system and memory. In practice this often means that the C programmer has complete freedom to make a complete mess of the operating system! But it is a level of power and freedom otherwise offered only by Assembler language programming.

C is a medium level language:

The powerful facilities offered by C for manipulating memory addresses and data directly, even down to the bit level, along with C's structured approach to programming, cause C to be classified as a "medium level" programming language. It possesses fewer ready made facilities than a high level language, such as BASIC, but a higher level of structure than low level Assembler.

Key words:

The original C language, as described in "The C Programming Language" by Kernighan and Ritchie, provided 27 key words. To those 27 the ANSI standards committee on C has added 5 more. This confusingly results in two standards for the C language. However, the ANSI standard is quickly taking over from the old K & R standard.
The 32 C key words are:

    auto        double      int         struct
    break       else        long        switch
    case        enum        register    typedef
    char        extern      return      union
    const       float       short       unsigned
    continue    for         signed      void
    default     goto        sizeof      volatile
    do          if          static      while

Some C compilers offer additional key words specific to the hardware environment that they operate on. You should be aware of your own C compiler's additional key words, but they have no place in portable code.

Structure:

C programs are written in a structured manner. A collection of code blocks is created which call each other to make up the complete program. As a structured language C provides various looping and testing commands such as do-while, for, while and if. Jumps (goto), whilst provided for, are rarely used.

A C code block is contained within a pair of curly braces "{ }", and may be a complete procedure, in C terminology called a "function", or a subset of code within a function. For example the following is a code block. The statements within the curly braces are only executed if the condition "x < 10" is satisfied:

    if (x < 10){
        a = 1;
        b = 0;
    }

Whilst this is a complete function code block containing a sub code block in the form of a do-while loop:

    int GET_X()
    {
        int x;

        do{
            printf("\nEnter a number between 0 and 10 ");
            scanf("%d",&x);
        }while(x < 0 || x > 10);
        return(x);
    }

Notice how every statement is terminated by a semicolon, unless that statement marks the start of a code block, in which case it is followed by a curly brace. C is a case sensitive but free flow language. Spaces between the elements of a statement are ignored, and hence the semicolon delimiter is required to mark the end of each statement.
Having a free flow structure, the following commands are recognised as the same by the C compiler:

    x = 0;
    x =0;
    x=0;

The general form of a C program is as follows:

    compiler preprocessor statements
    global data declarations

    return-type main(parameter list)
    {
        statements
    }

    return-type f1(parameter list)
    {
        statements
    }

    return-type f2(parameter list)
    {
        statements
    }
    .
    .
    .
    return-type fn(parameter list)
    {
        statements
    }

Comments:

C allows comments to be included in the program. A comment is defined by being enclosed within "/*" and "*/". Thus the following is a comment:

    /* This is a legitimate C comment line */

Libraries:

C programs are compiled and combined with library functions provided with the C compiler. These libraries consist largely of standard functions, the behaviour of which is defined in the ANSI standard of the C language, but which are implemented by the individual C compiler manufacturers in a machine dependent way. Thus, the standard library function "printf()" provides the same facilities on a DEC VAX as on an IBM PC, although the actual machine language code in the library is quite different for each. The C programmer, however, does not need to know about the internals of the libraries, only that each library function will behave in the same way on any computer.

DATA TYPES

There are four basic types of data in the C language: character, integer, floating point, and valueless, which are referred to by the C key words "char", "int", "float" and "void" respectively. To the basic data types may be added the type modifiers signed, unsigned, long and short to produce further data types. By default the integer types are assumed signed (whether a plain "char" is signed is left to the compiler), and the signed modifier is rarely used, unless to override a compiler switch defaulting a data type to unsigned.
The size of each data type varies from one hardware platform to another, but the minimal range of values which can be held is described in the ANSI standard as follows:

    TYPE                 SIZE (bits)   RANGE
    char                  8            -127 to 127
    unsigned char         8            0 to 255
    int                  16            -32767 to 32767
    unsigned int         16            0 to 65535
    long int             32            -2147483647 to 2147483647
    unsigned long int    32            0 to 4294967295
    float                32            Six digit precision
    double               64            Ten digit precision
    long double          80            Ten digit precision

Declaring a variable:

All variables in a C program must be declared before they can be used. The general form of a variable declaration is:

    type name;

So, for example, to declare a variable "x" of data type "int", so that it may store a value in the range -32767 to 32767, you use the statement:

    int x;

Character strings may be declared, which are in fact arrays of characters. They are declared as follows:

    char name[number_of_elements];

So, to declare an array of thirty characters called 'name' (room for a string of up to 29 characters plus the null character which terminates it) you would use the declaration:

    char name[30];

Arrays of other data types may also be declared in one, two or more dimensions in the same way. For example, to declare a two dimensional array of integers:

    int x[10][10];

The elements of this array are then accessed as:

    x[0][0]
    x[0][1]
    x[n][n]

where each subscript runs from 0 to 9.

There are three levels of access to variables: local, module and global. A variable declared within a code block is only known to the statements within that code block. A variable declared outside of any function code blocks but prefixed with the storage modifier "static" is known only to the statements within that source module. A variable declared outside of any functions and not prefixed with the static storage type modifier may be accessed by any statement within any source module of the program.
For example:

    int error;
    static int a;

    main()
    {
        int x;
        int y;
    }

    funca()
    {
        if (a == 0){
            int b;

            for(b = 0; b < 20; b++)
                printf("\nHello World");
        }
    }

In this example the variable 'error' is accessible by all source code modules compiled together to form the finished program. The variable 'a' is accessible by statements in both functions 'main()' and 'funca()', but is invisible to any other source module. Variables 'x' and 'y' are only accessible by statements within function 'main()'. The variable 'b' is only accessible by statements within the code block following the 'if' statement.

If a second source module wished to access the variable 'error' it would need to declare 'error' as an 'extern' global variable thus:

    extern int error;

    funcb()
    {
    }

C will quite happily allow you, the programmer, to assign different data types to each other. For example, you may declare a variable to be of type 'char', in which case a single byte of data will be allocated to store the variable. To this variable you can attempt to assign larger values, for example:

    main()
    {
        char x;

        x = 5000;
    }

In this example the variable 'x' can typically only store a value between -128 and 127, so the figure 5000 will NOT be assigned to the variable 'x'. Rather, on most compilers only the low order byte of 5000 is kept. That byte has the value 136, so 'x' holds 136 if 'char' is unsigned, or -120 if 'char' is signed!

Often you may wish to assign different data types to each other, and to prevent the compiler from warning you of a possible error you can use a cast to tell the compiler that you know what you're doing. A cast is a data type in parentheses preceding a variable or expression:

    main()
    {
        float x;
        int y;

        x = 100 / 25;
        y = (int)x;
    }

In this example the (int) cast tells the compiler to convert the value of the floating point variable x to an integer before assigning it to the variable y.

Formal parameters:

A C function may receive parameters from a calling function.
These parameters are declared as variables within the parentheses following the function name, thus:

    int MULT(int x, int y)
    {
        /* Return parameter x multiplied by parameter y */
        return(x * y);
    }

    main()
    {
        int a;
        int b;
        int c;

        a = 5;
        b = 7;
        c = MULT(a,b);
        printf("%d multiplied by %d equals %d\n",a,b,c);
    }

Access modifiers:

There are two access modifiers: 'const' and 'volatile'. A variable declared to be 'const' may not be changed by the program, whereas a variable declared 'volatile' may be changed in ways the compiler cannot detect, for example by hardware or by an interrupt routine. For that reason, declaring a variable to be volatile prevents the C compiler from keeping the variable's value in a register, and reduces the optimization carried out on the variable.

Storage class types:

C provides four storage types: 'extern', 'static', 'auto' and 'register'.

The extern storage type is used to allow a source module within a C program to access a variable declared in another source module.

Static variables are only accessible within the code block (or, for a global variable, the source module) which declared them, and additionally if the variable is local, rather than global, it retains its old value between subsequent calls to the code block.

Register variables are stored within CPU registers wherever possible, providing the fastest possible access to their values.

The auto storage type is only used with local variables, and declares the variable to retain its value locally only. Since this is the default for local variables the auto storage type is very rarely used.

OPERATORS

Operators are tokens which cause a computation to occur when applied to variables. C provides the following operators:

    &       Address
    *       Indirection
    +       Unary plus
    -       Unary minus
    ~       Bitwise complement
    !       Logical negation
    ++      As a prefix: preincrement
            As a suffix: postincrement
    --      As a prefix: predecrement
            As a suffix: postdecrement
    +       Addition
    -       Subtraction
    *       Multiply
    /       Divide
    %       Remainder
    <<      Shift left
    >>      Shift right
    &       Bitwise AND
    |       Bitwise OR
    ^       Bitwise XOR
    &&      Logical AND
    ||      Logical OR
    =       Assignment
    *=      Assign product
    /=      Assign quotient
    %=      Assign remainder (modulus)
    +=      Assign sum
    -=      Assign difference
    <<=     Assign left shift
    >>=     Assign right shift
    &=      Assign bitwise AND
    |=      Assign bitwise OR
    ^=      Assign bitwise XOR
    <       Less than
    >       Greater than
    <=      Less than or equal to
    >=      Greater than or equal to
    ==      Equal to
    !=      Not equal to
    .       Direct component selector
    ->      Indirect component selector
    a ? x:y "If a is true then x else y"
    []      Declare arrays and select array elements
    ()      Parentheses isolate conditions and expressions
    ...     Ellipsis, used in formal parameter lists of function
            prototypes to indicate a variable number of parameters
            or parameters of varying types.

To illustrate some of the more commonly used operators consider the following short program:

    main()
    {
        int a;
        int b;
        int c;

        a = 5;        /* Assign a value of 5 to variable 'a' */
        b = a / 2;    /* Assign the value of 'a' divided by two to variable 'b' */
        c = b * 2;    /* Assign the value of 'b' multiplied by two to variable 'c' */

        if (a == c)   /* Test if 'a' holds the same value as 'c' */
            printf("Variable 'a' is an even number\n");
        else
            printf("Variable 'a' is an odd number\n");
    }

Because 'a', 'b' and 'c' are integers, the division discards any remainder, so 'c' only equals 'a' when 'a' is even.

Normally when incrementing the value of a variable you would write something like:

    x = x + 1

C provides the increment operator '++' as well so that you can write:

    x++

Similarly you can decrement the value of a variable using '--' as:

    x--

All the other mathematical operators may be used in the same way, so in a C program you can write in shorthand:

    NORMAL          C
    x = x + 1       x++
    x = x - 1       x--
    x = x * 2       x *= 2
    x = x / y       x /= y
    x = x % 5       x %= 5

and so on.
INPUT AND OUTPUT

Input to a C program may occur from the console, the standard input device (unless otherwise redirected this is the console), from a file or from a data port.

The general input command for reading data from the standard input stream 'stdin' is scanf(). The scanf() function scans a series of input fields, one character at a time. Each field is then converted according to the appropriate format specifier in the format string passed to scanf() as a parameter. The converted field is then stored at the ADDRESS passed to scanf() following the format string. For example, the following program will read a single integer from the stream stdin:

    main()
    {
        int x;

        scanf("%d",&x);
    }

Notice the address operator & prefixing the variable name 'x' in the scanf() parameter list. This is because scanf() stores values at ADDRESSES rather than assigning values to variables directly.

The format string is a character string that may contain three types of data: whitespace characters (space, tab and newline), non-whitespace characters (all ASCII characters EXCEPT %) and format specifiers. Format specifiers have the general form:

    %[*][width][h|l|L]type_character

After the % sign the format specifier is comprised of:

an optional assignment suppression character, *, which suppresses assignment of the next input field.

an optional width specifier, width, which declares the maximum number of characters to be read.
an optional argument type modifier, h or l or L, where:

    h       is a short integer
    l       is a long
    L       is a long double

the data type character, which is one of:

    d       Decimal integer
    D       Decimal long integer
    o       Octal integer
    O       Octal long integer
    i       Decimal, octal or hexadecimal integer
    I       Decimal, octal or hexadecimal long integer
    u       Decimal unsigned integer
    U       Decimal unsigned long integer
    x       Hexadecimal integer
    X       Hexadecimal long integer
    e       Floating point
    f       Floating point
    g       Floating point
    s       Character string
    c       Character
    %       A literal % character is matched

The upper case long forms D, O, I and U are extensions offered by some compilers rather than part of ANSI C, and ANSI C treats X the same as x. In portable code use the l modifier instead, as in %ld, %lo, %li, %lu and %lx.

An example using scanf():

    #include <stdio.h>

    main()
    {
        char name[30];
        int age;

        printf("\nEnter your name and age ");
        /* Width 29 leaves room for the terminating null in name[30] */
        scanf("%29s%d",name,&age);
        printf("\n%s %d",name,age);
    }

Notice the include line, "#include <stdio.h>". This tells the compiler to also read the file stdio.h, which contains the function prototypes for scanf() and printf().

If you type in and run this sample program you will see that only one name can be entered, that is you can't enter:

    JOHN SMITH

because scanf() detects the whitespace between "JOHN" and "SMITH" and moves on to the next input field, which is age, and attempts to convert "SMITH" into a number for the age field! The conversion fails and age is never assigned. The limitations of scanf() as an input function are very obvious.