Perspective of programming languages

Almost thirty years ago, a noted computer scientist (1) remarked that it was unfortunate that real computers had to be used in teaching computer science. Although many in the audience may have viewed this as a rather radical position at the time, it has proven to be an insightful commentary on many of our efforts to design and deliver courses in the discipline. In fact, the premise probably should be broadened to include software as well as hardware. Actual computing systems, hardware as well as software, often swamp the learner in a sea of minutiae in which basic concepts are at least obscured if not completely lost.

While there are difficulties in using real systems in courses at all levels, it appears that some of the greatest problems may be found at the introductory level. In particular, achieving consensus in the choice of a programming language (or none at all!) for CS1 has proven to be elusive. With Curriculum 2001 now in the works, it is particularly timely that experience with this course be reviewed.

Language Processor

By a language processor, we mean a program that processes programs written in a programming language (source language). All or part of a language processor is a language translator, which translates the program from the source language into machine code, assembly language, or some other language. The machine code can be for an actual computer or for a virtual (hypothetical) computer. If it is for a virtual computer, then a simulator for the virtual computer is needed in order to execute the translated program.
If a language processor is a translator that produces machine code or assembly code as output (as object code or executable code), it is called a compiler. If the language processor itself executes the program, either directly from the source or from the output of its translator, it is called an interpreter.
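The split between a translator and a simulator for a virtual computer can be illustrated with a minimal sketch. It assumes a hypothetical stack machine with the instructions PUSH, ADD, SUB and MUL (none of these names come from the text), and it borrows Python's standard ast module to parse the source expression so that the example stays short.

```python
import ast

# Map source-language operators to instructions of a hypothetical stack machine.
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source):
    """Translator: turn an integer arithmetic expression into stack-machine code."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in OPS:
            emit(node.left)    # code for the left operand
            emit(node.right)   # code for the right operand
            code.append((OPS[type(node.op)], None))
        else:
            raise ValueError("unsupported construct")

    emit(ast.parse(source, mode="eval").body)
    return code

def run(code):
    """Simulator: execute the translated program on the virtual machine."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:  # "MUL"
                stack.append(left * right)
    return stack.pop()

program = compile_expr("2 + 3 * 4")
result = run(program)
```

Here compile_expr alone plays the role of a compiler for the virtual machine, while calling run on its output gives the behavior of an interpreter.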

In a typical programming language implementation, source program components (files or modules) are first translated into machine language to produce components called object modules or object files. Following the translation step, a linkage editor (or linker) combines multiple object components for a program with components from libraries to produce an executable program. Linking can occur as a separate step before execution, or in some cases while the program executes, with each component loaded as it is needed. The execution of a program may be done by an actual computer or by a simulator for a virtual computer.

Program components in languages such as C are normally compiled into object files, which are combined into an executable file by a linkage editor or linking loader. The linkage editor adjusts addresses as needed when it combines the object modules, and it also puts in the addresses where a module references a location in another module (such as for a function call). If an executable file is produced, then there will also be a loader program that loads an executable file into memory so that it can execute. The loader may also do some final adjustments on addresses to correspond to the actual locations in memory where the executing program will reside.
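The address adjustment done by a linkage editor can be sketched as a two-pass procedure. The object-module layout below (a dictionary with "code", "defines" and "refs" entries) and the instruction names are hypothetical and chosen only for illustration; real object file formats are far richer.

```python
def link(modules):
    """Combine object modules into one image, resolving cross-module references."""
    # Pass 1: give each module a base address and record where every symbol ends up.
    symbols = {}
    address = 0
    for mod in modules:
        for name, offset in mod["defines"].items():
            symbols[name] = address + offset
        address += len(mod["code"])

    # Pass 2: copy each module's code and patch placeholders with final addresses.
    image = []
    for mod in modules:
        code = list(mod["code"])
        for offset, name in mod["refs"]:
            code[offset] = symbols[name]
        image.extend(code)
    return image, symbols

# Two hypothetical object modules: main calls square, which lives in lib.
main_mod = {
    "code": ["CALL", None, "HALT"],   # None is a placeholder for square's address
    "defines": {"main": 0},
    "refs": [(1, "square")],
}
lib_mod = {
    "code": ["DUP", "MUL", "RET"],
    "defines": {"square": 0},
    "refs": [],
}

image, symbols = link([main_mod, lib_mod])
```

After linking, the placeholder in main holds address 3, which is where square starts in the combined image. A loader could apply one more uniform offset if the image is placed at a nonzero address in memory.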

Data-level Structure

In computer science, a data structure is a particular way of organizing data in a computer so that it can be used efficiently.


Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, B-trees are particularly well-suited for implementation of databases, while compiler implementations usually use hash tables to look up identifiers.
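As a small illustration of the hash-table case, the sketch below stores identifiers in a Python dict, which is implemented as a hash table. The SymbolTable class and its method names are hypothetical and not taken from any particular compiler.

```python
class SymbolTable:
    """Minimal identifier table backed by a dict (a hash table)."""

    def __init__(self):
        self._entries = {}

    def declare(self, name, type_name):
        # Reject a second declaration of the same identifier.
        if name in self._entries:
            raise KeyError(f"duplicate identifier: {name}")
        self._entries[name] = type_name

    def lookup(self, name):
        # Average-case constant-time lookup; None if the identifier is unknown.
        return self._entries.get(name)

table = SymbolTable()
table.declare("count", "int")
table.declare("ratio", "float")
```

Looking up "count" returns "int", while an undeclared name returns None, so the caller can report an undefined identifier.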

Data structures provide a means to manage large amounts of data efficiently, as in large databases and internet indexing services. Efficient data structures are usually key to designing efficient algorithms. Some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design. Data can be stored and retrieved in both main memory and secondary memory.

Program-level Structure

Structured programming is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by making extensive use of subroutines, block structures, and for and while loops. This contrasts with using simple tests and jumps such as the goto statement, which can lead to "spaghetti code" that is difficult both to follow and to maintain.

It is possible to do structured programming in any programming language, though it is preferable to use something like a procedural programming language. Some of the languages initially used for structured programming include ALGOL, Pascal, PL/I and Ada. Most new procedural programming languages since that time have included features to encourage structured programming, and some have deliberately left out features, notably GOTO, in an effort to make unstructured programming more difficult.
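The contrast between tests and jumps and a structured loop can be sketched as follows. Python has no goto statement, so the first function only emulates jumps with an explicit label variable; both function names are hypothetical.

```python
def sum_to_n_jumps(n):
    """Emulates test-and-jump control flow: each branch sets the next label."""
    total, i = 0, 1
    label = "test"
    while True:
        if label == "test":
            label = "body" if i <= n else "done"   # conditional jump
        elif label == "body":
            total += i
            i += 1
            label = "test"                         # unconditional jump back
        else:  # "done"
            return total

def sum_to_n_structured(n):
    """The same computation written with a single structured loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total
```

Both functions compute the same sum, but in the first the order of execution has to be reconstructed by tracing labels, which is the difficulty the term "spaghetti code" refers to.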

Control-level Structure

Programs written in procedural languages, the most common kind, are like recipes, having lists of ingredients and step-by-step instructions for using them. The three basic control structures in virtually every procedural language are:
  • Sequence: combine the liquid ingredients, and next add the dry ones.
  • Conditional: if the tomatoes are fresh then simmer them, but if canned, skip this step.
  • Iterative: beat the egg whites until they form soft peaks.

Following the structured program theorem, all programs are seen as composed of three control structures:
  • "Sequence"; ordered statements or subroutines executed in sequence.
  • "Selection"; one or a number of statements is executed depending on the state of the program. This is usually expressed with keywords such as if..then..else..endif.
  • "Iteration"; a statement or block is executed until the program reaches a certain state, or until operations have been applied to every element of a collection. This is usually expressed with keywords such as while, repeat, for or do..until. It is often recommended that each loop have only one entry point; original structured programming also required a single exit point, and a few languages enforce this.
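The three control structures can be seen together in one short function. This is only an illustrative sketch; the function name sum_of_even is hypothetical.

```python
def sum_of_even(values):
    total = 0                  # sequence: statements run one after another
    for v in values:           # iteration: repeat for every element
        if v % 2 == 0:         # selection: run the next statement only for even values
            total += v
    return total

example = sum_of_even([1, 2, 3, 4])
```

For the list [1, 2, 3, 4] only 2 and 4 pass the test, so the result is 6.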

Subroutines

Subroutines are callable units such as procedures, functions, methods, or subprograms. They allow a sequence of statements to be referred to by a single statement.
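As a brief sketch, the hypothetical function mean below packages a short sequence of statements so that each later use is a single call.

```python
def mean(values):
    """Subroutine: the steps for averaging are written once."""
    total = sum(values)
    return total / len(values)

# Each call refers to the whole sequence above with one statement.
morning = mean([2, 4, 6])
evening = mean([10, 20])
```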

Blocks

Blocks are used to enable groups of statements to be treated as if they were one statement. Block-structured languages have a syntax for enclosing structures in some formal way, such as an if-statement bracketed by if..fi as in ALGOL 68, a code section bracketed by BEGIN..END as in PL/I, whitespace indentation as in Python, or the curly braces {...} of C and many later languages.
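In Python the block is marked by indentation alone, as the following sketch shows. The function name clamp is hypothetical.

```python
def clamp(values, limit):
    clamped = []
    changed = 0
    for v in values:
        if v > limit:
            # These two statements form one block controlled by the if.
            v = limit
            changed += 1
        # Back at the for loop's indentation: runs for every element.
        clamped.append(v)
    return clamped, changed

result = clamp([1, 5, 9], 4)
```

In C the same grouping would be written with curly braces around the two statements under the if.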