How are programming languages developed?

Software infrastructure

The development of programming languages is closely tied to a computer's machine language. As the name suggests, this language consists of commands that a machine can execute. The machine in question is a microprocessor built into the computer (the CPU, the hard disk controller, and so on). Each processor obeys its own set of machine instructions. These are combined into a binary machine program that the computer executes instruction by instruction. Such programs are difficult for humans to read and even more difficult to write.

To make this work easier, programming languages, text editors and translation programs were invented. The first such programming language appeared around 1948 and is called assembly language. The programmer writes an assembly language program in a text editor and saves it as a text file. The computer cannot execute this file directly. That requires an auxiliary program called an assembler. It translates the complete file, the so-called source code, into a machine program (see the following figure). Only this binary machine program can be executed by the computer.
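The translation step described above can be sketched with a toy assembler. The instruction set here (LOAD, ADD, STORE, HALT), its opcode numbers and the one-byte operand format are all invented for this illustration and do not correspond to any real processor; the function name assemble is likewise an assumption.

```python
# Hypothetical 8-bit instruction set, invented for this illustration only.
OPCODES = {
    "LOAD": 0x01,   # load a value into the accumulator
    "ADD": 0x02,    # add a value to the accumulator
    "STORE": 0x03,  # store the accumulator at a memory address
    "HALT": 0xFF,   # stop the machine (takes no operand)
}


def assemble(source: str) -> bytes:
    """Translate assembly source text into a binary machine program."""
    program = bytearray()
    for line_number, line in enumerate(source.splitlines(), start=1):
        # Drop comments (everything after ';') and surrounding whitespace.
        line = line.split(";", 1)[0].strip()
        if not line:
            continue
        mnemonic, *operands = line.split()
        mnemonic = mnemonic.upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"line {line_number}: unknown instruction {mnemonic!r}")
        program.append(OPCODES[mnemonic])
        if mnemonic == "HALT":
            if operands:
                raise ValueError(f"line {line_number}: HALT takes no operand")
            continue
        if len(operands) != 1:
            raise ValueError(f"line {line_number}: {mnemonic} needs one operand")
        value = int(operands[0], 0)  # base 0 accepts decimal and 0x.. hex
        if not 0 <= value <= 0xFF:
            raise ValueError(f"line {line_number}: operand does not fit in a byte")
        program.append(value)
    return bytes(program)


SOURCE = """
LOAD 2      ; accumulator = 2
ADD 3       ; accumulator = 2 + 3
STORE 0x10  ; write the result to address 0x10
HALT
"""

machine_program = assemble(SOURCE)
print(machine_program.hex(" "))  # prints: 01 02 02 03 03 10 ff
```

The readable text file goes in, and a sequence of bytes comes out. A real assembler additionally handles labels, addressing modes and the file format of the target system.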

Assembler source code is easier to understand than a machine program. It consists of a series of very fine-grained commands, and it takes many of them to implement a larger program function. Even a simple Hello World program is therefore significantly longer than its counterpart in a high-level language such as Pascal (see the following figure). Writing a large assembly language program takes a long time. The program is also tied to a specific processor and is harder to read than one written in a high-level language such as Pascal. These disadvantages are offset by the fact that well-written assembler programs often run faster and usually need less main memory and disk space.

Despite the undisputed advantages of assembly language, its disadvantages, namely poor readability and maintainability, low developer productivity and, above all, dependence on one specific piece of hardware (the microprocessor), were so severe that the first high-level languages were developed just a few years later. Fortran and COBOL were among them. IBM began developing Fortran in 1954 with the aim of creating a powerful, hardware-independent language for scientific purposes. The name says it all: Fortran is short for Formula Translation, which describes the core idea of the language very well, the translation of formulas.

While assembler programs are translated into a machine program by an assembler, Fortran uses a so-called compiler for this. The different name already indicates that this translation program goes well beyond what an assembler does. A compiler not only translates the source code of a high-level language into machine code. It also optimizes the program for maximum speed and minimum memory requirements.

In other words, the compiler is designed to generate a highly efficient machine program. Why was that so important? In the 1950s, computers had slow processors and extremely little memory compared to today's machines. If compilers had not been able to generate highly efficient machine programs, the first high-level programming languages would probably not have caught on so quickly.
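One of the simplest optimizations of this kind is constant folding: the compiler computes constant subexpressions once during translation, so the finished program does not have to compute them on every run. The following is a minimal sketch that uses Python's standard ast module on a single expression (ast.unparse requires Python 3.9 or later). It is not how the early Fortran compilers were built, and the names ConstantFolder and optimize are chosen for this illustration.

```python
import ast
import operator

# Arithmetic operators this sketch knows how to evaluate at "compile time".
FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class ConstantFolder(ast.NodeTransformer):
    """Replace operations on two numeric constants with their result."""

    def visit_BinOp(self, node):
        # Fold the operands first, so nested constant expressions collapse.
        self.generic_visit(node)
        fold = FOLDABLE.get(type(node.op))
        if (
            fold is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(fold(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def optimize(expression: str) -> str:
    """Parse one expression, fold its constants, and return the new source."""
    tree = ast.parse(expression, mode="eval")
    tree = ConstantFolder().visit(tree)
    return ast.unparse(ast.fix_missing_locations(tree))


print(optimize("60 * 60 * 24 * days"))  # prints: 86400 * days
```

The two multiplications of constants disappear from the program, and only the multiplication by the variable remains. Production compilers combine many such transformations.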

From assembler to the first high-level languages

The basic idea that led to Fortran also inspired the programming language COBOL in the late 1950s. Here, too, the focus was on hardware independence and on the domain problem to be solved. Unlike Fortran, COBOL was intended not for scientific programs but for business programs, hence the name.

The abbreviation stands for "Common Business Oriented Language". COBOL leans heavily on natural language and, unlike Fortran, is designed for processing large amounts of data. The language became one of the most widely used programming languages soon after its introduction and is still widely used today.

Fortran and COBOL initially had a number of shortcomings that were remedied only gradually. Programs written in them were often poorly structured and difficult to maintain. As applications grew more complex over the years, development took longer and longer, and various projects failed because they exceeded their budgets.

The growing number of failed projects triggered the first software crisis in the mid-1960s. Various approaches were taken to overcome this crisis. In addition to improved development processes and the introduction of tried-and-tested programming libraries, new programming languages were created that were meant to make software development more cost-effective.

Pascal and C deserve particular mention here. Niklaus Wirth developed Pascal in 1971 on the basis of the programming language Algol 60. At first the language was rarely used for commercial applications. Instead, it spread widely at colleges and universities because it was well suited to teaching structured programming.

Another advantage of Pascal is its strict typing. This means that every variable is assigned a fixed data type at compile time, and that type cannot be changed afterwards. These features of Pascal allow programs to be structured more cleanly and let the compiler catch many errors before the program ever runs. This promotes the development of easily maintainable, robust programs.
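The idea of a type check that happens before the program runs can be simulated in a few lines. Python itself is dynamically typed, so the sketch below only imitates what a Pascal compiler does. The mini-language (a table of declarations plus a list of assignments) and the name check_assignments are assumptions made for this illustration.

```python
# Hypothetical mini-language for illustration: each variable is declared once
# with a fixed type, and every later assignment must match that type.
DECLARATIONS = {"count": int, "name": str}


def check_assignments(declarations, assignments):
    """Return the type errors found before any assignment is executed."""
    errors = []
    for variable, value in assignments:
        if variable not in declarations:
            errors.append(f"{variable}: not declared")
            continue
        declared = declarations[variable]
        if type(value) is not declared:
            errors.append(
                f"{variable}: expected {declared.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


# The third assignment tries to put text into an integer variable.
program = [("count", 42), ("name", "Pascal"), ("count", "forty-two")]

for error in check_assignments(DECLARATIONS, program):
    print(error)  # prints: count: expected int, got str
```

The faulty assignment is reported without running any of the program, which is the benefit the text attributes to strict typing.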

Around the same time as Pascal, Dennis Ritchie developed the programming language with the minimalist name C at Bell Labs. It is based on its predecessor, the programming language B, hence its name. C was created to improve the programming of the Unix operating system and accordingly spread quickly in system programming.

Because C is a general-purpose programming language, it has also established itself in application development. In keeping with their purpose, C programs are geared toward portability and efficiency. Thanks to the simple syntax and mature compilers, they usually run very fast. The flip side is a number of security-critical features that make it harder to develop easily maintainable, robust programs.