A natural language is a medium of communication between human beings. Natural languages such as Hindi, Punjabi and English are used to communicate our ideas and emotions to each other. Similarly, a computer language is a means of communication between people and the computer. With the help of a computer language, a programmer tells a computer what he wants it to do. All natural languages use a standard set of symbols for the purpose of communication. These symbols are understood by everyone using that language. We normally call this set of symbols the vocabulary of that particular language. For example, the words we use in English are the symbols of the English language that make up its vocabulary. Each word has a definite meaning which can be looked up in a dictionary. In a similar manner, all computer languages have a vocabulary of their own. Each symbol of the vocabulary has a definite, unambiguous meaning which can be looked up in the manual meant for that language. Hence, each symbol of a computer language is used to tell the computer to do a particular job. The main difference between natural languages and computer languages is that natural languages have a large vocabulary, whereas most computer languages use a very limited or restricted vocabulary. Hence, every problem to be solved by a computer has to be broken down into discrete (simple and separate), logical steps which basically comprise four fundamental operations.
Every natural language has a systematic method of using its symbols. English, for example, uses rules of grammar for constructing sentences. These rules tell us which words to use and how to use them. Similarly, the symbols of a particular computer language must be used as per set rules, which are known as the syntax rules of the language. In the case of a natural language, people can use poor or incorrect vocabulary and grammar and still make themselves understood. Computers, however, being machines, are receptive only to exact vocabulary used correctly as per the syntax rules of the language being used.
Why Programming Languages?
The main purpose of using computers is to solve problems which would otherwise be difficult to solve. It must thus be possible for a computer user to concentrate on the development of good algorithms for solving problems rather than be concerned with the details of the internal structure of the computer. This situation is analogous to that which confronts the user of a motor car. From the user’s point of view a motor car is meant to take him from one place to another. If one has to know all about how the engine of the motor car works before one can drive it, not many persons will be able to drive a car. Driving must be made as simple as possible with minimum number of controls so that it would be easy for anyone to drive and fulfill the objective of owning a car. A general knowledge of how the motor car works is useful if a person wants to be a good driver and reduce wear and tear and petrol consumption. It is, however, not essential for a driver to be a good mechanic. Similarly, it is desirable, but not essential, for a programmer to know in detail how a computer works.
Computer programming languages are developed with the primary objective of enabling a large number of people to use computers without the need to know in detail the internal structure of the computer. Languages are matched to the type of operations to be performed in algorithms for various applications. Languages are also designed to be machine-independent. In other words, the structure of a programming language should not depend upon the internal structure of a specific computer. Ideally, one should be able to execute a program on any computer regardless of who manufactured it or what model it is.
Programming languages have improved throughout the years, just as computer hardware has improved. They have progressed from machine-oriented languages that use strings of binary 1s and 0s to problem-oriented languages that use common mathematical and/or English terms.
The development of programming languages can be divided into four distinct generations :
1st Generation Language : Machine Languages (1940-50)
2nd Generation Language : Assembly Language (1950-58)
3rd Generation Language : High Level Languages (1958-85)
4th Generation Language : 4-GLs (1985 onwards)
Machine Language (1st Generation)
A computer system can be programmed to understand many computer languages, but only one language is understood by the computer without any type of translation. This language is called the machine language or the machine code of the computer. So, we can say that machine language is the language directly understood by a computer. Machine language is normally written as strings of binary 1s and 0s; in other words, it is the binary language. The circuitry of a computer is designed to recognize the machine language immediately.
Any information or instruction in this language is represented in terms of 0s and 1s, the symbol 0 standing for the absence of an electric pulse and 1 for the presence of an electric pulse. As a computer is able to recognise the presence or absence of an electric pulse, it is able to understand the machine language. For example, a sequence of 0s and 1s such as 01110001 has a specific meaning for a computer, although it may appear as an ordinary binary number to us. Writing programs in machine language is therefore a very complicated and difficult task. Only experts can use this language.
In machine language, the instruction format is divided into two parts. The first part is the operation code, which tells the computer what function to perform. This is also called the ‘opcode’. The second part of the instruction is the operand, which tells the computer the location of the data or of another instruction on which the operation will be performed. Each instruction tells the control unit of the CPU what to do, and the length and location of the data fields that are involved in the operation. Typical operations include reading, adding, subtracting, writing and so on.
Machine language consists of strings of binary numbers and is the only language that the CPU understands directly. For example, a program instruction consisting of binary numbers (1s and 0s) is as under :
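The two-part instruction format can be illustrated with a minimal Python sketch. The layout used here (an 8-bit word whose high 4 bits are the opcode and whose low 4 bits are the operand) is an assumption made only for illustration; real machines use many different word lengths and layouts. The name `decode` is likewise hypothetical.

```python
# Hypothetical 8-bit instruction word: high 4 bits = opcode, low 4 bits = operand.
OPCODE_BITS = 4
OPERAND_BITS = 4

def decode(word: int) -> tuple[int, int]:
    """Split an 8-bit instruction word into (opcode, operand)."""
    if not 0 <= word < 2 ** (OPCODE_BITS + OPERAND_BITS):
        raise ValueError("word does not fit in 8 bits")
    opcode = word >> OPERAND_BITS                # keep the high bits
    operand = word & ((1 << OPERAND_BITS) - 1)   # mask off the low bits
    return opcode, operand

opcode, operand = decode(0b01110001)
print(f"opcode={opcode:04b} operand={operand:04b}")  # opcode=0111 operand=0001
```

Under this assumed layout, the bit string 01110001 mentioned earlier would be read as operation 0111 applied to location 0001.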
The program to add two numbers in memory and print the result might look like the following :
Advantages and Limitations of Machine Language
- Programs written in machine language can be executed very fast by the computer. This is mainly because machine instructions are directly understood by the CPU and no translation of the program is required.
- Small in size
- All programs are combinations of 1s and 0s
- Machine dependent : The internal design of every type of computer is different and needs different electrical signals to operate, so the machine language for each type of computer is different. It is determined by the actual design of the ALU, the control unit and the word length of the memory unit. To shift a program from one machine to another, the complete program has to be rewritten.
- Difficult to program : Creating a program in machine language is very difficult. The programmer has to remember dozens of code numbers and also has to keep track of the storage locations of data. Knowledge of the hardware structure is a must.
- Error prone : A programmer cannot reliably memorize a large number of opcodes, so the chances of errors in programming are high. Thus, a machine language program is error prone to write.
- Difficult to modify : Modifying a machine language program is difficult. Locating the instructions that contain errors is as difficult as writing a new program. A programmer often prefers to write a new program instead of correcting the old machine language program.
Assembly Language (2nd Generation)
The purpose of introducing simple machine language programs was to illustrate the difficulties posed by this language while writing even very trivial programs. Some of the difficulties arising in the use of machine language can be overcome by using Assembly Language for writing programs.
The first step in improving the program preparation process was to substitute letter symbols (mnemonics) for the numeric operation codes of machine language. A mnemonic is any kind of mental trick we use to help us remember. Mnemonics come in various shapes and sizes. For example, a computer is designed to interpret the machine code 1111 (binary) or 15 (decimal) as the operation ‘subtract’, but it is easier for a human being to remember it as SUB.
A computer can be taught to recognize certain combinations of letters or numbers. It can be taught to substitute the number 14 every time it sees the symbol ADD, substitute the number 15 every time it sees the symbol SUB, and so forth. In this way, the computer can be trained to translate a program written with symbols instead of numbers into the computer’s own machine language. Then we can write programs for the computer using symbols instead of numbers, and have the computer do its own translating. This makes it easier for the programmer, because he can use letters, symbols and mnemonics instead of numbers for writing his programs. Example of a program adding two numbers and printing the result :
Step  Instruction  Explanation
00    CLA          Clear accumulator
01    INP A        Input the number A in memory from input device
02    INP B        Input the number B in memory from input device
03    LDA A        Load accumulator with the number A
04    ADD B        Add number B to the contents of the accumulator
05    STA C        Store contents of accumulator in memory location C
06    TYP C        Display the result C through the output device
07    HLT          Stop or halt the execution of the instructions
This can also be programmed as
which could mean: take “A”, add “B”, store the result in “C”, type “C” and halt. The computer translates each line of this program into the corresponding machine language code.
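To make the meaning of each mnemonic concrete, the eight-step program listed earlier can be traced with a small Python simulation of an accumulator machine. This is only a sketch: the function `run`, the dictionary used as memory and the way input values are supplied are assumptions for illustration, not a description of any real computer.

```python
def run(program, inputs):
    """Execute a list of (mnemonic, operand) pairs on a toy accumulator machine."""
    memory = {}              # symbolic memory locations such as A, B, C
    acc = 0                  # the accumulator
    pending = list(inputs)   # values waiting at the input device
    output = []              # values sent to the output device
    for mnemonic, operand in program:
        if mnemonic == "CLA":
            acc = 0
        elif mnemonic == "INP":
            memory[operand] = pending.pop(0)
        elif mnemonic == "LDA":
            acc = memory[operand]
        elif mnemonic == "ADD":
            acc += memory[operand]
        elif mnemonic == "STA":
            memory[operand] = acc
        elif mnemonic == "TYP":
            output.append(memory[operand])
        elif mnemonic == "HLT":
            break
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
    return output

PROGRAM = [
    ("CLA", None), ("INP", "A"), ("INP", "B"), ("LDA", "A"),
    ("ADD", "B"), ("STA", "C"), ("TYP", "C"), ("HLT", None),
]
print(run(PROGRAM, [2, 3]))  # [5]
```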
Functioning of Assembler
The language which substitutes letters and symbols for the numbers in a machine language program is called an Assembly Language or Symbolic Language. A program written in symbolic language, using symbols instead of numbers, is called an assembly code or a symbolic program. The translator program that translates an assembly code into machine code is called an Assembler. The Assembler is a system program which is written by system programmers. Along with translating the assembly code into machine code, it also assembles the machine code in the main memory and makes it ready to execute. An assembly program is called a source program and the assembler converts it into an object program (machine program). The function of an assembler is to convert each instruction of an assembly language program into its corresponding machine language instruction. The correspondence between the assembly instructions of the source program and the machine instructions of the object program is one-to-one. This can be seen in the example of the sum of two numbers: the first statement in Assembly language is “CLA A” and the corresponding instruction in machine language is “take A”, the second is “add B”, and so on as discussed above.
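The one-to-one translation performed by an assembler can be sketched in a few lines of Python. Only the codes 14 for ADD and 15 for SUB come from the text; every other opcode value, the `data_start` parameter and the function name `assemble` are assumptions chosen for illustration.

```python
OPCODES = {
    "ADD": 14, "SUB": 15,            # values quoted in the text
    "CLA": 1, "LDA": 2, "STA": 3,    # assumed values, for illustration only
    "INP": 4, "TYP": 5, "HLT": 0,
}

def assemble(source, data_start=8):
    """Translate (mnemonic, operand) pairs into (opcode, address) pairs, one-to-one."""
    symbols = {}           # symbol table: symbolic name -> numeric address
    object_program = []
    for mnemonic, operand in source:
        if mnemonic not in OPCODES:
            raise ValueError(f"undefined mnemonic: {mnemonic}")
        if operand is None:
            address = 0
        else:
            # The first use of a symbol allocates the next free data location.
            address = symbols.setdefault(operand, data_start + len(symbols))
        object_program.append((OPCODES[mnemonic], address))
    return object_program, symbols

source = [("LDA", "A"), ("ADD", "B"), ("STA", "C"), ("HLT", None)]
object_program, symbols = assemble(source)
print(object_program)  # [(2, 8), (14, 9), (3, 10), (0, 0)]
print(symbols)         # {'A': 8, 'B': 9, 'C': 10}
```

Each source statement yields exactly one object instruction, which is the one-to-one correspondence described above.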
An instruction in assembly language consists of four parts :
- A label
- An operation code
- An operand
- A comment
An instruction in assembly language is written on a single line, called a statement, and is divided into three or four parts called fields, depending upon whether the comment field is present or not.
These fields are separated from each other by separating characters, which may be colons, blank spaces or semicolons. These separators, also called delimiters, precede or follow the fields. The general form of a three-field assembly language instruction can be represented as
LABEL : Operation Code Operand
Here colons and blank spaces are used as delimiters.
The general form of a four-field assembly language instruction can be represented as
LABEL : Operation Code Operand ; Comment
LABEL : The label field represents the symbolic address of an instruction. However an instruction may or may not have a label field, depending upon whether this instruction is to be referred again or not in a program.
Operation Code : The operation code is an instruction mnemonic. The mnemonics have definite functions to perform and must be used for the particular purpose.
Operand : The operand consists of the symbolic address of a memory location. The operand field consists of alphanumeric characters. The memory location containing the value of a variable A1 is designated as memory location A1. The variable name A1 is the symbolic address of the memory location containing the value of the variable A1. The operand field must be separated from other fields by a space or by a semicolon.
Comment : The comment field is used to provide additional information such as defining a register or starting address of a program etc. This field is not converted to machine language.
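The field layout described above can be checked with a short Python sketch that splits one statement into its label, operation code, operand and comment. It assumes the delimiters given in the text (a colon after the label, blanks between operation code and operand, a semicolon before the comment); the function name `parse_statement` and the sample statement are hypothetical.

```python
def parse_statement(line: str) -> dict:
    """Split 'LABEL : OPCODE OPERAND ; comment' into its fields."""
    code, _, comment = line.partition(";")      # the comment field is optional
    label = None
    if ":" in code:
        label, _, code = code.partition(":")    # the label field is optional
        label = label.strip()
    parts = code.split()
    if not parts:
        raise ValueError("statement has no operation code")
    return {
        "label": label,
        "opcode": parts[0],
        "operand": parts[1] if len(parts) > 1 else None,
        "comment": comment.strip() or None,
    }

print(parse_statement("LOOP : ADD A1 ; add A1 to the accumulator"))
```

A statement such as `HLT`, with no label, operand or comment, is parsed with those three fields set to `None`.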
Advantages of Assembly Language (Over Machine Language) :
- Easier to understand and use : Assembly language is easier to understand and use because mnemonics are used instead of numeric op-codes and suitable names are used for data. The program itself is understandable. It saves the time and effort of the programmer because it is easier to write than machine language.
- Easy to modify a program : Assembly Language programs are easier to modify than Machine Language programs. This is mainly because they are easier to understand, and hence it is easier to locate, correct and modify instructions as and when desired. Moreover, insertion or removal of certain instructions does not require changes in the address part of the instructions, as is required in the case of machine language.
- Easy error correction : Locating errors in Assembly Language is easy as compared to machine language. Assemblers are designed to catch errors automatically. If we use an invalid mnemonic that has never been defined, the assembler will print out an error indication. For example, suppose an instruction in the symbolic program reads ADD AREA, and we forget to define what AREA is. The assembler will look through its table to find AREA and will indicate an error if it is not found.
- Easy to relocate the address : Address relocation is easy in Assembly Language programming because the location of a program is changed merely by changing its first instruction. This is not easily done in Machine Language programming. For example, suppose an Assembly Language program starts at address 1000 and we suddenly find that another program to be used with it also starts at location 1000. In Assembly Language, we merely have to change the first statement; for example, instead of :
START PROGRAM AT 1000 AND START DATA AT 2000
we merely change this first statement to
START PROGRAM AT 3000 AND START DATA AT 4000
and run the symbolic program once more through the assembler. The equivalent machine language program will this time start at memory location 3000 instead of 1000, and there will be no conflict with the other program. In other words, using symbolic language we can easily move programs from one section of the memory to another. In machine language, this can be a complicated job.
- No worry about addresses : A major advantage of Assembly Language over machine language is that it eliminates worry about the addresses of instructions and data. Suppose we have written a long machine language program involving many steps and many references to itself within the program, such as looping and address modifications, and at the very end we suddenly discover that an instruction has been left out in the middle. To insert that instruction, we will have to renumber all the following instructions and their references as we go through the entire program. In the case of a program written in symbolic form in Assembly Language, we can add any extra instruction and the assembler will take care of these steps automatically.
- Efficiency of machine language : An assembly language program gets the efficiency of its corresponding machine code because every assembly language instruction converts into machine language instruction. For every machine language instruction, there is a corresponding symbolic instruction and for every symbolic instruction (except the pseudo-instruction) there is a corresponding machine instruction. So, leaving the translation time taken by assembler, the actual execution time of an assembly language program and its equivalent machine language program will be the same. There is one-to-one relationship between symbolic and machine languages.
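The address bookkeeping and relocation described in the list above can be sketched with a two-pass label resolver in Python. The `origin` argument plays the role of the START statement; the mnemonics JMP and DAT, the label names and the function `resolve_labels` are hypothetical and used only for illustration.

```python
def resolve_labels(statements, origin):
    """Assign an address to each statement and resolve label references."""
    # First pass: record the address of every labelled statement.
    labels = {}
    for offset, (label, _, _) in enumerate(statements):
        if label is not None:
            labels[label] = origin + offset
    # Second pass: replace symbolic operands with numeric addresses.
    resolved = []
    for offset, (_, mnemonic, operand) in enumerate(statements):
        if operand is not None and operand not in labels:
            raise ValueError(f"undefined symbol: {operand}")
        target = labels.get(operand)
        resolved.append((origin + offset, mnemonic, target))
    return resolved

PROGRAM = [
    ("LOOP", "ADD", "ONE"),
    (None, "JMP", "LOOP"),    # JMP is a hypothetical jump mnemonic
    ("ONE", "DAT", None),     # DAT is a hypothetical data word
]
print(resolve_labels(PROGRAM, 1000))
print(resolve_labels(PROGRAM, 3000))
```

Changing only the origin from 1000 to 3000 moves every instruction and every reference together, and an operand that was never defined (such as AREA in the earlier example) is reported as an error.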
Limitations of Assembly Language
Following are the limitations of Assembly Language :
- Machine Dependent : Each instruction in Assembly Language is translated into exactly one machine language instruction. The language is therefore designed for the specific make and model of computer processor being used. A change to another computer requires learning a new language or converting old programs into new ones. Assembly language becomes very expensive due to this machine dependency.
- Knowledge of hardware required : Assembly Languages are machine dependent, so the programmer must be aware of the characteristics of the particular machine being used. A programmer must know how his machine works and should have a good knowledge of the logical structure of his computer in order to write a program in assembly language.
- Machine level coding : In Assembly Language programming, instructions are still written at the machine-code level because one assembler instruction is substituted for one machine-code instruction.
High-Level Language (3rd Generation)
In the early days of computers, machines were slow and had a small memory. Thus programming efficiency was very important and assembly language was dominant. The use of computers was also limited to a small group of scientists. With improvements in technology, computers were designed with larger memory capacity, higher speed and improved reliability. The tremendous potential of computer applications in diverse areas was foreseen. It was evident that this potential could be realized only if a non-expert user could effectively use the computer to solve problems. It was thus clear that a user should be concerned primarily with the development of appropriate algorithms to solve problems of interest to him and not with the details of the internal logical structure of a computer. Consequently a good notation to express algorithms became an essential requirement. It would be ideal if an algorithm written in a natural (spoken) language such as English were translated to machine language automatically by the computer and executed. This is not possible because natural languages are not precise or unambiguous. The interpretation of the meaning of a natural language sentence also depends on the context. For example, the sentence “Give me a book” may mean either give me a book to read or a book on the telephone depending on the context.
Writing programs in machine language or assembly language requires a deep knowledge of the internal structure of the computer. While writing programs in any of these languages, a programmer has to remember all the operation codes (numeric or mnemonic) of the computer and know in detail what each code does and how it affects the various registers of the computer. To write a good program, however, a programmer should mainly concentrate on the logic of the problem rather than be concerned with the details of the internal structure of the computer. High-level languages were developed to enable programmers to use computers without the need to know in detail the internal structure of the computer.
High-level languages, instead of being machine based, are oriented more towards the problem to be solved. These languages enable the programmer to write instructions using English words and familiar mathematical symbols. So it becomes easier for him to concentrate on the logic of his problem rather than getting involved in programming details. For example, let us consider the problem of adding two numbers (A and B) and storing the sum in SUM. Using a high-level language, say FORTRAN for instance, to instruct the computer to do this job, only one instruction need be written :
SUM = A+B
The instruction is obviously very easy to understand and write because it resembles the familiar algebraic notation for adding two numbers : a=b+c.
High-level languages are basically symbolic languages that use English words and/or mathematical symbols rather than mnemonic codes. In other words, a high-level language is a symbolic language with nothing but macro-instructions. Every instruction which the programmer writes in a high-level language is translated into many machine language instructions. This is a one-to-many translation, not one-to-one as in the case of assembly language. It is for this reason that high-level languages are so called.
High-level languages are generally easier to learn and write. These languages are also known as problem-oriented languages because the macro-instructions are especially picked to be useful for solving particular types of problems. Each such language is then best suited to solving a particular class of problems and may be completely useless for solving other types of problems. For example, if a high-level language is capable of handling business-type applications that consist of high input volume, relatively little processing, and a high output volume, then the language is a business-oriented language. On the other hand, languages excellent at performing sophisticated computations but not adept at handling large data files are mathematically-oriented languages. Thus, a problem-oriented language is designed in such a way that its instructions may be written more like the language of the problem.
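The one-to-many nature of this translation can be sketched in Python. The function below is a toy, not a real compiler: it handles only sums of names, and it assumes the accumulator-style mnemonics LDA, ADD and STA used in the assembly example earlier. The name `translate_assignment` is hypothetical.

```python
def translate_assignment(statement: str) -> list[str]:
    """Expand 'TARGET = X+Y+...' into accumulator-style instructions."""
    target, _, expression = statement.partition("=")
    terms = [term.strip() for term in expression.split("+")]
    if not target.strip() or not all(terms):
        raise ValueError(f"cannot translate: {statement!r}")
    instructions = [f"LDA {terms[0]}"]                      # load the first term
    instructions += [f"ADD {term}" for term in terms[1:]]   # add the remaining terms
    instructions.append(f"STA {target.strip()}")            # store the result
    return instructions

print(translate_assignment("SUM = A+B"))  # ['LDA A', 'ADD B', 'STA SUM']
```

The single high-level statement expands into three lower-level instructions in this sketch; a real translator would typically produce more.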
High Level languages may be divided into three categories :
- Procedure Oriented Language
- Problem Oriented Language
- Interactive Programming Language
High Level Languages are sometimes classified as
- General Purpose Language
- Special Purpose Language
Procedure Oriented Language : In a procedural language, the problem is viewed in a sequential manner, as inputs, processing and outputs. A program in a procedural language is a list of instructions where each statement tells the computer to do something. The focus is on the processing, that is, on the algorithm needed to perform the desired computation. Under this model, the programmer decides which procedures are wanted and uses the best algorithms available.
Procedure-oriented programming basically consists of writing a list of instructions for the computer to follow, and organising these instructions into groups known as functions. A number of functions are written to accomplish the tasks of reading, calculating and printing. The primary focus is on functions. Languages support this model by providing facilities for passing arguments to functions and returning values from functions (subprograms).
Procedure oriented languages are useful for some special applications. For example, Cobol is a procedure oriented language which is used extensively in business applications. Other examples of procedure oriented languages are Fortran and PL/1.
Characteristics of Procedural Programming
- Procedural programming uses a top-down approach in program design.
- The emphasis is on procedures (algorithms).
- Large programs are divided into smaller programs known as functions, and most of the functions share global data.
- Data move openly around the system from function to function.
- Functions transform data from one form to another.
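These characteristics can be seen in a minimal Python sketch written in the procedural style. The data (a list of marks) and all function names are hypothetical; the point is only the shape of the program: shared global data, separate functions for reading, calculating and printing, and a main sequence of calls.

```python
# Global data shared by all functions, as is typical of the procedural style.
marks = []

def read_marks(values):
    """Reading step: load the input into the shared list."""
    marks.clear()
    marks.extend(values)

def calculate_average():
    """Calculating step: transform the shared data into a single result."""
    return sum(marks) / len(marks)

def print_report():
    """Printing step: format the result for output."""
    line = f"Average of {len(marks)} marks: {calculate_average():.1f}"
    print(line)
    return line

# Top-down design: the main routine is just a sequence of calls.
read_marks([70, 80, 90])
print_report()  # Average of 3 marks: 80.0
```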
Problem Oriented Languages : Any language in which it is easier to write a solution to a particular problem than in assembly language is called a problem oriented language. By this definition, any current programming language is problem oriented. A problem oriented language attempts to meet processing requirements with minimal programming effort, allowing the user to focus on what results are desired rather than on the individual steps needed to get those results.
Interactive Programming Language : Interactive programming languages allow the user to interact with the program in a conversational fashion. These languages are quite useful especially in computer aided design/computer aided manufacture(CAD/CAM). Basic, Pascal and APL are typical examples of interactive languages.
General Purpose Language : General purpose languages are suitable for any type of application, e.g. Basic and Pascal.
Special Purpose Language : Special purpose languages are used only in special types of applications, e.g. COBOL (Common Business Oriented Language) is used for business applications and LISP (List Processor) is used for Artificial Intelligence programming.
Characteristics of a Good Programming Language :
In general, we have two types of languages : low-level and high-level. A programmer prefers one language over another. One obvious reason is the area of application and another is the characteristics of the language. Some characteristics of programming languages that affect the programmer's choice are as below :
- Simplicity : Programming languages that are simple and easy to learn and use are liked by many programmers. For example, BASIC is used by many programmers because of its simplicity. Thus, a language should provide a programmer with a clear, simple and unified set of concepts which can be easily grasped. However, the efficiency of a programming language should not be sacrificed for the sake of simplicity.
- Efficiency : Efficiency is certainly a major element in the evaluation of any programming language. A programming language should be such that its programs are efficiently translated into machine code, are efficiently executed, and occupy as little space in memory as possible. This is the responsibility of the system programmer who designs the language.
- Environment suitable : A programming language must be suitable to its environment of applications. For example, a language designed for real time applications must be interactive in nature and a language designed for data processing may be operative in batch mode.
- Compactness : A programming language should be able to express intended operations concisely, since this is one of the fundamental reasons for having it. A language with lack of compactness can tax the programmer’s sheer writing stamina and thus reduce its usefulness. COBOL is generally not liked by many programmers because of this reason.
- Locality : A programming language should be such that while writing a program, a programmer need not jump around visually as the text of the program is prepared. COBOL lacks locality because data definitions are separated from processing statements, perhaps by many pages of code.
- Naturalness : A language should be natural for its application area. It should be problem oriented. It should provide appropriate operators, data structures, control structures, and a natural syntax in order to facilitate the users to code their problem easily and efficiently. FORTRAN and COBOL are good examples of scientific and business languages respectively that possess high degree of naturalness.
- Structured : Structured means that the language should have the necessary features to allow its users to write their programs based on the concepts of structured programming. This property of a language greatly affects the ease with which a program may be written, tested and maintained. It helps a programmer to approach his problem in a logical way and results in fewer errors. PASCAL is an example of a structured language.
- Extensibility : A good programming language should also allow extension through simple, natural and elegant mechanisms. Almost all languages provide subprogram definition mechanisms for this purpose, but there are some languages that are rather weak in this aspect.
This section deals with the basic characteristics of the class of programming languages called Non-procedural Languages. These programming languages are also known as very high level languages. Other general names given to this class of programming languages are :
- Less procedural languages.
- Goal oriented languages.
- Problem oriented languages.
Non-procedural languages are used to solve a problem with the help of a computer without writing the detailed procedural steps. In this type of programming, the user states the goal or objective to be achieved and is not concerned with how this objective is achieved. In other words, while writing programs in these languages, the objective of the program is entered into the computer along with the relevant data, and no procedural steps, or only a very few, are provided. However, it is not possible to state that a given programming language is non-procedural in an absolute sense, because the term non-procedural is a relative term. If a particular programming language needs very few procedural steps, it may be said to have non-procedural features.
In general, a procedure is a series of steps followed in a regular, orderly and specific manner (i.e. these steps are arranged in a sequential manner) to solve a problem. A procedure is a must when writing computer programs in a procedural language.
However, a computer program in a non-procedural language is a prescription for solving a problem without the bother of developing the procedure. Even if a few procedural steps are to be included in a program, these steps need not be in sequential order.
For example, the statement
COMPUTE CUBE ROOTS OF INTEGERS FROM 1 TO 100 AND PRINT IN TWO COLUMNS
is a non-procedural language statement, because the objective to be achieved is provided along with the data. The programmer need not bother about the procedure that the computer uses to obtain the required result. The machine itself invokes the necessary procedure to achieve the required goal or objective. Some of the commonly used non-procedural languages are :
- SETL
- STRESS
- ECARP
- COMIT
- SNOBOL
- RPG
- Database query languages
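For contrast with the single non-procedural statement about cube roots given earlier, the following Python sketch spells out the steps a procedural language would require for the same task. The function name `cube_root_table`, the column widths and the choice of four decimal places are assumptions made for illustration.

```python
def cube_root_table(first=1, last=100):
    """Return the lines of a two-column table of cube roots."""
    # Step 1: compute each cube root explicitly.
    entries = [f"{n:3d} {n ** (1 / 3):8.4f}" for n in range(first, last + 1)]
    # Step 2: split the entries into a left and a right column.
    half = (len(entries) + 1) // 2
    left, right = entries[:half], entries[half:]
    # Step 3: pair the columns up line by line.
    lines = []
    for row, left_entry in enumerate(left):
        right_entry = right[row] if row < len(right) else ""
        lines.append(f"{left_entry}    {right_entry}".rstrip())
    return lines

for line in cube_root_table():
    print(line)
```

Every step here (the loop, the arithmetic, the column layout) is the programmer's responsibility, whereas the non-procedural statement leaves all of it to the system.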
Advantages of non-procedural languages : The non-procedural languages possess the following advantages over the procedural languages :
- Writing programs in these languages is easier, because the programmer has only to state the results needed and provide the requisite input data.
- Programming in these languages is highly productive because the user does not have to spend time developing procedures.
- Programs can be written and executed even by laymen, as no knowledge of the procedures is needed.
- These languages are easier to learn because of the fact that instructions used are simple and similar to the ones used in our day to day spoken languages. These languages are the most friendly to the users.
Limitations of non-procedural languages :
- The programmer does not get an insight into the problem and procedure adopted for solution.
- The program modification is not possible.
- The programs when written in these languages for one computer can’t be executed by other machines. In other words, programs are not portable.
Fourth Generation Languages (4GLs)
The Fourth Generation languages, also called 4GLs, are highly user-friendly languages, which means they are easy for users to understand and to write programs with.
Advances in computer hardware made computers simpler machines to handle. These advances also necessitated the development of better computer languages so that the advanced features of computers could be fully utilised.
The efforts to develop better computer language resulted in development of Fourth Generation languages. The development of these languages relieved the programmers from the cumbersome and tedious burden of remembering the syntax of the high-level computer languages and developing procedures to solve the problems with the help of computers.
The fourth generation languages are non-procedural languages and hence are highly user-friendly. The user has to state the problem, provide the data and state the output required. The computer systems supporting these languages invoke appropriate procedures to solve the problem and provide the desired output.
In fact the non-procedural nature of these languages is the reason of their popularity. The procedural part, if present in programs written in 4GLs, is about one-tenth of what is required in the same program written in third generation languages such as Basic, Fortran and Cobol etc. This low volume of procedural part increased the productivity in the ratio 10: 1. For easy use, most of the 4GLs are menu driven languages and provide interactive menus. A menu is a list of optional facilities, which can be chosen by the user in order to carry out different functions in a system. These optionally available facilities are displayed on a terminal screen in the form of a menu. Once the user has made a choice to avail a particular facility to solve a problem, the computer automatically executes a suitable program to deliver the required results.
The major advantage of these languages is that even a novice in the field of computers can solve fairly complicated problems.
Major characteristics of 4 GLs
The major characteristics of fourth generation languages are discussed below in detail.
Precise-Nature : The 4 GLs are very precise in nature, which means programs written in these languages need very few instructions. A 100-line program written in Cobol can be replaced by a program of about 5 to 10 lines in a fourth generation language.
Non-Procedural : Like all non-procedural languages, programs written in these languages need either no procedural steps or very few. Hence, it is quite simple to write programs in these languages.
Structure-Independent : Programs written in 4 GLs are structure independent, which means instructions may be written in any order. This eliminates the need to write instructions in sequential order, as is the case with most high level languages. The computer performs the assigned task in whatever order these instructions may be written.
Components of a 4GL : The reader must be wondering how 4GL programs or other non-procedural language programs are capable of performing the assigned tasks even when no procedures are stated. Programs written in these languages achieve the assigned targets because the required procedures are built in and can be invoked as and when required.
A typical 4 GL has the following components :
- A data dictionary.
- A query language that interrogates a database or data bank.
- A report generator, which automatically executes a program to produce a printed report.
- A data base management system.
- Statistical analysis tools
- Financial analysis tools
- Decision support tools.
- A Graphic manipulator
- A screen generator.
- A menu.
- A dialogue generator.
- A word processor.
- A spreadsheet.
- Communication interfaces.
All these components available as separate packages are integrated in a typical 4 GL.
Some popular 4 GLs
Some of the commonly used 4 GLs are listed below.
- Ramis II
- Nomad 2
- Database query languages, e.g. SQL
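The non-procedural character of a database query language can be illustrated with SQL run through Python's standard `sqlite3` module. This is only a sketch: the `employees` table, its columns and its rows are hypothetical data invented for the example.

```python
import sqlite3

# In-memory database with a hypothetical 'employees' table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary REAL)")
connection.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "Sales", 52000), ("Ravi", "Sales", 48000), ("Meena", "Accounts", 61000)],
)

# Non-procedural request: state the desired result, not the steps to compute it.
QUERY = "SELECT dept, AVG(salary) FROM employees GROUP BY dept ORDER BY dept"
report = connection.execute(QUERY).fetchall()
print(report)  # [('Accounts', 61000.0), ('Sales', 50000.0)]
```

The query states what is wanted (the average salary per department); the grouping, looping and arithmetic are carried out by procedures built into the database system.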