Introduction to Programming/About Programming

From Wikiversity

Introduction to computer programming language

What is a program?

A computer is a tool for solving problems with data.

A program is a sequence of instructions that tells a computer how to do a task. When a computer follows the instructions in a program, we say that it executes that program. You can think of a program as a recipe that tells you how to make a peanut butter sandwich. In this model, you are the computer, making a sandwich is the task, and the recipe is the program that tells you how to execute the task.

  • Activity: Come up with a sequence of instructions to tell someone how to make a peanut butter sandwich. Don't leave any steps out, or put them in the wrong order.

Was that easy? Did you remember all the steps? Maybe you forgot to tell me to use a knife to spread the peanut butter. Now I've got peanut butter all over my hands! Of course, you say, a person wouldn't be that dumb. But a computer is that dumb. A computer will only do what you tell it to do. This might make programming frustrating at first, but it's a relief in a way: if you do everything right, you know exactly what the computer is going to do, because you told it.
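The forgotten knife can be sketched in a few lines of Python. This is only an illustration: the step names and the function `follow` are made up for this example, and a real program would of course not make sandwiches.

```python
def follow(steps):
    """Follow the steps literally, in order, and report what happened."""
    holding_knife = False
    for step in steps:
        if step == "pick up knife":
            holding_knife = True
        elif step == "spread peanut butter" and not holding_knife:
            # Nobody said to use a knife, so the "computer" uses its hands.
            return "peanut butter all over my hands"
    return "finished all steps"

# Hypothetical recipes: the second one leaves out a step.
complete_recipe = ["get bread", "pick up knife", "spread peanut butter", "close sandwich"]
forgetful_recipe = ["get bread", "spread peanut butter", "close sandwich"]

print(follow(complete_recipe))   # finished all steps
print(follow(forgetful_recipe))  # peanut butter all over my hands
```

The function never guesses what you meant. It does exactly what the list says, in exactly that order.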

Of course, computers don't understand recipes written on paper. Computers are machines, and at the most basic level they are a collection of switches, where 1 represents "on" and 0 represents "off". Everything that a computer does is implemented in this most basic of all numbering systems: binary. If you really wanted to tell a computer what to do directly, you'd have to talk to it in binary, giving it coded sequences of 1s and 0s that tell it which instructions to execute. However, this is impractical for all but the tiniest programs. In practice, we use a programming language.
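To get a feel for binary, here is a small Python sketch that shows how a number and a letter look as 1s and 0s. The width of 8 digits is just a choice made for this example.

```python
number = 5

# Write the number using only 1s and 0s, padded to 8 binary digits.
as_binary = format(number, "08b")
print(as_binary)            # 00000101

# Convert the binary text back into an ordinary number.
print(int(as_binary, 2))    # 5

# Letters are stored as numbers too, so they also have a binary form.
letter_bits = format(ord("A"), "08b")
print(letter_bits)          # 01000001
```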


What is a programming language?

A programming language is, as the name would suggest, a language developed to express programs. All computers have a native programming language that they understand, commonly referred to as machine code. However, machine code is a difficult language for us to follow: amongst a number of difficulties, it is typically expressed in the binary number system, and it is unique to a particular computer architecture (thus two different computers could potentially use two different versions of machine code). Other programming languages, such as Assembly, BASIC, Java and C++ exist to provide a better interface between us, as the programmers, and the computer, by allowing programs to be expressed in a language that is easier for us to understand and potentially common to a number of computer architectures, but which can still be translated into machine code. In order for this to happen, a computer must either compile or interpret programs written in one of these languages before they can be executed.

Compilation vs Interpretation

A compiled program has to be translated into machine code before it is used. The resulting binary is then stored permanently, so the translation does not have to be repeated each time the program runs. As an analogy, think of a novel that was written in one language and then translated into another. For example, the Harry Potter novels were written in British English, and were then subsequently translated into 67 other languages, including Hindi, Latvian, and Latin. In much the same way, a computer program can be compiled (or "translated") into machine code, and it may potentially be compiled into different architectures (or "dialects") of machine code to suit different computers. Each translation will be a unique version of the program, in the same way that each translated book is a unique version of the original novel. To take the analogy further, if I were fortunate enough to have written the first Harry Potter novel, I might well not understand the language into which it was translated. Thus I could be given the Latvian version of the novel, and I could reasonably surmise that it tells the same story as the British English one, but I would be unable to read it. In the same sense, the version of my program that has been compiled into machine code might be impossible for me to read: it is said to be "machine-readable", in that the computer can understand it, but it is far from easily readable for humans.

An interpreted program is stored in a human-readable form. When the program is executed, an interpreter translates the human-readable content as it runs. This is analogous to the role that a human interpreter performs. For example, rather than translating the British English version of Harry Potter into Latvian and then providing the Latvian version to someone who understands the language (as per compilation), we could hire an interpreter who knows both British English and Latvian. The interpreter may choose to read each line from the British English novel, translate each line (one at a time) into Latvian, and, as each line is translated, relate it to the listener. The computer interpreter performs the same function: it reads an instruction in one programming language, translates it into machine code, and then executes the machine code version. Once that instruction is out of the way, it moves along to the next, performing exactly the same task, in much the same way that the interpreter of the Harry Potter novel would move on to the next line once the first has been related. Unlike with compiled programs, at no point is a complete, discrete machine code version of the program produced: at any point in time only a small number of instructions exist as machine code, and these are removed from the computer's memory when they are no longer required.

There are advantages to both approaches. As a generalization, compiled programs are faster to run but slower to develop. Compiled programs often run faster because the computer only needs to execute the previously translated instructions. With an interpreted language, the computer also needs to translate each instruction every time the program is run. This translation causes a delay, slowing the execution of the program. On the other hand, programs in interpreted languages can often be written in less time, because the languages are often simpler and the whole program does not need to be recompiled each time a new feature is tested for bugs.
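The difference can be sketched with a toy language. Everything here is invented for illustration: the two-word instructions ("add 2", "mul 5"), the function names, and the use of Python functions as a stand-in for machine code. Real compilers and interpreters are far more involved.

```python
import operator

# Hypothetical toy language: each line is "<operation> <number>" and
# updates a running total. This table is the whole "language definition".
OPERATIONS = {"add": operator.add, "mul": operator.mul}

def translate(line):
    """Translate one source line into a callable (our stand-in for machine code)."""
    name, argument = line.split()
    operation = OPERATIONS[name]
    value = int(argument)
    return lambda total: operation(total, value)

def interpret(source_lines, total=0):
    """Interpreter: translate and execute one line at a time, keeping nothing."""
    for line in source_lines:
        step = translate(line)   # translated again on every run
        total = step(total)
    return total

def compile_program(source_lines):
    """Compiler: translate every line up front and keep the result."""
    return [translate(line) for line in source_lines]

def run_compiled(steps, total=0):
    """Execute already-translated steps; no translation happens here."""
    for step in steps:
        total = step(total)
    return total

program = ["add 2", "mul 5", "add 1"]
print(interpret(program))        # 11

compiled = compile_program(program)
print(run_compiled(compiled))    # 11
print(run_compiled(compiled))    # 11 again, without retranslating
```

Both routes give the same answer. The compiled route pays the translation cost once and can then be run many times, while the interpreted route pays it on every run.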

Features of both approaches are discussed in more detail later.

Levels of programming language

Programming languages are described in levels. Low-level languages are close to machine code, while high-level languages are closer to natural languages. At the most basic level (or "lowest level") is assembly language. This language is a direct, human-readable rendering of the binary instructions the computer executes: each assembly language instruction corresponds directly to one in machine code. Thus, just as every kind of processor architecture has its own machine code, each also has its own assembly language.

Here's how to add two numbers in MIPS assembly:

DADDI R1, R0, #1   ; R1 = R0 + 1 = 1 (register R0 always holds zero)
DADDI R2, R0, #2   ; R2 = R0 + 2 = 2
DADD  R3, R1, R2   ; R3 = R1 + R2 = 3

This just did the calculation 1 + 2 = 3. Very roughly, the first two lines place the numbers 1 and 2 in registers (small, fast storage locations inside the processor), and the third instruction tells the computer to add the two values together and store the result in a third register.
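If registers are unfamiliar, the three steps can be mimicked in Python. This is a rough sketch only: the dictionary `registers` is a made-up stand-in for the processor's registers, not how a real processor is built.

```python
# Three hypothetical registers, all starting at zero.
registers = {"R1": 0, "R2": 0, "R3": 0}

registers["R1"] = 1                                   # put the number 1 in R1
registers["R2"] = 2                                   # put the number 2 in R2
registers["R3"] = registers["R1"] + registers["R2"]   # R3 = R1 + R2

print(registers["R3"])  # 3
```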

As you can see from the example, assembly language is quite dissimilar to natural languages. Higher level languages both get closer to natural languages and provide more efficient methods of expressing the instructions—thus to implement a given feature for a program in assembly language almost always requires more code than implementing the same feature in a higher level language such as C or C++. Assembly language gives the programmer the ultimate in flexibility and performance, at the expense of complexity and development time.

High-level languages look more like natural language with mathematical operations thrown in. These languages require more translation before the computer will understand them, but they are much easier to write. Here's what the same program might look like in a high-level language:

x = 1 + 2;

The variation between different programming languages can be quite extensive. Traditionally, the first program a programmer writes in a new language is Hello World—a simple program that outputs the phrase "Hello world!" (or a variation thereof) to the user. As will be clear from the following examples, even this seemingly simple task can be expressed in many different ways depending on which language is being used. For example:

In C:

#include <stdio.h>

int main() {
  printf("Hello World!\n");
  return 0;
}

In BASIC:

10 PRINT "Hello world!"
20 END

In Java:

public class HelloWorld
{
  public static void main(String[] args)
  {
    System.out.println("Hello world!");
  }
} 

In Pascal:

program HelloWorld;

begin
  writeln('Hello World');
end.

In Perl:

#!/usr/bin/perl
print "Hello World!\n";

In Python:

print("Hello World!")

Of these examples, C is considered to be the "lowest level" language, which is to say that it is the closest to machine code. The others are considered to be higher level languages, and thus are (generally) closer to natural languages. Some of the complexity you can see, especially with Java, derives from the way the programs are structured: Java is an object-oriented language, while Pascal, to pick but one, is a procedural language. However, this is something that we will need to address later.

  • Activity
    • Discuss why you think that there are so many different programming languages; why do you believe that a programmer would choose one over the other? It may be worth comparing this question to why there are so many different natural languages, and ask why someone who is multi-lingual might choose to use one natural language over another.

Translation schemes

Depending on the language used, and the particular implementation of the language used, the process to translate high-level language statements to actions may involve compilation and interpretation. Compilation is handled by a special program called a "compiler". Interpretation is a process in which language statements are realized into actions, and is conducted either by the CPU or by a special program called an "interpreter" or Virtual Machine.

Translating source code into machine code instructions is generally done under one of two schemes: the compiled scheme or the interpreted scheme. A common misunderstanding is that these schemes are properties of the language itself. Technically speaking, they are properties of an implementation of the language. Thus different implementations of the same language may compile or interpret code.

To further complicate matters, perfect classification of different schemes is difficult. To illustrate this, we will discuss the following examples below:

  • An implementation using multiple schemes
  • Multiple implementations using multiple schemes
  • An implementation changing their scheme
  • An implementation using a mixed scheme

The schemes themselves can be summarized as follows:

  • Compiled Scheme. A program's source code is compiled to machine code (a binary executable), which is then executed directly by the CPU. In practice, most compilers do not compile directly to machine code: some compile to C or another intermediate form, then to assembly, before producing machine code. This process is usually done transparently. Examples: GCC (C, C++), MSVC (C/C++), ghc (Haskell), fpc (Pascal), gpc (Pascal), V8 (JavaScript), Visual Basic 6 Native Code (BASIC)
  • Interpreted Scheme. A program's source code is interpreted line by line by a program called an interpreter. In practice, this scheme is very rarely used, because the interpreter has to reinterpret a line of source code every time that line is executed again (for example, inside a loop or other control structure), and because it is not very hard to move cleanly to one of the mixed schemes. Examples: PHP 3 and before (PHP), SpiderMonkey (JavaScript)
  • Mixed Scheme. This is a group of schemes in which source code is compiled to an intermediate form rather than directly to machine code. Note that some language implementations using this scheme (e.g. Sun Java) may selectively compile parts of the code to machine code at run time (Just-In-Time compilation) for optimization purposes.
    • Virtual Machine Scheme. A program's source code is compiled to a platform-independent byte-code file, which resembles machine code but cannot be run directly by the CPU. Instead, a special program called a virtual machine interprets the byte-code. Examples: javac and JVM (Sun Java), CPython (Python), PHP 4 and above (PHP), Visual Basic 6 P-code (BASIC), various .NET languages (C#, VB.NET).
    • Abstract Syntax Tree Scheme. A program's source code is compiled to an in-memory object equivalent of the source code. Thus, the source code is only read once and the program is run from this tree. Examples: perl (Perl), pugs (Perl)

Also note that all languages are theoretically able to be implemented in every scheme. However, each language fits a certain scheme better than other schemes.
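CPython, mentioned above under the virtual machine scheme, lets you peek at both intermediate forms using its standard `ast` and `dis` modules. The following is a small sketch, assuming a standard CPython installation. The one-line source string is just an example, and the exact byte-code listing differs between Python versions, so no listing is shown here.

```python
import ast
import dis

source = "x = 1 + 2"

# Abstract syntax tree: an in-memory object equivalent of the source text.
tree = ast.parse(source)
print(type(tree).__name__)           # Module
print(type(tree.body[0]).__name__)   # Assign

# Byte-code: CPython compiles the source to a code object...
code = compile(source, "<example>", "exec")
dis.dis(code)                        # prints the byte-code instructions

# ...and its virtual machine then executes that byte-code.
namespace = {}
exec(code, namespace)
print(namespace["x"])                # 3
```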

Some shortsighted programmers consider interpreted languages to be inferior because they are slower than compiled languages. In fact, because of their dynamic nature, interpreted languages can be substantially more flexible, easier to learn, and faster to develop in. Sometimes you can have your program weeks sooner for the small price of it being a fraction of a second slower, especially on fast modern computers. In some limited cases, an interpreted language's core functions are so highly optimized that an average-quality program in a compiled language can be slower than one written in the interpreted language. In other cases (such as applications that use a database or network access), most of your application's time is spent waiting for data or the network, and using a "faster" compiled language will not make much difference.

In fact, the general rule is: don't choose a language only because it's faster. Otherwise, you'd be better off using assembly language directly. Choose a language you're comfortable with, one that looks "sexy" to you. It will be your companion throughout your programming work, so choose well. Try several languages out before making a definitive choice.

It would be nice to have one language to "rule them all", but we don't. The fact is that many different languages exist for different purposes. A few languages fulfill a wide variety of programming needs, but a language that covered all of them would be a huge undertaking, and it would become cumbersome and fragmented as programmers customized it to their own needs. In fact, many languages have been customized in just this way and released as independent packages, simplified for a smaller audience and streamlined for one area. Lua, for example, is widely used for scripting in games. While it can be used outside that context, that is where it is likely to be most at home.

Why should we create computer programs?

What is it that makes computer programming useful or pointless? What are the goals, common problems, and solutions related to computer programming?

Optional reading: History of programming—what problems and tasks motivated the invention of programmable machines?

Writing a program without knowing what problem you are trying to solve is the equivalent of swinging a knife without knowing what it is that you want to cut.

Code with a purpose. This is how:

Basic goals of computer programming

When you are planning to create a computer program you should:

  • Ensure your program fulfills the need it is addressing.
  • Ensure that people can easily use your program.
  • Ensure that it is easy to understand, fix, and improve your program without a major time investment.
  • Improve your problem solving through logic.
  • Develop a sound understanding of programming principles.
  • Demonstrate basic concepts and software applications.
  • Recognize and respect ethical codes of practice.

Common problems

These are very common mistakes programmers make:

  • Your program does not do a job any better than an available alternative program. At best, you're re-inventing the wheel.
  • Your program does not work as intended.
  • Your program is too complicated or too primitive to be useful to most people.
  • Other people, or you yourself at a later time, can't understand the code behind your program. This means your project won't grow.

Solutions

  • Utility and Usability
    • Survey the field to look for programs similar to what you intend to develop. Determine what you like and don't like about those programs. Figure out ways to improve those programs.
    • Find source code for similar programs to what you want to make if at all possible. Unfortunately, people often write a lot of inferior code and leave it floating around the Internet. Don't be one of those people. Only release useful, well-written programs.
    • Thoroughly plan what you want your program to do before you start working on it. One way to do this is to make a flowchart of what you want your program to do.
    • Aim towards the maximum functionality your program can achieve while maintaining minimal complexity. Think of the iPod, matchstick, doorknob and television. They are simple yet effective.
    • If possible, talk to people who might use your program to get an idea of what features they would want or expect.
    • Remember the 80–20 rule: 20% of the features in most programs are used 80% of the time. If you can successfully identify even a few of the seldom used features and leave them out, you can save yourself some time and frustration.
  • Maintainability
    • Make it easy for someone to look at your program's source code and understand what's going on.
    • Use comments when you're doing somewhat complicated things in your program, or when in doubt.
    • Prefer readable code over code that squeezes out small memory or performance gains. Although this may not always give the best result in the short run, code that is easy to understand is also easier to make faster later.
    • Use simple, easy-to-understand names for variables. Common conventions are to write a variable as variable_name or variableName, depending on the language.
    • Look at examples of source code, and especially source code intended as an example of proper programming form. Recognize the structure they're using to make things simple.
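
As a small illustration of the maintainability advice above, compare two versions of the same calculation in Python. Both functions and their names are invented for this example.

```python
# Hard to follow: what are f, a, and b?
def f(a, b):
    return a * b / 100

# Easier to follow: the names and the comment say what the values mean.
def percentage_of(amount, percent):
    """Return `percent` percent of `amount`."""
    return amount * percent / 100

print(f(200, 15))              # 30.0
print(percentage_of(200, 15))  # 30.0
```

The computer treats both versions the same way. The difference is only for the people who have to read the code later, including you.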

Make programming easy for you and everybody else

Computers and computer programs are intended to make our lives easier, not more complicated.

Programmers have figured out a very good way of doing things: Free Software.

Free Software is software that its author has released together with its source code, which anybody is free to look at, modify, improve, and incorporate into their own software. Just like a wiki, this is an easy, efficient and useful way of doing things.

To make programming easier for all coders, you should:

  • Release your code under a free software license such as the GNU General Public License (GPL)
  • Take advantage of Free Software to create more Free Software
  • Make your code readable and efficient

How does one become a programmer?

A programmer is generally tasked with producing a solution to a problem via a computer program, which can be reused as various particular instances of that problem arise. To accomplish this task, the programmer therefore must be able to understand the problem, derive an appropriate solution, and then be able to effectively express that solution in a computer programming language.

The first prerequisite to becoming a programmer is to develop a strong background in problem solving as is often required in fields such as mathematics and engineering. The ability to formally, concisely, and clearly state a problem and then derive its solution is a fundamental skill that precedes the process of any computer coding. The study of computer algorithms focuses on the generalized problem solving approaches often applicable in computer science. Techniques such as programming design patterns provide practical templates which often help frame solutions in an understandable, reusable manner. In general, there are many different classes of problems in computer science that require different techniques to solve effectively: having a strong repertoire of approaches aids in finding good solutions.

The second prerequisite to becoming a programmer is knowing a programming language with which to express a particular solution to a problem. A programming language is a tool, which practice will help hone into expertise. Learning multiple languages can often be helpful as the process will demonstrate the particular strengths of different approaches to solving particular classes of problems[1].

In both these areas, practice is essential. Exposure to a wide variety of problems and tools to solve them will increase your understanding of the field as a whole, help you judge for yourself the relative merits of its techniques, and help you discover new approaches to the problems of computer science. As with the practice of any new skill, do not be afraid to make mistakes, to redo earlier work, or to stumble along the way to expertise.

Those unfamiliar with programming may think it is like a write-up, written linearly from beginning to end. That is not the case. Instead, it rather resembles sculpting.

Don't be afraid to make mistakes, because you will make them, constantly and for the rest of your career. Even the best programmers make mistakes regularly (if they say otherwise, they are either lying or deluded). Making mistakes is part of the programming process. As someone once said, to find a solution to a problem you must partly solve the problem, so you'll probably end up rewriting parts of your programs several times. Part of what distinguishes better programmers is their ability to catch at least some mistakes before they become a problem. Nobody can catch them all without help. In fact, nobody can catch them all: there is no bug-free program.

Additionally, it is good coding practice to write a program as small modules, one at a time. This provides an isolated testing environment for each module. Just be sure that everything is planned carefully from the outset, so that the small modules are not "orphaned" from other modules that they need in order to run. When this approach is used, take care to lay the foundation first, making careful note of the "dependencies" within the program: first code the modules on which other modules rely, and then code the higher-level modules once the foundations have been thoroughly tested. Unit testing is a good programming practice. Some programming processes, such as XP (eXtreme Programming), recommend writing tests before the actual code.
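As a rough sketch of what unit testing a small module can look like, here is an example using Python's standard `unittest` module. The function `average` and its tests are made up for illustration, and other languages have their own testing tools.

```python
import unittest

def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not numbers:
        raise ValueError("average() needs at least one number")
    return sum(numbers) / len(numbers)

class AverageTests(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([1, 2, 3]), 2)

    def test_single_value(self):
        self.assertEqual(average([5]), 5)

    def test_empty_list_is_rejected(self):
        with self.assertRaises(ValueError):
            average([])

# Run the three tests and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

Because `average` is small and does not depend on anything else, it can be tested in isolation before any higher-level code is built on top of it.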

Learn to work with others in a team environment. People have different strengths and weaknesses, and a good team brings together people whose strengths and weaknesses are complementary. The members of a team can teach each other. Most importantly, teams can tackle problems that would be too big for an individual. Beware, though: teams are not easy to lead, and you can easily turn a project into a disaster by just gathering as many people as you can. If you want to work with a team, come with a solid design and a strong vision of what the project should look like when it's finished. If other members of the team suggest new ideas, give them due consideration, but do not fall into the feature-greed pitfall, where your project is never finished because the final goal grows beyond all proportion. As the XP process suggests, make small and frequent releases, and prefer running code to vague and abstract design. Also, be careful how you go about this: you may not always agree with your team, and that's okay!

As you continue to practice the art, take some time here and there to learn about the underlying theory that makes up the field commonly referred to as Computer Science.

Footnotes

  1. For further information on programming languages and how they vary, see Scott, M. L. (2005). Programming Language Pragmatics. Morgan Kaufmann Publishers Inc.
