What is supervised learning?
Supervised learning is a class of machine learning in which humans give a computer data in the form of input-output pairs. The term “supervised” refers to human supervision: you can think of a teacher who checks the work and tells the computer how well its prediction of the output, made from the inputs, fits the already known output. In other words, for every input in the data we already know which output it should lead to.
For example, suppose you would like the computer to “decrypt” the handwriting of many different people. You give the computer thousands of scanned images of handwriting and, after the computer has processed the images, you tell it “this is an ‘A’”. So the computer has to learn that the handwritten images you gave it represent an “A”.
Then you give the computer another set of images; the computer has to guess, and you “tell” it that the output is now a “B”, and so on. You do that for every letter, with thousands of handwriting samples per letter, until the computer gets very good at it. Then you test the computer: based on what it has learned, you ask it to guess which letters are shown in new handwriting samples that it has never seen before. If the computer does well enough (i.e. its answers are very accurate and it makes only a small percentage of errors), you can then give it new handwriting to decrypt.
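To make the handwriting example more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the “scanned images” are tiny 3x3 grids of invented pixels, there are only three letters, and the learning method (pick the label of the most similar training image) is just one simple choice among many, not the way real handwriting recognition is necessarily done.

```python
# Toy "handwriting": each sample is a 3x3 image flattened to 9 pixels (1 = ink).
# The pixel patterns are invented for illustration; real scans are far larger.
TRAINING_SET = [
    ((0, 1, 0, 1, 1, 1, 1, 0, 1), "A"),
    ((0, 1, 0, 1, 1, 1, 1, 0, 0), "A"),  # another person's "A"
    ((1, 1, 0, 1, 1, 1, 1, 1, 0), "B"),
    ((1, 1, 0, 1, 1, 0, 1, 1, 0), "B"),
    ((1, 1, 1, 1, 0, 0, 1, 1, 1), "C"),
    ((0, 1, 1, 1, 0, 0, 0, 1, 1), "C"),
]


def distance(a, b):
    """Count the pixels that differ between two images (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))


def predict(pixels, training_set=TRAINING_SET):
    """Guess the letter by copying the label of the most similar training image."""
    _, label = min(training_set, key=lambda example: distance(pixels, example[0]))
    return label


# New handwriting the computer has never seen: an "A" with one pixel missing.
new_sample = (0, 1, 0, 1, 1, 1, 0, 0, 1)
print(predict(new_sample))  # prints "A"
```

The labels in TRAINING_SET play the role of the teacher: they are the known outputs that the human supplies for each input image.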
The data on which the computer gets trained is called a training dataset or training set. Training on this set, under supervision, is the learning process. To be more accurate, it is actually an algorithm written by humans and implemented as code, i.e. a computer program, which learns from this training set. The algorithm runs on the computer, so by analogy we say that the computer “learns”.
The computer “learns” via its learning algorithm. This algorithm tells it how to “learn” and from where: in this case, from the training dataset, where the data are stored as inputs and outputs.
The goal for the computer is to infer (i.e. deduce) the correct output variable from the input variables, both of which are entered in the training set. Between the input variables and the output variable there is a function called a mapping function, and the task of the computer, or rather of its learning algorithm, is to come as close as possible to this function, i.e. to infer the output from the inputs as well as possible. When this is achieved, i.e. when we consider that the performance of the learning algorithm is good enough, meaning that its approximation of the mapping function is good enough, the learning process ends. Then comes the testing part.
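The idea of approximating a mapping function can be sketched with a very small example. The assumptions here are mine: the training set is five invented input-output pairs that follow the rule output = 2 × input + 1, and the learning algorithm is ordinary least squares for a straight line, which is only one of many possible ways to approximate a mapping function.

```python
# Training set of (input, output) pairs. The hidden rule is output = 2 * input + 1,
# chosen purely for illustration; the algorithm is not told this rule.
training_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def fit_line(pairs):
    """Least-squares fit of output ~ slope * input + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(pairs, slope, intercept):
    """Average squared gap between predicted and known outputs."""
    return sum((slope * x + intercept - y) ** 2 for x, y in pairs) / len(pairs)


slope, intercept = fit_line(training_set)
error = mean_squared_error(training_set, slope, intercept)
print(slope, intercept, error)
```

Here the error on the training set measures how close the learned line is to the known outputs; “good enough” corresponds to this error being small enough for the task.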
The testing part consists of testing the computer with a new input, i.e. an input which was not in the training set. Once the computer has learned, it should be able to make very good guesses, called predictions, about the outcome of a new input. The goal of the computer is now to apply the approximation of the mapping function that it learned during training to the new input and infer an output.
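As a rough sketch of the testing part, the Python snippet below keeps a separate test set that the learning step never sees and measures how often the predictions match the known outputs. The data points, the labels “low” and “high”, and the nearest-neighbour prediction rule are all made up for illustration.

```python
import math

# Training set: the computer learns from these (features, label) pairs.
train = [
    ((1.0, 1.0), "low"), ((1.5, 2.0), "low"), ((2.0, 1.0), "low"),
    ((6.0, 6.0), "high"), ((7.0, 5.5), "high"), ((6.5, 7.0), "high"),
]

# Test set: new inputs that were never entered in the training set.
# Their true labels are kept aside only to check the predictions.
test = [
    ((1.2, 1.4), "low"), ((6.8, 6.1), "high"),
    ((2.2, 1.8), "low"), ((5.9, 6.4), "high"),
]


def predict(features, training_set):
    """Predict the label of the closest training example (1-nearest neighbour)."""
    _, label = min(training_set, key=lambda ex: math.dist(features, ex[0]))
    return label


def accuracy(test_set, training_set):
    """Fraction of test inputs whose predicted label matches the known label."""
    correct = sum(predict(f, training_set) == label for f, label in test_set)
    return correct / len(test_set)


print(accuracy(test, train))  # 1.0 on this toy data
```

On real data the accuracy is rarely perfect; the point is that it is measured on inputs the computer did not learn from.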
The data collected in the training set consist of knowledge of the same kind gathered in a database, for instance all the legal cases since 1950. Another example: if we enter into a computer millions of chess games played over the past 100 years, with all the moves that led to a win or a loss, the computer could analyze this database to find the most reliable move to make in a given position in order to win. Another example would be in the healthcare sector. Imagine a database of every patient diagnosed with cancer all over the world over the past 50 years: their vital signs and lab values (e.g. blood pressure, antibody levels or concentration of iron in their blood, …), main symptoms preceding the disease (e.g. headaches, tiredness, sore throat, …), type of cancer, time until the cancer developed, treatment, outcome of the treatment, outcome of the cancer and how long it took, and all other pertinent information related to the disease.
Now imagine such a database entered in a single computer. The computer could then become better than any single doctor at diagnosing, treating or even preventing specific types of cancer. Why? Let’s come back to what I wrote previously. The inputs, i.e. the variables recorded for each patient diagnosed with cancer, differ depending on the patient’s condition (e.g. whether she has diabetes, her previous history of cancer, number of cancer cells, stage and evolution of the disease, treatment applied, etc.) and lead to different outputs (cancer cured or not cured, how long it took, etc.). The treatment for this cancer might have been adjusted to each patient’s particular condition and might or might not have worked. The computer would then learn all these associations (e.g. patient’s condition, type of cancer, treatment, outcome, …) from millions and millions of cases, and therefore learn how to come up with the best outcome in terms of treatment for a particular patient’s condition. For instance, it could make the best guess as to whether chemotherapy or surgery would be more appropriate and more successful for a new patient recently diagnosed with lung cancer who also suffers from, say, diabetes. Achieving such performance means that the computer has learned to come very close to the mapping function and to approximate the best output based on a very large number of inputs (all the patients diagnosed with cancer all over the world over the past 50 years, in our example). This is already applied in Korea, where doctors use an AI supercomputer as a “medical doctor colleague”, loaded with a database of more than 12 million research papers and cancer medical cases, to help new patients diagnosed with cancer receive the right treatment with the best outcome. Of course, the doctors discuss each new case together and, based on the output of the supercomputer, come up with a decision.