**Introducing Machine Learning**

Dino Esposito & Francesco Esposito — Microsoft Press, January 2020

Back in 1900, German mathematician **David Hilbert** set a number of challenges for fellow mathematicians to take on in the new century. The leitmotif was the question *Can all mathematical statements be expressed and manipulated through a set of well-defined rules?* Hilbert’s goal was to find a way to formalize all known mathematical reasoning in much the same way **Euclid of Alexandria** did so well for his time (between the 4th and 3rd centuries BC). Hilbert’s ultimate purpose was to obtain a set of axioms that could generate all mathematical statements.

In 1931, **Kurt Gödel** proved two theorems of mathematical logic that the community interpreted as a negative answer to Hilbert’s fundamental question. In particular, the two theorems lay the foundation for the following statement:

In any consistent formal system expressive enough to model the arithmetic of natural numbers, there is at least one undecidable statement: a statement that is true but that can’t be proven true or false within the axioms of the system.

In addition, **Gödel** proved that even if one axiomatically assigns a value of *true* or *false* to the undecidable statement, any further reasoning will inevitably lead to another undecidable statement.

Why are Gödel’s theorems so crucial to formal reasoning and, by extension, to software as an artificial form of intelligence?

For one thing, Gödel’s incompleteness theorems draw a line beyond which mathematical logic simply can’t go: there are things that nobody can prove using formal reasoning. On the other hand, though, Gödel’s theorems demonstrate that **within the limits of a consistent formal system** any reasoning can always be expressed as a set of formal transformation rules and then, in some way, mechanized. This second aspect is extremely relevant for artificial intelligence as it **sets the theoretical foundation for mechanical, computer-based reasoning**.

Great, but how do we formalize human thought?

History offers many great examples of calculating machines concretely built, or just devised, by polymaths and scientists. One was devised by **Leibniz** in the 17th century and a more detailed one was theorized by **Charles Babbage** in the 19th century. In the modern era, the ancestors of today’s computers were the cipher and codebreaking machines employed during the Second World War. Examples are the German **Enigma** cipher machine, its breaking counterpart named the **Bombe** (built with a great contribution from Alan Turing), the German Army’s **Lorenz** machine and the giant British machine called **Colossus**, which ultimately broke it. **ENIAC** is the name of another machine built in the United States in the final days of the Second World War and used early on for calculations for the Los Alamos nuclear weapons program. All these machines were based on the theoretical foundation laid by the **Church-Turing** thesis, the practical spinoff of Gödel’s incompleteness theorems. The development of ENIAC, in particular, was led by John Mauchly and J. Presper Eckert, with another big name of computer science, **John von Neumann**, contributing as a consultant.

Not coincidentally, Alan Turing and John von Neumann are considered the fathers of what we today call Artificial Intelligence. Imagine now, for a second, being in the shoes of either of those two great men.

You’re in the 1950s and you know you can build machines to compute anything you can express through a consistent grammar of symbols, not just calculate numbers from numbers. You would probably feel like a god; you would probably foresee, somewhere far ahead of you but clearly identifiable, a machine that can behave in much the same way humans do. Then you would probably ask the crucial question: *Can machines think?*

Alan Turing devised a test to determine whether a machine could think. He imagined a teletype conversation involving a human, a machine and a human judge. If the machine could answer questions and convince the judge that it was the human, then the machine could be said to be able to think. Over the years, many have contested the effectiveness of the Turing test. In particular, **John Searle** (Professor Emeritus of the Philosophy of Mind and Language at Berkeley) noted that anybody with a proper dictionary and instructions written in their own language could probably provide an answer in, say, Chinese that would make full sense to a Chinese judge. Does that mean that the answerer (human or machine) understands Chinese?

Searle’s point about thinking machines is that machines can merely process symbols according to rules, but this is **not enough to reach the peaks** of human consciousness, cognition and perception, or even human language skills. According to Searle, language is more than plain symbol manipulation, and the “more” is precisely what defines human thought. Computers can only compute, but they can do it so accurately and so fast as to **be even better than humans** at some specific tasks. In accordance with Searle’s view, today’s machine learning systems are expected to operate only in highly controlled scenarios, within the realm of business rules and data patterns, where they are challenged to anticipate issues and events. Think, for example, of systems that predict hardware faults or detect financial fraud. All these modern systems may perform very well in their contexts, but **our human idea of “thinking”** requires (much) more computing power to be truly (although partially) replicated.

And it also requires a theory that just doesn’t exist yet and that nobody has any (known) idea of how to find, no matter what doomsday scenarios non-technical people sell to the media.