Artificial intelligence (AI) is simultaneously portrayed as the harbinger of the future and a fantastical notion relegated to science fiction. The truth is, AI is already woven into our daily routines. The term “AI” itself, however, is an umbrella covering several distinct approaches, and understanding those distinctions is crucial.
When Google DeepMind’s AlphaGo program triumphed over Go master Lee Se-dol, headlines proclaimed the victory of AI, machine learning, and deep learning. While all three terms were invoked to explain AlphaGo’s success, they represent distinct yet interconnected concepts.
Imagine AI, machine learning, and deep learning as a set of nested circles. AI, the oldest and most encompassing idea, forms the largest circle. Machine learning, a subsequent development, resides within AI. Finally, deep learning, the engine driving today’s AI revolution, sits at the very center, nested within both AI and machine learning.
The Evolution of AI: From Promise to Present Powerhouse
The concept of AI has captivated imaginations and fueled research labs since its formal inception at the Dartmouth Conference in 1956. For decades, AI’s trajectory swung wildly between utopian visions and dismissal as unrealistic hype. Until around 2012, the reality was a mixture of both—limited practical applications alongside immense theoretical potential.
However, the landscape dramatically shifted, particularly after 2015, marking an explosive resurgence of AI. This boom is largely attributed to the widespread availability of powerful GPUs (Graphics Processing Units), which significantly accelerated parallel processing, making it faster, cheaper, and more efficient. Concurrently, the explosion of Big Data – vast amounts of digital information including images, text, and transactional data – provided the fuel AI algorithms needed to learn and evolve.
Let’s trace the journey from AI’s somewhat underwhelming past to its current transformative power, driven by breakthroughs in deep learning.
Artificial Intelligence: Mimicking Human Cognitive Abilities in Machines
Back in that pivotal 1956 conference, the founding aspiration of AI was to create complex machines capable of replicating human intelligence. This ambitious goal is often termed “General AI”—envisioning machines with human-level consciousness, reasoning, and sensory capabilities. General AI, the realm of science fiction icons like C-3PO and the Terminator, remains largely theoretical. Despite decades of research, achieving true General AI is still a distant prospect.
In contrast, “Narrow AI” represents the practical reality of AI today. Narrow AI focuses on technologies designed to excel at specific tasks, often surpassing human capabilities in those defined domains. Examples of Narrow AI in action include image classification systems powering platforms like Pinterest and facial recognition technology used by Facebook.
These Narrow AI applications demonstrate aspects of human intelligence, but how is this “intelligence” achieved? This leads us to the next layer: machine learning.
Machine Learning: Empowering AI Through Data-Driven Learning
At its core, machine learning employs algorithms to analyze data, learn from it, and subsequently make informed decisions or predictions about the world. Instead of relying on explicit, hand-coded instructions for every task, machine learning algorithms are “trained” using vast datasets. This training process allows the machine to autonomously learn how to perform specific tasks.
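To make the contrast with hand-coded rules concrete, here is a minimal sketch using scikit-learn’s DecisionTreeClassifier. The tiny feature set and labels are invented purely for illustration; the point is that the classifier derives its own rules from examples rather than having them written out by a developer.

```python
# A minimal sketch of "learning from data" with scikit-learn.
# The features and labels below are made up for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Each row describes one image region: [octagonal?, red?, has_white_letters?]
# Labels: 1 = stop sign, 0 = not a stop sign.
X = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
y = [1, 1, 0, 0, 0]

# No hand-written rules: the classifier infers them from the examples.
model = DecisionTreeClassifier()
model.fit(X, y)

print(model.predict([[1, 1, 1]]))  # -> [1], predicted to be a stop sign
```

Swap in more data and richer features and the same pattern scales up; the algorithm may change, but the “train on examples, then predict” workflow stays the same.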
Machine learning evolved directly from the early AI research community. Over the years, various algorithmic approaches emerged, including decision tree learning, inductive logic programming, clustering, reinforcement learning, and Bayesian networks. However, these earlier machine learning techniques fell short of achieving General AI and even struggled to deliver consistently reliable Narrow AI applications.
To delve deeper into deep learning, explore episode 113 of our AI Podcast featuring NVIDIA’s Will Ramey.
The AI Podcast · Demystifying AI with NVIDIA’s Will Ramey – Ep. 113
Computer vision long stood out as one of the most promising application areas for machine learning, yet early computer vision systems still required extensive manual coding. Developers would painstakingly create hand-coded classifiers, such as edge detection filters to identify object boundaries, shape detection algorithms to recognize geometric forms, and character recognition to decipher text. These hand-crafted classifiers were then integrated into algorithms aimed at interpreting images and “learning” to recognize objects like stop signs.
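To give a rough sense of what “hand-coded” means here, the sketch below applies a Sobel edge-detection filter using SciPy. The threshold and helper name are arbitrary choices for illustration; real systems of that era chained many such hand-tuned filters and rules together.

```python
# A rough sketch of a hand-coded classifier building block:
# Sobel edge detection applied to a grayscale image with SciPy.
import numpy as np
from scipy.ndimage import sobel

def edge_map(image: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a binary map marking strong edges in a grayscale image."""
    gx = sobel(image, axis=0)            # horizontal gradient
    gy = sobel(image, axis=1)            # vertical gradient
    magnitude = np.hypot(gx, gy)         # edge strength at each pixel
    magnitude /= magnitude.max() + 1e-8  # normalize to [0, 1]
    return magnitude > threshold         # arbitrary cutoff, chosen by hand

# Downstream, a developer would still have to write more rules by hand to
# decide whether the detected edges form an octagon, letters, and so on.
```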
While functional, these early systems were far from perfect. Performance suffered significantly under challenging conditions, such as fog or partial obstructions. The inherent limitations of these brittle, error-prone systems explain why computer vision lagged far behind human capabilities until recent breakthroughs.
The crucial missing pieces were time and the right learning algorithms – enter deep learning.
Deep Learning: Revolutionizing Machine Learning with Neural Networks
Another algorithmic approach originating from the early days of machine learning, artificial neural networks, experienced periods of both promise and neglect over decades. Neural networks draw inspiration from our understanding of the human brain’s biological structure – the complex web of interconnected neurons. However, unlike biological brains with flexible neural connections, artificial neural networks are typically structured in discrete layers with defined connections and data flow directions.
In a typical deep learning process for image recognition, an image is divided into tiles, which are then fed into the first layer of the neural network. Each neuron in the first layer processes its input and passes the information to the subsequent layer. This layered processing continues until the final layer produces the desired output.
Each neuron assigns a “weight” to its input, representing its relevance to the task. The final output is determined by aggregating these weighted inputs. Consider the stop sign example again. A neural network analyzing a stop sign image examines various attributes – its octagonal shape, red color, distinctive letters, standard size, and motion (or lack thereof). The network’s goal is to determine whether the image is indeed a stop sign. It generates a “probability vector,” essentially a highly informed guess based on the weighted attributes. For example, the system might express 86% confidence that it’s a stop sign, 7% confidence that it’s a speed limit sign, and 5% confidence that it’s an unrelated object. During training, the network is then told whether its prediction was correct.
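Here is a minimal NumPy sketch of that layered, weighted forward pass, ending in a probability vector. The layer sizes, random weights, and three-class setup (stop sign, speed limit sign, something else) are assumptions made only for illustration; an untrained network like this one will output essentially arbitrary probabilities.

```python
# A minimal sketch of a forward pass through a small, untrained network.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Turn raw scores into a probability vector that sums to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

# 64 input values (e.g., pixels from one image tile), one hidden layer,
# and 3 output classes: stop sign, speed limit sign, something else.
W1, b1 = rng.normal(size=(16, 64)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)

x = rng.random(64)                    # a fake input tile
hidden = np.maximum(0, W1 @ x + b1)   # each "neuron" weights its inputs (ReLU)
probs = softmax(W2 @ hidden + b2)     # the network's probability vector

# Untrained weights give near-arbitrary values; training is what shapes
# this output toward something like [0.86, 0.07, 0.07].
print(probs)
```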
Historically, neural networks were largely dismissed by the AI research community despite their early origins. They had shown limited “intelligence” and were computationally demanding, rendering them impractical for most applications. However, a dedicated group led by Geoffrey Hinton at the University of Toronto persevered. They successfully parallelized neural network algorithms for supercomputers, demonstrating the potential of the concept. However, it was the advent of GPUs that truly unlocked the promise of neural networks.
Returning to the stop sign example, initially, the neural network is likely to make numerous errors during its training phase. Effective training requires exposing the network to vast quantities of data – hundreds of thousands, even millions, of images. Through this extensive training, the weights assigned to neuron inputs are refined with increasing precision until the network can accurately identify stop signs in diverse conditions – fog, sunlight, rain, etc. At this point, the neural network has effectively “taught itself” to recognize a stop sign, or a human face (as used in Facebook’s facial recognition), or even a cat, as demonstrated by Andrew Ng at Google in 2012.
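Stripped of backpropagation and deep layers, the core training loop looks like the sketch below: compare predictions with the correct labels, measure the error, and nudge the weights to shrink it. The data here is synthetic and the model is a single layer, so it is only a stand-in for the far larger networks and datasets described above.

```python
# A toy weight-update loop (logistic regression on synthetic data),
# illustrating how weights are refined over many passes through the data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 8))                  # 200 fake "images", 8 features each
y = (X.sum(axis=1) > 4).astype(float)     # fake labels: 1 = "stop sign"

w, b, lr = np.zeros(8), 0.0, 0.1
for _ in range(500):
    pred = 1 / (1 + np.exp(-(X @ w + b)))  # sigmoid prediction per example
    error = pred - y                       # how wrong we are
    w -= lr * (X.T @ error) / len(y)       # nudge weights toward less error
    b -= lr * error.mean()

accuracy = ((pred > 0.5) == y).mean()
print(f"training accuracy after 500 updates: {accuracy:.2f}")
```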
Ng’s pivotal contribution was scaling neural networks dramatically – increasing the number of layers and neurons – and training them on massive datasets. In his groundbreaking work, Ng utilized 10 million images from YouTube videos. This “deepening” of neural networks gave rise to the term “deep learning,” emphasizing the multiple layers within these networks.
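As a purely illustrative aside (the layer sizes are arbitrary and not Ng’s architecture), “deepening” a network is literally a matter of stacking more layers between input and output, as in this PyTorch sketch:

```python
# Illustrative only: a shallow network versus a deeper one built by
# stacking more layers. Layer sizes here are arbitrary.
import torch.nn as nn

shallow = nn.Sequential(
    nn.Linear(784, 10),
)

deep = nn.Sequential(          # same input and output, more layers in between
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
```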
Today, deep learning-powered image recognition surpasses human accuracy in certain scenarios. This extends beyond simple object recognition like cats to complex tasks like identifying cancer indicators in blood samples and tumors in MRI scans. Google DeepMind’s AlphaGo learned the intricate game of Go by playing against itself countless times, continuously refining its neural network through self-play.
AI’s Promising Horizon, Fueled by Deep Learning
Deep learning has been instrumental in realizing the practical potential of machine learning and, consequently, the broader field of AI. Deep learning’s ability to decompose complex tasks opens doors to a wide array of machine-assisted applications, many of which are becoming realities. Driverless cars, enhanced preventative healthcare, and more personalized recommendation systems are either already here or rapidly approaching. AI is no longer a futuristic fantasy; it’s the present and the future. With deep learning as its driving force, AI may even approach the science fiction visions we have long entertained. A helpful C-3PO-like assistant? Yes, please. A Terminator? Perhaps we can do without that one.