
How Does Deep Learning Power Modern AI?



Businesses felt the impact and power of generative AI at scale last year. Now, every software company, supermarket brand, and business even tangentially related to tech appears to be building an AI-based solution of its own for 2024. But how much do we know about the foundations of these technologies and what they’re doing with our data behind the curtain?

How well is the AI black box we’re all learning to trust really understood outside of niche tech circles? What is there to know about something that is opaque by design? First, let’s start at the beginning.

AI, in a nutshell, is a broad group of systems and data brought together to learn a single niche task. In a large language model (LLM), that task is generating text by recognizing patterns observed in a vast array of previously ingested content. All current AI, from OpenAI’s ChatGPT and DALL·E 2 to Adobe’s Sensei tools, works on precisely this principle: learning to replicate a task by mimicking previous data inputs.

More general AI, technology capable of switching contexts to handle any given task, is still thought by most experts to be more than a decade away. However, the average user could be forgiven for thinking this more generalized, sci-fi-like technology is just around the corner based on marketing hype and social media commentary.

The more you know about today’s increasingly popular tools, however, the easier you can visualize how they’re going to evolve and change in the coming months and years. Far from science fiction magic, today’s generative AI is simply a well-executed evolution of machine learning (ML) technologies that have existed for decades before. So, what’s changed in the last five years?

Here, we’ll dive into the field of deep learning, one of the cornerstones of today’s ML systems and the key to the capabilities of the current AI revolution.

What Is Deep Learning?

Deep learning is a specific application of neural networks, which are themselves a branch of machine learning. A neural network-based ML algorithm trains an evolving piece of software by mimicking the way part of an organic brain learns, using gradual refinement to move an approximate solution ever closer to a stated goal.

Within this discipline, the applications of deep learning algorithms are extremely varied and complex. New systems are being unveiled almost every other week to recognize images and speech and visualize patterns of data that would otherwise be missed. In fields such as healthtech, these technologies are proving to be revolutionary and literally life-changing. 

Deep learning models have been trained to diagnose a range of diseases from medical imaging with spectacular success over the last 10 years. What they will be able to offer patients in the next decade through increased access, growing capability, and specialized technologies is mind-boggling.

“The power of deep learning is what allows seamless speech recognition, image recognition, and automation and personalization across every possible industry today, so it’s safe to say that you are already experiencing the benefits of deep learning,” Sajid Sadi, the VP of Research at Samsung, told Forbes.

To achieve these goals in a new project, engineers first have to build a neural network capable of learning within its domain. This is done by creating a large array of small, simple, and relatively isolated pieces of code that make minute transformations on pieces of input data. These pieces of code, called nodes, mimic neurons in the brain by making tiny transformations and passing their results on to the next node through a large interconnected network.

Like neurons in the brain, a single software node is trivially simple and virtually useless. Even in small numbers of just hundreds of nodes, the transformation made to input data would be almost untraceable. It’s when these nodes, and the crucial connections between them, work at scale that the results of the system become truly remarkable.
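To make that concrete, here is a minimal sketch of a single node in plain Python; the weights, bias, and sigmoid activation are illustrative choices, not taken from any particular system:

```python
import math

def node(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, plus a bias,
    squashed through a sigmoid activation into the range (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# A lone node can only draw one simple boundary through its inputs.
print(node([0.5, -1.2], [0.8, 0.4], 0.1))  # about 0.505
```

On its own, this is little more than a weighted average; the interesting behavior only emerges when thousands of these nodes are connected.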

How Does a Neural Network Work?

When individual nodes are wired together, their connections are weighted to give priority to some channels of communication and limit others. These connections act like synapses in an organic brain, determining how information is propagated and acted on as a whole.

Using artificial neurons and synapses, modeled as nodes and weighted connections, a neural network builds vast numbers of layers to mimic the scale and complexity of an organic brain — allowing it to perform similarly complex tasks. These nodes are split into three types with their own unique role in the system:

  • Input nodes: These nodes are responsible for simply taking in data from outside the system and passing it into the network.
  • Hidden nodes: These work in conjunction with each other in a series of layers to process and transform the data as it flows through the network.
  • Output nodes: At the outermost layer, output nodes provide the finalized data product back to the outside world.
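The three node types above can be sketched as a toy feedforward pass in plain Python; the layer sizes and weight values below are arbitrary illustrations:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: each node takes every input, applies its own row of
    weights and its bias, and emits one activation."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

def forward(x, hidden_w, hidden_b, output_w, output_b):
    """Input nodes pass x in, hidden nodes transform it, and output
    nodes report the final result to the outside world."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)

# Toy network: 2 inputs -> 2 hidden nodes -> 1 output node.
y = forward([1.0, 0.0],
            hidden_w=[[0.5, -0.4], [0.3, 0.8]], hidden_b=[0.0, 0.1],
            output_w=[[1.2, -0.7]], output_b=[0.05])
print(y)
```

A real deep network follows exactly this shape, just with many hidden layers and millions of weights instead of six.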

Much like in an organic brain, it’s the connections between nodes that prove critically important to the system’s success; the strength of a brain’s synapses plays the largest role in its overall capabilities. To make a neural network learn, developers systematically adjust these weights after input is processed, based on the measured distance between the algorithm’s output and the goal.

Using a cost function evaluated over a large set of training data, engineers can measure the network’s performance between iterations and tweak its learning toward the goal. What sets deep learning apart from other neural networks is the number of layers of hidden nodes it applies to the problem.
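As a rough illustration of that loop, the sketch below fits a single weight by repeatedly measuring a mean-squared-error cost and nudging the weight downhill; the one-weight “network”, learning rate, and data are all invented for the example:

```python
def cost(weight, data):
    """Mean squared error of a one-weight 'network' y = weight * x."""
    return sum((weight * x - target) ** 2 for x, target in data) / len(data)

def train(weight, data, rate=0.1, steps=50, eps=1e-6):
    """Gradient descent: estimate the slope of the cost numerically,
    then step the weight a little way downhill, over and over."""
    for _ in range(steps):
        grad = (cost(weight + eps, data) - cost(weight - eps, data)) / (2 * eps)
        weight -= rate * grad
    return weight

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hidden relation: y = 2x
w = train(0.0, data)
print(w)  # converges toward 2.0
```

Real frameworks compute gradients analytically via backpropagation rather than numerically, but the cycle of measure, adjust, repeat is the same.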

Variants of Deep Learning Algorithms

Convolutional Neural Network (CNN)

For applications involving computer vision, images, and video, a convolutional neural network is most commonly used. These are ideally suited to detecting subsets of features and patterns within a larger data set (a raw image, for example), allowing for object detection, facial recognition, and a host of related intelligent features.
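The core operation of a CNN, sliding a small filter across an image, can be sketched in a few lines of plain Python; the image and the edge-detecting kernel below are toy examples:

```python
def convolve2d(image, kernel):
    """Slide a small kernel over the image; each output value is the
    weighted sum of the patch underneath. This is how a CNN detects
    a local feature wherever it appears in the frame."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A vertical-edge detector fires where a bright region meets a dark one.
image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]
edge_kernel = [[1, -1],
               [1, -1]]
print(convolve2d(image, edge_kernel))  # [[0, 2, 0], [0, 2, 0]]
```

In a trained CNN the kernel values are learned rather than hand-picked, and hundreds of such filters are stacked in layers.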

Recurrent Neural Network (RNN)

In comparison, for fields such as natural language processing and speech recognition, recurrent neural networks are often used for their ability to work well with sequential data.
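A minimal sketch of that idea: one recurrent step feeds each input together with the previous hidden state through shared weights, so the order of the sequence changes the result. The weight values here are arbitrary illustrations:

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One recurrent step: the new hidden state blends the current
    input with the previous state, so earlier items in the sequence
    influence how later ones are processed."""
    return math.tanh(w_x * x + w_h * h + b)

def run_sequence(xs, w_x=0.5, w_h=0.8, b=0.0):
    h = 0.0
    for x in xs:  # the same weights are reused at every time step
        h = rnn_step(x, h, w_x, w_h, b)
    return h

# Same items, different order, different final state: order matters.
print(run_sequence([1.0, 0.0, 0.0]))
print(run_sequence([0.0, 0.0, 1.0]))
```

That sensitivity to order is exactly what makes RNNs (and their descendants) a fit for text and speech.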

One of the features common across all deep learning algorithms is that they are notably more complex than other ML techniques. It’s this inherent complexity that allows deep learning algorithms to build a thorough working model of the tasks they aim to accomplish. However, as we’ll see, this is a feature that has significant engineering trade-offs to counter its groundbreaking advantages.

Deep Learning in Practice

Deep learning makes possible something relatively rare in the field of machine learning: unsupervised learning, in which unstructured raw data is categorized and differentiated entirely autonomously. This provides a new way to work on vast data sets without requiring human-made labels.
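A classic illustration of the label-free idea, using k-means clustering rather than a deep network, can be written in plain Python; the data points and cluster count are invented for the example:

```python
import random

def kmeans(points, k, steps=20, seed=0):
    """Unsupervised clustering: group unlabeled 1-D points by repeatedly
    assigning each to its nearest centroid and re-centering. No human
    ever tells the algorithm which group a point belongs to."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(steps):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups emerge from unlabeled data.
data = [1.0, 1.2, 0.9, 10.0, 10.3, 9.8]
print(kmeans(data, k=2))
```

Deep unsupervised methods work on far richer structure than nearest-centroid distance, but the principle of letting the data organize itself is the same.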

With the processing power and storage available to cloud systems today, the algorithms used to work on our data are now capable of learning in an increasingly real sense of the word.

Already used to enable speech recognition in noisy environments, make accurate handwriting recognition possible, and enable facial recognition features used in mobiles and smart homes, these systems are something we already take for granted on a daily basis.

Streaming giants rely on this technology to recommend TV shows, games, and music based on the ways large numbers of users have navigated their services before. In real-world use, these predictions and recommendations often prove eerily accurate.

Deep learning is one of tech’s success stories, and unsupervised learning is one of its most recent ‘killer apps’. Used to build some of the biggest and most capable platforms around, it’s easy to imagine it as a ready-made solution that can be plugged into any problem you can accurately describe. Yet there are some downsides to this remarkably powerful tool.

Drawbacks of a Deep Learning Approach

The biggest pitfall of deep learning comes from the very nature of neural networks as AI black box solutions. Even when the system provides the right answer 99 times out of 100, it’s hard to trust any software fully when you can’t see its inner workings.

The high degree of opacity neural networks suffer from makes them difficult to recommend for applications that require extensive oversight or regulation. Medical and aeronautical applications are two key areas where this can raise some significant issues.

The biggest problem in this area is a phenomenon engineers call dataset shift, where training data is poorly fitted to real-world examples or suffers from a fundamental bias. Dataset shift often shows up as unexpected behavior in algorithms deployed to real-world use cases.

Troublingly, this issue may not become apparent in early use or testing. Machine learning systems tend to fail silently. Problems arising from dataset shift can lie dormant for months or years in use and may go entirely undiagnosed as the system makes decisions with poorly understood reasoning.
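One common mitigation, sketched here in plain Python rather than drawn from any particular monitoring tool, is to compare simple statistics of live inputs against the training distribution and raise an alarm when they drift too far apart; the threshold and data are illustrative:

```python
import math

def summary(values):
    """Mean and standard deviation of a list of feature values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def shift_alarm(train_values, live_values, threshold=2.0):
    """Crude drift check: flag the feature if the live mean has moved
    more than `threshold` training standard deviations away."""
    train_mean, train_std = summary(train_values)
    live_mean, _ = summary(live_values)
    return abs(live_mean - train_mean) > threshold * train_std

train = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
print(shift_alarm(train, [1.0, 0.98, 1.02]))  # similar data: False
print(shift_alarm(train, [3.0, 3.1, 2.9]))    # shifted data: True
```

Production systems use far more sensitive tests than a mean comparison, but the idea of continuously checking live data against training assumptions is the same.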

Another issue that plagues deep learning comes from underspecification — a problem recently uncovered at Google. Like dataset shift, this issue causes hiccups in real-world performance that can remain unnoticed. Unlike dataset shift, the main driver behind this issue isn’t a lack of well-fitting training data but a poorly specified cost function that doesn’t capture enough detail about the intended goal.

Changes in the algorithm’s output that are large enough to produce meaningful differences, yet small enough that the cost analysis still awards a passing grade, can carry through into the resulting system and produce unintended consequences further down the line.

Solving Deep Learning’s Real-World Challenges

The issue common to both of these problems is the unknown and unknowable nature of current deep learning algorithms. When used in practice, it’s almost impossible to differentiate a good system from a bad one because we don’t have enough information about what’s going on underneath. But engineers are developing ways to overcome that hurdle.

The field of explainable AI is growing at a rapid pace to develop algorithms that can be better understood by humans. This would circumvent many of the biggest problems inherent to the field, but we still have a long way to go before fully explainable deep-learning algorithms are solving major medical problems. Today, the advice to engineers building deep learning systems is to test, test, test, and then test some more.

“The biggest, immediate takeaway is that we need to be doing a lot more testing,” Alex D’Amour, who led an industry-leading study into underspecification, told MIT Technology Review. “We are asking more of machine-learning models than we are able to guarantee with our current approach.”

Extensive testing, continuous integration, and thorough verification are a must when deploying deep learning systems. Related fields such as synthetic data are helping engineers overcome these challenges by making greater volumes of higher-quality data available through AI-enabled data synthesis.
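One simple form such testing can take, sketched here with an invented stand-in model rather than any real system, is an invariance check: tiny, meaning-preserving perturbations of an input shouldn’t flip the prediction:

```python
import random

def perturbation_test(model, inputs, noise=0.01, trials=20, seed=0):
    """Invariance check: small, meaningless jitter in the input should
    not flip the model's prediction. Returns the fraction of inputs
    whose prediction stayed stable across every trial."""
    rng = random.Random(seed)
    stable = 0
    for x in inputs:
        base = model(x)
        flips = sum(1 for _ in range(trials)
                    if model(x + rng.uniform(-noise, noise)) != base)
        stable += flips == 0
    return stable / len(inputs)

# Stand-in 'model': a threshold classifier. A deep net would slot in here.
model = lambda x: int(x > 0.5)
score = perturbation_test(model, [0.1, 0.9, 0.501])
print(score)  # inputs near the decision boundary are likely to flip
```

Checks like this don’t explain what the black box is doing, but they do surface brittle behavior before users find it.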

The Future of Deep Learning

Deep learning technologies have proven to be extremely powerful tools that have made an impact across a wide range of fields. From self-driving vehicles on the road today to modern advances in healthcare diagnostics and accessibility applications, it’s not a technology that can be easily dismissed.

Despite the potential pitfalls and drawbacks listed here, this is a field worthy of significant investment and investigation. If 2023 was the year of GPT, then 2024 may well mark the start of a decade of powerful, explainable, and highly robust AI systems powering tomorrow’s software solutions.


