Neural Networks 101: An explainer

Date

May 19, 2025

Contributor

Mario Grunitz

Humans are able to train computers to think as we do partly through the use of neural networks. These complex systems allow machines to gather, identify and categorise data to make increasingly accurate predictions with each iteration.

“Thinking machines” have become an everyday reality responsible for shaping almost every aspect of our digital society. From the transformer architectures powering ChatGPT to the multimodal models enabling vision transformers, neural networks continue to revolutionise how we interact with technology.

But what are the benefits, drawbacks, and real-world applications for machines that can learn as humans do?

What are neural networks?

Neural networks – also known as artificial neural networks (ANNs) – are the powerhouse of deep learning algorithms, a subset of machine learning. As the name implies, neural networks are a series of algorithms that are modelled after the way biological neurons signal to each other in the human brain. They are designed to mimic our brain’s ability to recognise patterns and categorise information – only much faster.

To understand the information they receive and process, neural networks assign items to categories – much like the human brain. When we learn something new, our brain tries to compare it to similar things to make sense of it. Neural networks function under the same principle.

For example, in the same way we know a salmon is a salmon partly because it is not a tuna (but is still a fish), neural networks learn to separate categories by picking up statistical patterns in the examples they are shown. Neural networks learn by doing, typically becoming more accurate with each training iteration.

A brief history of neural networks

While it might sound futuristic, the idea of modelling machines on the human brain has been around for a surprising amount of time.

In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts developed a simple electrical circuit that compared neurons with binary thresholds to Boolean logic (0/1 or true/false statements). The concept of “threshold logic” was born.

Then, Donald Hebb’s The Organization of Behavior (1949) proposed that neural pathways get stronger each time they are used, especially between neurons that fire at the same time. This is known as Hebbian learning, and is often summarised by the phrase “Cells that fire together, wire together”.

In 1958, building upon the work previously done by McCulloch, Pitts, and Hebb, Frank Rosenblatt added weights to the equation and developed the perceptron – a probabilistic model explaining a brain’s complex decision processes within a linear threshold.

In the 60s and early 70s, a lot of research went into neural nets and the ideas behind backpropagation, a method that works out how much each weight contributed to a network’s error by passing that error backwards from the output layer through the earlier layers. In 1974, Paul Werbos applied this concept to neural networks. Backpropagation, together with gradient descent (which uses those error signals to nudge each weight in the direction that reduces the error), forms the backbone of how neural networks are trained.
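To make the pairing of backpropagation and gradient descent more concrete, here is a minimal sketch in Python. It trains a single sigmoid neuron, so "backpropagation" reduces to one application of the chain rule. The toy dataset (the logical OR function), the learning rate and the number of steps are all illustrative assumptions, not values from any historical system.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy dataset (an assumption for illustration): the logical OR function.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

def mean_squared_error(w, b):
    """Average squared difference between the neuron's output and the target."""
    return sum((sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y) ** 2 for x, y in data) / len(data)

w, b = [0.0, 0.0], 0.0
learning_rate = 0.5          # hypothetical value
initial_loss = mean_squared_error(w, b)

for _ in range(2000):
    grad_w, grad_b = [0.0, 0.0], 0.0
    for x, y in data:
        out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        # Chain rule: d(loss)/d(weighted sum) = d(loss)/d(out) * d(out)/d(weighted sum)
        delta = 2 * (out - y) * out * (1 - out)
        grad_w[0] += delta * x[0]
        grad_w[1] += delta * x[1]
        grad_b += delta
    # Gradient descent: step each parameter against its averaged gradient.
    n = len(data)
    w = [w[0] - learning_rate * grad_w[0] / n, w[1] - learning_rate * grad_w[1] / n]
    b -= learning_rate * grad_b / n

final_loss = mean_squared_error(w, b)
print(initial_loss, final_loss)
```

In a network with several layers, the same chain-rule step is repeated layer by layer, moving backwards from the output.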

This led to an acceleration in AI research, and in 1975 Kunihiko Fukushima developed one of the world’s first multilayered neural networks. A more recent watershed moment came in 2017 with the publication of “Attention Is All You Need”, which introduced the transformer architecture that revolutionised natural language processing and became fundamental to modern AI systems like GPT models and Google’s Gemini.

How do neural networks function?

Neural networks are made up of layers of nodes (or artificial neurons): an input layer, one or more hidden layers, and an output layer. Each node has weights and a threshold and connects to nodes in the next layer. A node only becomes activated when its output exceeds its threshold, passing data on to the next network layer.

Let’s use a feedforward network as an example:

Each input to a node is assigned a weight that decides the importance of that variable – higher values contribute more to the output. Each input is multiplied by its weight, and the results are added together, usually along with a bias term. This sum is then passed through an activation function, which determines the node’s output. If the output is higher than a particular threshold, the node activates and passes data to the next network layer – i.e. one node’s output becomes the next node’s input.
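The computation inside a single node can be sketched in a few lines of Python. The input values, weights, bias, sigmoid activation and 0.5 threshold below are all assumptions picked for illustration; real networks learn their weights during training.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)

# Hypothetical values chosen only for illustration.
inputs = [1.0, 0.0, 1.0]
weights = [0.5, -0.6, 0.2]   # larger magnitude means more influence on the output
bias = -0.3                  # shifts the point at which the node activates

activation = node_output(inputs, weights, bias)
fires = activation > 0.5     # 0.5 plays the role of the threshold here
print(round(activation, 3), fires)
```

Here the weighted sum is 0.5 + 0.2 - 0.3 = 0.4, which the sigmoid maps to a value just under 0.6, so the node passes its output on.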

Because the data passes through multiple layers, the network builds up a hierarchy of related concepts: early layers pick out simple patterns, and later layers combine them into more abstract ones. A single input is therefore represented in many different ways on its way through the network, with each layer refining what the previous one produced.

Neural networks rely on the amount and quality of the data they receive. The data needs to be good, ‘clean’ data in order to make effective predictions and yield better results. The difference between traditional machine learning and deep learning neural networks is that ANNs do not require the features they look for to be defined and programmed by a human. The systems learn them from exposure to big data – they train themselves.

Types of neural networks

There are different kinds of neural networks that have pros and cons depending on what you use them for. Although there are a lot more, here are the most common types of neural networks:

Convolutional Neural Networks (CNN)

CNNs are typically built from five kinds of layers: input, convolution, pooling, fully connected, and output. Each kind performs a specific function: convolution layers detect local patterns, pooling layers summarise them, and fully connected layers combine them into a prediction. CNNs are commonly used for object detection, image categorisation, and natural language processing. However, vision transformers (ViTs) now often exceed CNN performance on image segmentation and object detection tasks.
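As a rough illustration of the convolution and pooling steps, the sketch below runs a hand-picked 2x2 kernel over a tiny made-up image and then max-pools the result. The image, the kernel values and the function names are assumptions for this example; a real CNN learns its kernels from data and normally relies on an optimised library.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over the image (no padding, stride 1) and sum the products.

    Strictly speaking this is cross-correlation, which is what most deep
    learning libraries compute under the name "convolution".
    """
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, kernel)   # 4x4 map, non-zero only at the edge
pooled = max_pool(feature_map)            # 2x2 summary of the feature map
print(feature_map)
print(pooled)
```

The feature map lights up only in the column where the image changes from dark to bright, and pooling keeps that signal while shrinking the map.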

Recurrent Neural Networks (RNN)

RNNs work on sequential data – like time-stamped information from an audio sensor – and use earlier items in the sequence to interpret the current input and produce its output. Unlike standard feedforward networks, RNN inputs and outputs are not independent of each other: each step relies on the results of previous computations. RNNs are used to solve temporal problems in speech recognition and language translation, though transformer models have largely superseded them in many applications due to their parallel processing capabilities.
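The dependence on previous computations can be shown with a one-unit recurrent cell. This is a simplified sketch: the weights (`w_x`, `w_h`), the bias and the input sequences are hypothetical, and practical RNNs use vectors and learned weight matrices rather than single numbers.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new state mixes the current input with the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.8, w_h=0.5, b=0.0):
    """Feed a sequence through the cell one item at a time, carrying the state forward."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# Two hypothetical sequences that differ only in their first item.
states_a = run_rnn([1.0, 0.0, 0.0])
states_b = run_rnn([0.0, 0.0, 0.0])
print(states_a)
print(states_b)
```

Although both sequences end with the same two inputs, the final states differ, because the first input in `states_a` is still echoing through the hidden state.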

Feedforward Neural Networks (FNN)

In FNNs, each perceptron in a layer is connected to the perceptrons of the next layer. Data is fed from one layer to the next moving forward only, meaning there are no feedback loops. In this respect, FNNs are the opposite of RNNs.
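A forward-only pass through fully connected layers might look like the following sketch. The layer sizes, weights, biases and activation choices are invented for illustration.

```python
def relu(z):
    """Pass positive values through unchanged and clip negatives to zero."""
    return max(0.0, z)

def identity(z):
    """Leave the value unchanged (used here for the output layer)."""
    return z

def dense_layer(inputs, weights, biases, activation):
    """Every input feeds every node in the layer (fully connected)."""
    return [
        activation(sum(x * w for x, w in zip(inputs, node_weights)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed values through the layers in order; nothing ever flows backwards."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense_layer(values, weights, biases, activation)
    return values

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),   # hidden layer
    ([[1.0, 2.0]], [0.1], identity),                 # output layer
]

output = forward([2.0, 1.0], layers)
print(output)
```

The hidden layer produces 1.0 and 1.5, and the output node combines them as 1.0 + 3.0 + 0.1.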

Autoencoder Neural Networks

Autoencoders learn compressed representations called ‘encodings’ from a set of inputs, which they then try to reconstruct. Because the input itself serves as the training target, no labels are needed, making them unsupervised networks. They are designed to ignore irrelevant data and prioritise data that is most relevant.
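The idea can be sketched with a deliberately tiny linear autoencoder that squeezes two numbers into one and then tries to rebuild both. The data, starting weights, learning rate and step count are assumptions for this example, and real autoencoders are usually deeper and non-linear.

```python
# Toy data (hypothetical): 2-D points that all lie on the line x1 = 2 * x0,
# so a single number is enough to describe each point.
data = [(0.5, 1.0), (1.0, 2.0), (-0.5, -1.0), (0.25, 0.5)]

def encode(x, enc):
    """Compress a 2-D point into a single number."""
    return enc[0] * x[0] + enc[1] * x[1]

def decode(code, dec):
    """Rebuild a 2-D point from its one-number code."""
    return (dec[0] * code, dec[1] * code)

def reconstruction_loss(enc, dec):
    """Mean squared distance between each point and its reconstruction."""
    total = 0.0
    for x in data:
        r = decode(encode(x, enc), dec)
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(data)

enc, dec = [0.1, 0.1], [0.1, 0.1]   # small arbitrary starting weights
learning_rate = 0.05                # hypothetical value
initial_loss = reconstruction_loss(enc, dec)

for _ in range(500):
    grad_enc, grad_dec = [0.0, 0.0], [0.0, 0.0]
    for x in data:
        code = encode(x, enc)
        r = decode(code, dec)
        err = (2 * (r[0] - x[0]), 2 * (r[1] - x[1]))
        # Error reaching the code, passed back through the decoder weights.
        d_code = err[0] * dec[0] + err[1] * dec[1]
        for i in range(2):
            grad_dec[i] += err[i] * code
            grad_enc[i] += d_code * x[i]
    n = len(data)
    enc = [enc[i] - learning_rate * grad_enc[i] / n for i in range(2)]
    dec = [dec[i] - learning_rate * grad_dec[i] / n for i in range(2)]

final_loss = reconstruction_loss(enc, dec)
print(initial_loss, final_loss)
```

Note that the training loop never sees a label: the only target is the input itself.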

Graph Neural Networks (GNN)

Graph Neural Networks have quietly become the driving force behind breakthrough achievements across various sectors. Companies like Uber, Google, Alibaba, Pinterest, and Twitter have shifted to GNN-based approaches in their core products, motivated by substantial performance improvements over previous architectures. GNNs excel at processing data represented as graphs, making them invaluable for social networks, recommendation systems, and molecular analysis.
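One common ingredient of GNNs is message passing, where each node updates itself using information from its neighbours. The sketch below shows a stripped-down version with plain averaging. The graph, the feature values and the function name are hypothetical, and real GNN layers also apply learned weights and a non-linearity.

```python
def message_passing_round(features, neighbours):
    """Replace each node's feature with the mean of its own and its neighbours' features."""
    updated = {}
    for node, value in features.items():
        group = [value] + [features[n] for n in neighbours[node]]
        updated[node] = sum(group) / len(group)
    return updated

# Hypothetical graph: three nodes connected in a chain, a - b - c.
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": 1.0, "b": 0.0, "c": 0.0}

after_one = message_passing_round(features, neighbours)
after_two = message_passing_round(after_one, neighbours)
print(after_one)
print(after_two)
```

After one round, node `c` still knows nothing about `a`; after two rounds, information from `a` has reached it through `b`. Stacking rounds lets each node take in an ever larger neighbourhood.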

Applications and advantages of neural networks

Neural networks are highly beneficial because they can extract meaning from complicated data, detecting trends and identifying patterns that are often too complex for humans to decipher. They achieve this by learning from examples, and once trained they do it really, really fast.

Deep learning is therefore highly beneficial if you have a lot of data and want to draw multiple interpretations from it or find specific use cases for it. It is particularly useful for problems that are too complex for traditional machine learning techniques.

Data classification

Neural networks can classify data, a capability used in multiple applications across many industries, from banking to retail. Facial, image, voice and gesture recognition are the most common forms of classification and are used for various purposes: from autonomous vehicles and voice functionality to cyber security and identity verification. Multimodal models like OpenAI’s GPT-4V and Google’s Gemini can now process text, images, audio, and video simultaneously.
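In many classifiers, the final step turns the network's raw output scores into probabilities and picks the most likely class, often with a softmax function. The labels and scores below are made up for illustration; in practice the scores would come from the network's output layer.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]   # subtracting the max avoids overflow
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical class labels and output-layer scores.
labels = ["cat", "dog", "bird"]
scores = [2.0, 0.5, -1.0]

probabilities = softmax(scores)
predicted = labels[probabilities.index(max(probabilities))]
print(predicted, [round(p, 3) for p in probabilities])
```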

Data clustering

Neural networks are able to cluster data in large volumes to detect similarities. This is used in search when comparing documents, sounds, or images to similar items. It is also highly useful for detecting anomalies and unusual behaviour outside the norm, which banks use to prevent fraud, for example.
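Similarity search of this kind is often done by comparing numeric vectors (embeddings) that a network produces for each item. The sketch below assumes such vectors already exist. The item names, vector values and the 0.5 cut-off are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return a value close to 1.0 when two vectors point in nearly the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, as if produced by a trained network.
embeddings = {
    "invoice_a": [0.9, 0.1, 0.0],
    "invoice_b": [0.8, 0.2, 0.1],
    "holiday_photo": [0.0, 0.1, 0.95],
}
query = [0.85, 0.15, 0.05]   # embedding of a new, invoice-like document

scores = {name: cosine_similarity(query, vector) for name, vector in embeddings.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Items far from the query could be flagged as unusual (0.5 is an arbitrary cut-off).
outliers = [name for name, score in scores.items() if score < 0.5]
print(ranked, outliers)
```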

Predictive analytics

When fed enough data, neural networks can learn correlations between past events and what followed them, and use these to predict future outcomes. This is achieved by fitting a model to historical data so that it maps earlier observations to later ones. The more data is received, the more accurate the predictions tend to become. This is useful for anticipating problems and preventing them before they occur.
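The underlying idea of fitting a model to past observations and extrapolating can be shown with the simplest possible case, a straight-line least-squares fit. This is ordinary linear regression rather than a neural network, and the sensor readings are hypothetical; a neural network replaces the straight line with a far more flexible function.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical machine temperature readings, one per hour.
hours = [0, 1, 2, 3, 4]
temperature = [60.0, 62.0, 64.0, 66.0, 68.0]

slope, intercept = fit_line(hours, temperature)
forecast = slope * 6 + intercept   # predicted temperature at hour 6
print(slope, intercept, forecast)
```

If the forecast crosses a safety limit, maintenance can be scheduled before the machine actually overheats.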

From anticipating hardware breakdowns across industries such as transport and manufacturing (even data centres) to predicting possible patient health problems from data gathered from wearables and digital health profiles, predictive analytics helps us act rather than react. Transformer-based models are now also being applied to long-term time series forecasting.

Neural network challenges

But this power comes at a cost: computing power and resources. In order to get the full benefit of deep learning neural networks, you need incredibly powerful computing hardware, and a lot of money to get your hands on it.

Training also takes a lot of time: some neural networks need days or longer before they produce their (albeit impressive and highly complex) interpretations. AI’s environmental impact is a growing concern, with the energy consumption of training large models becoming a critical sustainability issue.

Additionally, generative AI systems are prone to errors, such as producing incoherent or untrue responses, which raises concerns about their reliability in critical applications. Issues like gender bias in text-to-image generators and the manipulation of chatbots for harmful purposes underscore the need for cautious and responsible AI development.

The future of neural networks

AI experts are constantly pushing the boundaries of deep learning neural networks to expand their capabilities and use cases. They are doing this by training increasingly larger neural networks with immense amounts of data, though the focus is shifting toward more efficient scaling approaches rather than pure brute-force expansion.

NeurIPS 2024 showcased transformer-efficient hybrids and logic-gate neural networks, highlighting the field’s move toward algorithmic efficiency, token selection, and hybrid architectures. The consensus has shifted from pure scaling to effective scaling: smarter data curation, better architectural priors, and targeted generalisation strategies.

Recently, neural networks have been combined with other AI tools and approaches to perform even more complex tasks. Neuro-symbolic AI combines deep learning capabilities with symbolic logic, ensuring outputs conform to physical laws or regulatory rules. This approach offers provable consistency for use cases like molecular design, physics simulations, and scientific research.

Deep Reinforcement Learning places neural networks in a reinforcement learning framework, which maps actions to rewards in order to achieve goals. Neural networks also show promise in breakthrough applications like protein design, where systems like RFdiffusion (a diffusion-based generative model rather than a reinforcement learning system) have achieved remarkable success in creating novel protein structures.

The future is indeed bright for deep learning neural networks. The more data our digital society produces, the more information this powerful technology can be trained on to become increasingly accurate and useful. As we embrace the convergence of multimodal capabilities, transformer architectures, and neuro-symbolic approaches, neural networks continue to push the boundaries of what artificial intelligence can achieve while working harmoniously with human creativity and innovation.

Mario Grunitz

Mario is a Strategy Lead and Co-founder of WeAreBrain, bringing over 20 years of rich and diverse experience in the technology sector. His passion for creating meaningful change through technology has positioned him as a thought leader and trusted advisor in the tech community, pushing the boundaries of digital innovation and shaping the future of AI.
