For my bachelor's thesis I explored the combination of evolutionary computing with neural networks. Building a neural network requires designing
the network's architecture. While there are guidelines that help with this process, finding a good architecture still involves a lot of trial and error.
Inspired by the human brain, I developed a method that creates and adapts neural networks through evolution. This allows a network to shrink and grow based
on its needs. Additionally, previous research shows that partially connected layers (as opposed to fully connected layers) might be beneficial and allow
for smaller networks. Together with individual skip connections, the partially connected layers allow for smaller but more complex neural networks.
The results show that the method produces predictions that are comparable to or better than those in previous research, while using smaller networks.
All code was written in C#, without the need for deep learning frameworks such as PyTorch and TensorFlow.
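To make the idea of partially connected layers and individual skip connections concrete, here is a minimal sketch. It is written in Python for brevity rather than the C# used in the project, and it is not the thesis implementation: the node ids, the layer assignment, the weights, and the tanh activation are all assumptions chosen for illustration.

```python
import math

# Hypothetical representation: every node has a layer index, and every
# connection is an individual (source, target, weight) triple.
# Nodes 0 and 1 are inputs (layer 0), node 2 is hidden (layer 1),
# node 3 is the output (layer 2).
node_layer = {0: 0, 1: 0, 2: 1, 3: 2}

connections = [
    (0, 2, 0.5),    # input 0 -> hidden 2
    (2, 3, 1.0),    # hidden 2 -> output 3
    (1, 3, -0.75),  # skip connection: input 1 jumps straight to output 3
]
# Input 1 is NOT connected to hidden node 2, so the hidden layer is only
# partially connected. A fully connected version would need more weights.


def forward(inputs):
    """Evaluate the network; `inputs` maps each input node id to its value."""
    values = dict(inputs)
    # Visit non-input nodes in layer order so every source value is ready.
    non_inputs = [n for n in node_layer if node_layer[n] > 0]
    for node in sorted(non_inputs, key=lambda n: node_layer[n]):
        total = sum(w * values[src] for src, dst, w in connections if dst == node)
        values[node] = math.tanh(total)  # tanh is an assumed activation
    return values


result = forward({0: 1.0, 1: 2.0})
print(result[3])
```

Because each connection is stored individually, adding or removing a single edge (including one that skips a layer) changes the network without touching the rest of the structure.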
The image below shows the evolution of a network. The nodes on the bottom are the input nodes, while the nodes on the top are the output nodes.
Abstract
Designing the topology of a neural network can be a difficult task, as there is no reliable
method of producing such a network. Studies have shown that the topology and connection weights of a neural network can be obtained through evolutionary algorithms.
However, research has focused primarily on networks that generate real-valued output.
In this thesis, we ask whether evolutionary algorithms can be used to produce
neural networks suited for multi-class classification problems. This is non-trivial, as
these kinds of networks generally feature more nodes and therefore are more complex
and harder to generate. We present an algorithm that generates and adapts such
neural networks using evolutionary algorithms. Results from testing the algorithm
on four different datasets show that the algorithm is capable of producing networks
for relatively simple datasets. The algorithm was, however, unable to produce useful
networks for the most complex dataset.
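As a rough illustration of how an evolutionary algorithm can grow and shrink a classifier, the sketch below evolves the connections of a tiny network on a toy three-class problem. It is a simplified Python sketch, not the algorithm from the thesis: the toy data, the mutation operators, the truncation selection, the size penalty, and all names (`predict`, `fitness`, `mutate`, `evolve`) are assumptions made for this example, and hidden nodes are left out to keep it short.

```python
import random

N_IN, N_OUT = 2, 3  # toy sizes, chosen for illustration only

# Toy 3-class data (assumed): each sample is ((x0, x1), class_label).
DATA = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0),
    ((1.0, 0.0), 1), ((0.9, 0.1), 1),
    ((0.0, 1.0), 2), ((0.2, 0.9), 2),
]


def predict(genome, x):
    """A genome maps (input index, output index) to a connection weight."""
    xs = list(x) + [1.0]  # index N_IN is a constant bias input
    scores = [0.0] * N_OUT
    for (i, o), w in genome.items():
        scores[o] += w * xs[i]
    return max(range(N_OUT), key=lambda o: scores[o])  # predicted class


def fitness(genome):
    correct = sum(predict(genome, x) == y for x, y in DATA)
    # Accuracy minus a small penalty per connection, to favor small networks.
    return correct / len(DATA) - 0.01 * len(genome)


def mutate(genome, rng):
    child = dict(genome)
    op = rng.choice(["perturb", "add", "remove"])
    if op == "perturb" and child:
        key = rng.choice(sorted(child))
        child[key] += rng.gauss(0.0, 0.5)           # nudge one weight
    elif op == "add":
        key = (rng.randrange(N_IN + 1), rng.randrange(N_OUT))
        child.setdefault(key, rng.gauss(0.0, 1.0))  # grow: new connection
    elif op == "remove" and child:
        del child[rng.choice(sorted(child))]        # shrink: drop a connection
    return child


def evolve(generations=200, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [{} for _ in range(pop_size)]  # start from empty networks
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection keeps the best
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(len(best), fitness(best))
```

Since the best individuals are carried over unchanged each generation, the best fitness in this sketch never decreases. The output layer already has one node per class, which hints at why multi-class networks have more candidate connections to search over than networks with a single real-valued output.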