Hi all. I am working on a genetic optimization algorithm where training is bottlenecked by the need to evaluate multiple sets of model weights. The model architecture and the batch of data are the same for every set. Both the model and the batch are also relatively small, so memory shouldn’t really be a concern here. What I want to do is evaluate X sets of weights in parallel. I believe this should be possible in theory, since what I am trying to do can more or less be thought of as a sparse network in which the “subnetworks” share only the input layer. Since sparsely connected layers already exist, as in a CNN, I just need to be able to define which nodes are connected, so that I can process a batch of weights like I would a normal model.
Does anyone have any ideas? TIA
It is difficult to understand the issue with the given information. Could you please share reproducible code so that we can replicate and understand the issue, including the model architecture and the dataset shape and type?
You can use the Functional model’s plotting method to check how the nodes or layers are connected inside the model.
It is very crude code that I threw together to try to do this myself (and it is not working), but it should illustrate the concept - import tensorflow as tf from tensorflow.python.keras.layers import Layer from - Pastebin.com
Basically the goal here is that the multi/shared-weight model should produce 3 outputs in a single predict() call that match the individual outputs of three separate predict() calls, each using a different set of weights.
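To make that goal concrete, here is a minimal NumPy sketch of the check: three separate forward passes on one shared batch, compared against a single batched pass. It is not the code from the Pastebin link. The layer sizes, the ReLU activation, and the helper names `dense_forward` and `batched_forward` are all assumptions made for illustration, and NumPy stands in for Keras so the arithmetic is easy to follow.

```python
import numpy as np

def dense_forward(x, weights):
    """Forward pass of a small MLP; weights is a list of (kernel, bias) pairs."""
    h = x
    for i, (kernel, bias) in enumerate(weights):
        h = h @ kernel + bias
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers (an assumption)
    return h

def batched_forward(x, stacked):
    """Same MLP for all weight sets at once.

    stacked is a list of (kernels, biases) per layer, with
    kernels of shape (S, in, out) and biases of shape (S, out).
    """
    num_sets = stacked[0][0].shape[0]
    # Every weight set sees the same shared input: (S, batch, in).
    h = np.broadcast_to(x, (num_sets,) + x.shape)
    for i, (kernels, biases) in enumerate(stacked):
        # One matmul per weight set, done in a single call.
        h = np.einsum("sbi,sio->sbo", h, kernels) + biases[:, None, :]
        if i < len(stacked) - 1:
            h = np.maximum(h, 0.0)
    return h

rng = np.random.default_rng(0)
num_sets, batch, sizes = 3, 5, [4, 8, 2]  # hypothetical sizes
weight_sets = [
    [(rng.normal(size=(m, n)), rng.normal(size=n))
     for m, n in zip(sizes[:-1], sizes[1:])]
    for _ in range(num_sets)
]
x = rng.normal(size=(batch, sizes[0]))

# Reference: three independent forward passes, stacked to (3, batch, out).
separate = np.stack([dense_forward(x, w) for w in weight_sets])

# Candidate: stack each layer's weights across sets and run once.
stacked = [
    (np.stack([w[layer][0] for w in weight_sets]),
     np.stack([w[layer][1] for w in weight_sets]))
    for layer in range(len(sizes) - 1)
]
combined = batched_forward(x, stacked)

outputs_match = np.allclose(separate, combined)
```

If `outputs_match` is true, the combined pass reproduces the three individual results, which is the property the shared-input model needs to have.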
For anyone with a similar use case, this is how I solved it - class MultiWeight(Layer): def __init__(self, units, num_sets): s - Pastebin.com
You just have to concatenate each set of weights, layer by layer, on axis=1; the biases you just add up.
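For readers who cannot open the Pastebin link, here is one possible reading of that description as a NumPy sketch. It is not the `MultiWeight` layer from the link. It assumes a plain MLP with ReLU hidden layers, it reads "add up" as appending the bias vectors end to end (an element-wise sum would not have the right shape for the widened layer), and it assumes that layers after the first need a block-diagonal kernel so that the weight sets do not mix. The helper names `merge_weight_sets` and `forward` are hypothetical.

```python
import numpy as np

def merge_weight_sets(weight_sets):
    """Merge S weight sets of one MLP into a single wide list of (kernel, bias)."""
    merged = []
    for layer in range(len(weight_sets[0])):
        kernels = [w[layer][0] for w in weight_sets]
        biases = [w[layer][1] for w in weight_sets]
        if layer == 0:
            # All sets read the same input, so kernels sit side by side on axis=1.
            kernel = np.concatenate(kernels, axis=1)
        else:
            # Later layers: block-diagonal kernel keeps each set separate (assumption).
            rows = sum(k.shape[0] for k in kernels)
            cols = sum(k.shape[1] for k in kernels)
            kernel = np.zeros((rows, cols))
            r = c = 0
            for k in kernels:
                kernel[r:r + k.shape[0], c:c + k.shape[1]] = k
                r += k.shape[0]
                c += k.shape[1]
        # Biases are appended end to end to match the widened output.
        merged.append((kernel, np.concatenate(biases)))
    return merged

def forward(x, weights):
    """Plain MLP forward pass with ReLU on hidden layers (an assumption)."""
    h = x
    for i, (kernel, bias) in enumerate(weights):
        h = h @ kernel + bias
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

rng = np.random.default_rng(1)
num_sets, batch, sizes = 3, 5, [4, 8, 2]  # hypothetical sizes
weight_sets = [
    [(rng.normal(size=(m, n)), rng.normal(size=n))
     for m, n in zip(sizes[:-1], sizes[1:])]
    for _ in range(num_sets)
]
x = rng.normal(size=(batch, sizes[0]))

# One pass through the merged model gives (batch, num_sets * out).
wide = forward(x, merge_weight_sets(weight_sets))
per_set = np.split(wide, num_sets, axis=1)

# Compare each slice with the corresponding individual model.
all_match = all(
    np.allclose(per_set[i], forward(x, weight_sets[i])) for i in range(num_sets)
)
```

The merged kernels are ordinary 2-D arrays, so in principle they could be loaded into regular dense layers. The trade-off is that the block-diagonal kernels grow with the square of the number of sets and are mostly zeros, so for many sets a per-set batched matmul may be the cheaper choice.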