How to process sparse "subnetworks" in parallel

Ben_Arnao · October 31, 2023, 4:18am

Hi all. I am working on a genetic optimization algorithm where training is bottlenecked by the need to evaluate multiple sets of model weights. The model architecture and the batch of data are the same for every set, and both are small enough that memory shouldn’t really be a concern. What I want to do is evaluate X sets of weights in parallel. I believe this should be possible in theory, since what I am trying to do can more or less be thought of as a sparse network in which the “subnetworks” share only the input layer. Since sparsely connected layers already exist (as in CNNs), I just need to be able to define which nodes are connected, so that I can process a batch of weights like I would a normal model.

Does anyone have any ideas? TIA

Renu_Patel · November 21, 2023, 10:45am

Hi @Ben_Arnao

It is difficult to understand the issue from the information given. Could you please share reproducible code, along with details such as the model architecture and the dataset shape and type?

You can use the Functional model’s plotting method to check how the nodes or layers are connected inside the model.

Ben_Arnao · November 21, 2023, 9:47pm

It is very crude code that I threw together to try to do it myself (and it is not working), but it should illustrate the concept: "import tensorflow as tf from tensorflow.python.keras.layers import Layer from" - Pastebin.com

Basically the goal here is that the multi/shared-weight model should produce 3 outputs in the same predict() that match the individual outputs from three separate instances of predict() using different weights.
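A minimal sketch of the equivalence described above, written in NumPy rather than Keras so it stays self-contained. It is not the code from the Pastebin link. The names `dense_forward` and `multi_forward`, the ReLU activation, and the shapes are all assumptions made for illustration. The idea is to stack the weight sets on a leading axis and evaluate them in a single `einsum` call, then check the result against three separate evaluations.

```python
import numpy as np

def dense_forward(x, kernel, bias):
    """One dense layer with ReLU. x: (batch, n_in), kernel: (n_in, units)."""
    return np.maximum(x @ kernel + bias, 0.0)

def multi_forward(x, kernels, biases):
    """The same layer evaluated for every weight set at once.

    kernels: (num_sets, n_in, units), biases: (num_sets, units).
    Returns an array of shape (num_sets, batch, units).
    """
    # Contract the shared input against each set's kernel in one call.
    out = np.einsum("bi,siu->sbu", x, kernels) + biases[:, None, :]
    return np.maximum(out, 0.0)

# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
num_sets, batch, n_in, units = 3, 4, 5, 2
x = rng.normal(size=(batch, n_in))
kernels = rng.normal(size=(num_sets, n_in, units))
biases = rng.normal(size=(num_sets, units))

# Reference: three separate evaluations, one per weight set.
separate = np.stack(
    [dense_forward(x, kernels[s], biases[s]) for s in range(num_sets)]
)

# Single evaluation covering all weight sets.
combined = multi_forward(x, kernels, biases)
outputs_match = np.allclose(separate, combined)
```

The same comparison (separate outputs versus the combined output, up to floating-point tolerance) is a reasonable acceptance test for any Keras layer that implements the idea.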

Ben_Arnao · December 9, 2023, 4:31am

For anyone with a similar use case, this is how I solved it: "class MultiWeight(Layer): def __init__(self, units, num_sets): s" - Pastebin.com

You just have to concatenate each set of weights, layer by layer, on axis=1; the biases you just add up.
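For readers who cannot open the link, here is a rough NumPy sketch of the concatenation idea. It is not the linked `MultiWeight` layer, and several details are assumptions: the two-layer network, the ReLU activation, the hypothetical helper `block_diagonal`, and in particular the bias handling. In this sketch the biases are concatenated alongside the kernels so that each set keeps its own bias, which may differ from the linked code. The first layer sees the shared input, so its kernels can simply be concatenated on axis=1. Later layers need a block-diagonal kernel so that each set only reads its own hidden units.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def block_diagonal(kernels):
    """Place each (rows, cols) kernel on the diagonal of one large zero matrix."""
    rows = sum(k.shape[0] for k in kernels)
    cols = sum(k.shape[1] for k in kernels)
    out = np.zeros((rows, cols))
    r = c = 0
    for k in kernels:
        out[r:r + k.shape[0], c:c + k.shape[1]] = k
        r += k.shape[0]
        c += k.shape[1]
    return out

# Hypothetical sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
num_sets, batch, n_in, hidden, n_out = 3, 4, 5, 6, 2
x = rng.normal(size=(batch, n_in))
w1 = [rng.normal(size=(n_in, hidden)) for _ in range(num_sets)]
b1 = [rng.normal(size=hidden) for _ in range(num_sets)]
w2 = [rng.normal(size=(hidden, n_out)) for _ in range(num_sets)]
b2 = [rng.normal(size=n_out) for _ in range(num_sets)]

# Reference: each weight set evaluated as its own small network.
separate = [relu(x @ w1[s] + b1[s]) @ w2[s] + b2[s] for s in range(num_sets)]

# Combined network: one wide first layer, one block-diagonal second layer.
w1_cat = np.concatenate(w1, axis=1)   # (n_in, num_sets * hidden)
b1_cat = np.concatenate(b1)           # (num_sets * hidden,)
w2_blk = block_diagonal(w2)           # (num_sets * hidden, num_sets * n_out)
b2_cat = np.concatenate(b2)           # (num_sets * n_out,)
y = relu(x @ w1_cat + b1_cat) @ w2_blk + b2_cat

# Split the wide output back into one (batch, n_out) array per weight set.
combined = np.split(y, num_sets, axis=1)
sets_match = all(np.allclose(a, b) for a, b in zip(separate, combined))
```

The block-diagonal kernel is mostly zeros, so this trades wasted multiplications for a single large matrix product. For small models like the one described in this thread that is usually a reasonable trade.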