Load model's weights from a variable instead of file

Hello everyone!

I am trying to modify the TensorFlow source code in order to add decryption functionality that will decrypt the encrypted saved_model.pb and its associated files (such as variables.data-00000-of-00001 and variables.index) and load them into memory without saving them back to disk. This process depends on the return value of the decryption function, which determines whether the content is loaded from a string or a stream.

I have successfully implemented this for the saved_model.pb file, mainly by modifying this file. However, I am struggling to do the same for the variables files, because I am not sure how the code actually reads them.

It seems that the processing of variables occurs in this section of the code (link):

  const string variables_directory =
      io::JoinPath(export_dir, kSavedModelVariablesDirectory);
  // Check for saver checkpoints in v2 format. Models exported in the checkpoint
  // v2 format will have a variables.index file. The corresponding
  // variables are stored in the variables.data-?????-of-????? files.
  const string variables_index_path = io::JoinPath(
      variables_directory, MetaFilename(kSavedModelVariablesFilename));
  TF_ASSIGN_OR_RETURN(
      bool variables_index_exists,
      internal::FileExists(Env::Default(), variables_index_path));
  if (!variables_index_exists) {
    LOG(INFO) << "The specified SavedModel has no variables; no checkpoints "
                 "were restored. File does not exist: "
              << variables_index_path;
    return OkStatus();
  }
  const string variables_path =
      io::JoinPath(variables_directory, kSavedModelVariablesFilename);

  // Add variables to the graph.
  Tensor variables_path_tensor(DT_STRING, TensorShape({}));
  variables_path_tensor.scalar<tstring>()() = variables_path;

  std::vector<std::pair<string, Tensor>> inputs = {
      {string(variable_filename_const_op_name), variables_path_tensor}};

  AddAssetsTensorsToInputs(export_dir, asset_file_defs, &inputs);

  RunMetadata run_metadata;
  return RunOnce(run_options, inputs, {}, {string(restore_op_name)},
                 nullptr /* outputs */, &run_metadata, session);

When I inspect the value of variables_path_tensor, it only shows ./variables/variables as the path, which is the prefix and not the full path (e.g., ./variables/variables.data-00000-of-00001).
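As far as I can tell, the prefix is what is expected here: the snippets in this thread call MetaFilename() and DataFilename() on the prefix to build the actual file names. The following is only a standalone sketch of that naming scheme, inferred from the file names in this post. The Sketch* function names are made up for illustration and are not TensorFlow APIs.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Illustrative stand-in for MetaFilename(): the index file is the
// checkpoint prefix plus ".index".
std::string SketchMetaFilename(const std::string& prefix) {
  return prefix + ".index";
}

// Illustrative stand-in for DataFilename(): each shard is the prefix plus
// ".data-<shard>-of-<total>", with both numbers zero-padded to five digits.
std::string SketchDataFilename(const std::string& prefix, int shard_id,
                               int num_shards) {
  assert(shard_id >= 0 && shard_id < num_shards);
  char suffix[64];
  std::snprintf(suffix, sizeof(suffix), ".data-%05d-of-%05d", shard_id,
                num_shards);
  return prefix + suffix;
}
```

So a prefix of ./variables/variables with a single shard would map to ./variables/variables.index and ./variables/variables.data-00000-of-00001, which matches the files on disk.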

Now I am unsure whether what I am trying to achieve is even possible. Since I am also new to C++, I would greatly appreciate it if anyone could suggest a way to hold the decrypted content of the variables file in a variable and pass it to the rest of the code.

I think these are the locations where the variables are processed: 1234

Thanks a lot in advance.
Mo

OK, it looks like the reading happens in this part of tensor_bundle.cc:

  // Open the data file if it has not been opened.
  io::InputBuffer* buffered_file = data_[entry.shard_id()];
  if (buffered_file == nullptr) {
    std::unique_ptr<RandomAccessFile> file = nullptr;
    TF_RETURN_IF_ERROR(env_->NewRandomAccessFile(
        DataFilename(prefix_, entry.shard_id(), num_shards_), &file));
    buffered_file = new io::InputBuffer(file.release(), kBufferSize);
    // The InputBuffer and RandomAccessFile objects are both released in dtor.
    data_[entry.shard_id()] = buffered_file;
  }
  CHECK(buffered_file != nullptr);

I will see if I can read the content into a variable (e.g. a string), decrypt it, and then pass that variable to the rest of the code.
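Roughly, the idea is the following. This is a minimal standalone sketch in plain C++, not TensorFlow code: ReadWholeFile and DecryptInMemory are hypothetical names, and the XOR routine is only a placeholder for whatever real decryption function is used.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Reads a whole file into memory as raw bytes.
// Returns false if the file cannot be opened.
bool ReadWholeFile(const std::string& path, std::string* out) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  out->assign(std::istreambuf_iterator<char>(in),
              std::istreambuf_iterator<char>());
  return true;
}

// HYPOTHETICAL placeholder for the real decryption routine: XOR with a
// repeating key. This is not real encryption; it only marks where the
// decryption step would sit. The result never touches the disk.
std::string DecryptInMemory(const std::string& ciphertext,
                            const std::string& key) {
  assert(!key.empty());
  std::string plain(ciphertext.size(), '\0');
  for (size_t i = 0; i < ciphertext.size(); ++i) {
    plain[i] = static_cast<char>(ciphertext[i] ^ key[i % key.size()]);
  }
  return plain;
}
```

The decrypted std::string is then the "variable" that has to be handed to the code that currently expects a file.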

I think I solved it. I implemented a new class derived from RandomAccessFile that reads data from a string instead of a file.

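#include <algorithm>  // std::min
#include <cstring>    // memcpy
#include <string>
#include <utility>    // std::move

// Serves reads from an in-memory string instead of a file on disk.
class StringRandomAccessFile : public RandomAccessFile {
 public:
  explicit StringRandomAccessFile(std::string content)
      : content_(std::move(content)) {}

  Status Read(uint64 offset, size_t n, StringPiece* result,
              char* scratch) const override {
    *result = StringPiece();
    if (offset >= content_.size()) {
      return errors::OutOfRange("Offset is out of range");
    }

    // Both operands are size_t, so std::min deduces a single type.
    const size_t available = content_.size() - static_cast<size_t>(offset);
    const size_t bytes_to_read = std::min(n, available);
    memcpy(scratch, content_.data() + offset, bytes_to_read);
    *result = StringPiece(scratch, bytes_to_read);

    // RandomAccessFile::Read reports OutOfRange on a short read at EOF.
    if (bytes_to_read < n) {
      return errors::OutOfRange("Reached end of content");
    }
    return OkStatus();
  }

 private:
  const std::string content_;
};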

and at the part where the file is read, I changed it to this:

  // Open the data file if it has not been opened.
  io::InputBuffer* buffered_file = data_[entry.shard_id()];
  if (buffered_file == nullptr) {
    const std::string file_path =
        DataFilename(prefix_, entry.shard_id(), num_shards_);
    std::string file_content;
    // Use env_, as the original code does, rather than Env::Default().
    TF_RETURN_IF_ERROR(ReadFileToString(env_, file_path, &file_content));
    // The decryption step would go here, before the content is wrapped.
    std::unique_ptr<RandomAccessFile> file(
        new StringRandomAccessFile(std::move(file_content)));
    buffered_file = new io::InputBuffer(file.release(), kBufferSize);
    // The InputBuffer and RandomAccessFile objects are both released in dtor.
    data_[entry.shard_id()] = buffered_file;
  }

I will test it, and if everything works as expected I will confirm the solution.
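One way to sanity-check the offset and end-of-data handling without rebuilding TensorFlow is a small standalone stand-in for the reader. This is only a sketch: InMemoryFile and ReadStatus are hypothetical names that imitate the Read logic, not TensorFlow types. It assumes the behaviour the RandomAccessFile interface documents, where a read that hits the end of the data still returns the bytes it got but reports an out-of-range status.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Stand-in for a status code; TensorFlow's Status is deliberately not used.
enum class ReadStatus { kOk, kOutOfRange };

// Hypothetical in-memory reader that imitates the Read logic.
class InMemoryFile {
 public:
  explicit InMemoryFile(std::string content) : content_(std::move(content)) {}

  // Copies up to n bytes starting at offset into scratch and stores the
  // number of bytes copied in *bytes_read. Returns kOutOfRange when fewer
  // than n bytes were available.
  ReadStatus Read(uint64_t offset, size_t n, char* scratch,
                  size_t* bytes_read) const {
    assert(scratch != nullptr && bytes_read != nullptr);
    *bytes_read = 0;
    if (offset >= content_.size()) return ReadStatus::kOutOfRange;
    const size_t available = content_.size() - static_cast<size_t>(offset);
    *bytes_read = std::min(n, available);
    std::memcpy(scratch, content_.data() + offset, *bytes_read);
    return *bytes_read < n ? ReadStatus::kOutOfRange : ReadStatus::kOk;
  }

 private:
  std::string content_;
};
```

With the content "0123456789", reading 4 bytes at offset 2 should give "2345" with an OK status, while reading 4 bytes at offset 8 should give only "89" together with the out-of-range status.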