TensorFlow Probability Multivariate HMM Input

Emre_Y · July 31, 2021, 2:25am

Is there a clear example of using multivariate data with TFP’s distributions.HiddenMarkovModel? Despite repeated attempts, I have yet to find any example, in the official documentation or in publicly available repositories, of the hidden Markov model in TensorFlow Probability being used with correlated data (in my case, multiple time series). I know of similar implementations in other libraries (Pyro, pymc3, etc.); however, I would prefer to stay in the TensorFlow environment.

Furthermore, going through the source code for the HMM, it seems that the event_shape of ‘observation_distribution’ is used, but more in relation to num_steps than for interpreting multivariate data. Is that right?

Any help would be greatly appreciated.

Bhack · August 4, 2021, 12:14pm

@markdaoust We don’t have a TFP tag or a subscribed TFP member. Can you reach someone internally?

davmre · August 4, 2021, 3:04pm

It should ‘just work’ to build an HMM using an observation distribution with multivariate (vector, matrix, etc) events. If the observation distribution has event shape [d] at each timestep, then the HMM as a whole will have event shape [num_steps, d].
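
To make the shape bookkeeping concrete, here is a minimal NumPy sketch of the forward algorithm with multivariate normal emissions. It is not TFP's implementation: the helper names (mvn_logpdf, logsumexp, hmm_log_prob) are made up for illustration, and it assumes a single lower-triangular scale shared by all states. The point it tries to show is that the emission density collapses each [d]-shaped observation to one log-probability per state, so the recursion over num_steps looks the same as in the univariate case.

```python
import numpy as np

def mvn_logpdf(y, locs, scale_tril):
    """Log-density of y (shape [d]) under N(locs[k], L @ L.T) for each state k."""
    d = y.shape[-1]
    diff = y - locs                              # [K, d]
    z = np.linalg.solve(scale_tril, diff.T).T    # whitened residuals, [K, d]
    log_det = np.sum(np.log(np.diag(scale_tril)))
    return -0.5 * np.sum(z**2, axis=-1) - log_det - 0.5 * d * np.log(2 * np.pi)

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def hmm_log_prob(ys, initial_logits, transition_logits, locs, scale_tril):
    """Forward algorithm for a single sequence ys of shape [num_steps, d]."""
    initial_logits = np.asarray(initial_logits, dtype=float)
    transition_logits = np.asarray(transition_logits, dtype=float)
    # Normalize logits into log-probabilities.
    log_init = initial_logits - logsumexp(initial_logits, axis=-1)
    log_trans = transition_logits - logsumexp(transition_logits, axis=-1)[:, None]
    # The [d]-shaped event is reduced to one scalar per state here.
    alpha = log_init + mvn_logpdf(ys[0], locs, scale_tril)    # [K]
    for y in ys[1:]:
        # alpha[j] = logsumexp_i(alpha[i] + log_trans[i, j]) + log p(y | state j)
        alpha = (logsumexp(alpha[:, None] + log_trans, axis=0)
                 + mvn_logpdf(y, locs, scale_tril))
    return logsumexp(alpha, axis=-1)

rng = np.random.default_rng(0)
ys = rng.normal(size=(10, 2))                    # one sequence: [num_steps, d]
locs = np.array([[0., 0.], [2., 2.], [-2., -2.]])
log_prob = hmm_log_prob(ys, np.zeros(3), np.eye(3), locs, np.eye(2))
print(ys.shape, float(log_prob))
```

Only mvn_logpdf knows about d; everything after it operates on vectors of length K (the number of states), which is why swapping in a vector- or matrix-valued emission distribution does not change the rest of the algorithm.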

I threw together a quick example of fitting an HMM with multivariate normal emissions here (code also copied below):

import tensorflow as tf
import tensorflow_probability as tfp
from matplotlib import pylab as plt
tfb = tfp.bijectors
tfd = tfp.distributions

# Generate 'ground truth' data from a known HMM as test input:
true_initial_logits = [1., 0., 0.5]
true_transition_logits = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]
true_emission_locs = tf.Variable([[0., 0.], [2., 2.], [-2., -2.]])
true_emission_scale_trils = tf.eye(2)
true_hmm = tfd.HiddenMarkovModel(
    initial_distribution=tfd.Categorical(true_initial_logits),
    transition_distribution=tfd.Categorical(true_transition_logits),
    observation_distribution=tfd.MultivariateNormalTriL(
        loc=true_emission_locs, scale_tril=true_emission_scale_trils),
    num_steps=10)
print(true_hmm.event_shape)  # [10, 2]
ys = true_hmm.sample(500)
print(ys.shape)  # [500, 10, 2]

# Define trainable variables for HMM parameters:
num_states = 3
initial_logits = tf.Variable(tf.zeros([num_states]))
transition_logits = tf.Variable(tf.zeros([num_states, num_states]))
emission_locs = tf.Variable(
    tf.random.stateless_normal([num_states, 2], seed=(42, 42)))
emission_scale_trils = tfp.util.TransformedVariable(
    tf.eye(2, batch_shape=[num_states]),
    tfb.FillScaleTriL())
hmm = tfd.HiddenMarkovModel(
    initial_distribution=tfd.Categorical(initial_logits),
    transition_distribution=tfd.Categorical(transition_logits),
    observation_distribution=tfd.MultivariateNormalTriL(
        loc=emission_locs, scale_tril=emission_scale_trils),
    num_steps=10)
print(hmm.event_shape)  # [10, 2]

# Maximize the log-prob of observed samples:
losses = tfp.math.minimize(
    lambda: -hmm.log_prob(ys),
    num_steps=200,
    optimizer=tf.optimizers.Adam(0.1))