TensorFlow Probability - Bayesian hierarchical model

I have a simple hierarchical model, currently developed in PyMC3 as follows:

import numpy as np
import pymc3 as pm

a_b = np.array([[3, 4, 6, 1, 5, 2, 9, 7],
                [1, 2, 8, 4, 5, 3, 9, 6],
                [2, 2, 3, 1, 5, 5, 9, 8],
                [2, 1, 8, 2, 9, 3, 8, 8],
                [2, 4, 9, 1, 4, 3, 5, 5],
                [1, 2, 9, 1, 3, 5, 5, 4]])

a_w = np.array([[7, 6, 4, 9, 5, 8, 1, 3],
                [9, 8, 2, 5, 4, 5, 1, 3],
                [8, 8, 5, 9, 5, 5, 1, 2],
                [8, 9, 2, 8, 1, 8, 2, 2],
                [8, 6, 1, 9, 6, 7, 4, 4],
                [9, 8, 1, 9, 7, 5, 5, 6]])

aw_tc = np.sum(a_w, axis=1)   # row totals of a_w
ab_tc = np.sum(a_b, axis=1)   # row totals of a_b
dm_no, c_no = a_w.shape       # number of rows and columns of the observed matrices

with pm.Model() as BayesianBWM:
    w_star = pm.Dirichlet("w_star", a=0.01 * np.ones(c_no))
    gamma_agg = pm.Gamma("beta_agg", alpha=0.01, beta=0.01)
    for i in range(dm_no):
        # One weight vector per row, tied to w_star through the Gamma concentration.
        globals()[f"w{i}"] = pm.Dirichlet(f"w{i}", a=gamma_agg * w_star)
        globals()[f"aw{i}"] = pm.Multinomial(f"a_w{i}", n=aw_tc[i], p=globals()[f"w{i}"], observed=a_w[i, :])
        globals()[f"ab{i}"] = pm.Multinomial(f"a_b{i}", n=ab_tc[i], p=1 / globals()[f"w{i}"], observed=a_b[i, :])

As you can see, there are two observed matrices, a_w and a_b. For each row of these matrices we learn a weight vector w, and finally, based on these w's, we learn w_star.

I am having difficulties implementing this with TensorFlow Probability's JointDistributionSequential. I want to develop it purely in TF Probability, without using other high-level libraries such as Edward2.

Any help is highly appreciated!

/cc @Christopher_Suter

The code below is the spiritual TFP equivalent to your PyMC model. I used JointDistributionCoroutine instead of Sequential, since it’s a bit easier to write for loops therein. There’s a bug in the turnkey sampling routine at present; I’ve raised the issue with the team and will circle back when we have a solution. Meanwhile, this is the rough idea!

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

a_b = np.array([[3, 4, 6, 1, 5, 2, 9, 7],
                [1, 2, 8, 4, 5, 3, 9, 6],
                [2, 2, 3, 1, 5, 5, 9, 8],
                [2, 1, 8, 2, 9, 3, 8, 8],
                [2, 4, 9, 1, 4, 3, 5, 5],
                [1, 2, 9, 1, 3, 5, 5, 4]]).astype(np.float32)

a_w = np.array([[7, 6, 4, 9, 5, 8, 1, 3],
                [9, 8, 2, 5, 4, 5, 1, 3],
                [8, 8, 5, 9, 5, 5, 1, 2],
                [8, 9, 2, 8, 1, 8, 2, 2],
                [8, 6, 1, 9, 6, 7, 4, 4],
                [9, 8, 1, 9, 7, 5, 5, 6]]).astype(np.float32)

aw_tc = np.sum(a_w, axis=1)   # row totals of a_w
ab_tc = np.sum(a_b, axis=1)   # row totals of a_b
dm_no, c_no = a_w.shape

@tfd.JointDistributionCoroutine
def model():
  w_star = yield tfd.Dirichlet(
      concentration=1. * np.ones(c_no, dtype=np.float32), name='w_star')
  beta_agg = yield tfd.Gamma(concentration=2., rate=2., name='beta_agg')
  for i in range(dm_no):
    # One weight vector per row, tied to w_star via the Gamma concentration.
    w = yield tfd.Dirichlet(concentration=beta_agg * w_star, name=f'w{i}')
    aw = yield tfd.Multinomial(total_count=aw_tc[i], probs=w, name=f'aw{i}')
    ab = yield tfd.Multinomial(total_count=ab_tc[i], probs=1 / w, name=f'ab{i}')

# Pin the observed Multinomials to their data; w_star, beta_agg and the
# per-row w's remain as the latent variables to sample.
pin_values = dict([
    (f'aw{i}', a_w[i, :]) for i in range(dm_no)
] + [
    (f'ab{i}', a_b[i, :]) for i in range(dm_no)
])
pinned = model.experimental_pin(**pin_values)

results = tfp.experimental.mcmc.windowed_adaptive_hmc(
    n_draws=100,
    joint_dist=pinned,
    num_leapfrog_steps=10,
    n_chains=100,
    num_adaptation_steps=300)
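
While the fix is pending, one way to sanity-check the pinned model before sampling is to draw the latent variables forward and score them together with the pinned observations. This is just a rough sketch, assuming the pinned distribution exposes sample_unpinned and unnormalized_log_prob (which is how I remember the JointDistributionPinned API):

# Draw the unpinned latents (w_star, beta_agg, w0, ...) and evaluate the
# unnormalized log-probability of the pinned model; a finite scalar suggests
# the shapes and pins are wired up consistently.
latents = pinned.sample_unpinned()
print(pinned.unnormalized_log_prob(latents))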

The fix is in, and should be in tfp-nightly tomorrow: tensorflow/probability@956d09f ("Fix bug in windowed_mcmc related to constrained distributions having …").

After that, the code I posted above should work. Please let us know if you have any more issues!


Thanks a lot for taking the time, Christopher!

I first updated tfp-nightly, executed the code, and got the following error. I am not sure whether I need to wait longer for the fix to show up in tfp-nightly.

Here is the error (only the error and not the traceback):
ValueError: Input event_shape does not match event_shape_in ([7] vs [8]).

Hi @Majid_Mohammadi, the error you see is what the commit I mentioned should fix; I think you just tried it a little too soon. The new tfp-nightly package goes out around 08:00 UTC. I think you will also need a couple of amendments to the code I wrote:

    w = yield tfd.Dirichlet(concentration=beta_agg[..., tf.newaxis] * w_star, name=f'w{i}')

will make sure the shapes align correctly: during sampling, beta_agg picks up a batch dimension of size n_chains, and the trailing axis lets it broadcast against w_star's event dimension instead of colliding with it. You'll also want to wrap the HMC call in tf.function:

results = tf.function(lambda: tfp.experimental.mcmc.windowed_adaptive_hmc(
    n_draws=100,
    joint_dist=pinned,
    num_leapfrog_steps=10,
    n_chains=100,
    num_adaptation_steps=300))()

This will ensure the sampler is fully compiled and should reduce run time.
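
Once that runs, you can pull posterior summaries out of results. Here is a minimal sketch, assuming the default trace_fn (so the call returns a (draws, trace) pair) and that the draws come back as a structure whose fields follow the names used in the model (w_star, beta_agg, w0, …), each with shape [n_draws, n_chains, ...]:

draws, trace = results
# Posterior mean of the aggregated weight vector, averaging over draws and chains.
w_star_mean = tf.reduce_mean(draws.w_star, axis=[0, 1])
print(w_star_mean)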
