LLVM updates and bazel cache

Oh, I definitely agree that we need to go deeper still. I’m certain that removing the monolithic core:lib, core:framework and similar targets will help a lot, as will cc_shared_library and splitting the kernels into smaller libraries. We’ll basically need a year-long roadmap to untangle all of this and make development easier.

The 20-minute build is on a specialized machine: no RBE, but a lot of power. I tried to reproduce the same build on my personal laptop (see stats below; it was in the top percentile for performance some 4-5 years ago, afaik) and I gave up after almost 9 hours of compiling (almost twice as long as it would have taken to compile the Linux kernel). I think at that point, JVM memory overhead caused too much slowdown for the experiment to be meaningful.

...$ cat /proc/cpuinfo      # 8 CPUs
...
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name : Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
stepping : 3
microcode : 0xe2
cpu MHz : 800.052
cache size : 6144 KB
...
...$ free -h
              total        used        free      shared  buff/cache   available
Mem:          7.2Gi       954Mi       2.8Gi       181Mi       3.4Gi       5.7Gi
Swap:         7.3Gi       1.4Gi       5.9Gi

Regarding users being affected by this, I think users who develop on multiple branches will be affected. It’s unlikely that they create all these branches at the same commit, so whenever they switch from one branch to another (which is frequent, given our PR review time) they’ll have their cache invalidated.
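To make the invalidation point concrete, here is a toy sketch in Python. It is not how Bazel is actually implemented; it only assumes that a target's cache key is a hash of its own inputs plus the keys of everything it depends on. The target names and the dependency graph are made up for illustration.

```python
import hashlib


def action_key(target, deps, sources, memo):
    """Toy cache key: hash of the target's own content plus its deps' keys."""
    if target in memo:
        return memo[target]
    h = hashlib.sha256()
    h.update(sources[target].encode())
    for dep in sorted(deps.get(target, [])):
        h.update(action_key(dep, deps, sources, memo).encode())
    memo[target] = h.hexdigest()
    return memo[target]


def invalidated(deps, old_sources, new_sources):
    """Return the targets whose key differs between two source snapshots."""
    old_memo, new_memo = {}, {}
    return sorted(
        t for t in old_sources
        if action_key(t, deps, old_sources, old_memo)
        != action_key(t, deps, new_sources, new_memo)
    )


# Hypothetical graph: everything except "docs" sits on top of "llvm".
deps = {
    "core_lib": ["llvm"],
    "kernels": ["core_lib"],
    "python_api": ["kernels"],
    "docs": [],
}

# Two branches that differ only in the pinned LLVM commit (made-up labels).
branch_a = {
    "llvm": "llvm@commit1",
    "core_lib": "v1",
    "kernels": "v1",
    "python_api": "v1",
    "docs": "v1",
}
branch_b = dict(branch_a, llvm="llvm@commit2")

stale = invalidated(deps, branch_a, branch_b)
print(stale)
```

In this toy model, a change at the bottom of the graph makes every target above it stale even though none of their own sources changed, which is roughly what a branch switch across an LLVM bump feels like.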

Then, there are users who are told in PR review to rebase onto master and then run additional tests. API golden generation, for example, has been an issue: I recall at least 3-4 PRs where someone at Google had to regenerate the goldens after a manual import because the external contributor could not compile the generator in a reasonable time.

I think over 50% of the PRs are not running CI locally and instead only rely on the presubmits we run on Kokoro.

And then there are 2 other benefits to speeding up the compile and reducing the cache invalidation rate: we can add a remote cache and GitHub Actions-based presubmits for faster turnaround and less flakiness, and we can finally enable presubmits for older release branches. These are low-priority, low-frequency scenarios, though.

Is there a Colab notebook example of checkout and build for TF?

I tried building TF on my work laptop but could not get Bazel working due to certificate hassles.

No, and I suppose a notebook would time out, like the small test that used a public GitHub Action to build TF inside our Docker container: