TensorFlow Linux wheels are being upgraded to manylinux2014

Hello everyone,

The TensorFlow OSS DevInfra Team is planning to upgrade the Linux wheels of TensorFlow to be manylinux2014 compatible. This has been in the works for a while, and we are happy to announce that from Monday, March 21, we will start publishing manylinux2014-compatible TensorFlow nightly packages ahead of the next TensorFlow release, TensorFlow 2.9, whose branch cut is scheduled for March 29th, 2022 (tentative). As part of this upgrade, we are also switching to the new libstdc++ ABI, which was discussed as part of the manylinux upgrade in the March 2022 edition of the TF SIG Build meeting.

Also on the 21st, the TF SIG Build Dockerfiles will change to use the new toolchain by default. As long as everything goes smoothly, the first manylinux2014 tf-nightly packages should arrive on Tuesday, March 22. If you’d like to help us test manylinux2014 packages before then, please use the links below.

If you would like to test the manylinux2014 build environment (pending PR), please see the instructions here.

If you would like to start testing the manylinux2014 packages in advance for comparison, here is a set of manylinux2010 and manylinux2014 packages we built at the same commit (9d98dc772).

FAQs:

Q1. I’m a downstream developer. How should I change my build process to be compatible with new TF wheels?

The ABI change is not compatible with the old (manylinux2010) wheels. To be compatible with the new TF wheels, please follow the instructions below:

  1. If you use TensorFlow’s toolchain/crosstool, upgrade to the new manylinux2014 crosstool. See the .bazelrc here.
  2. If your build contains the flag --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0, change the 0 to 1. The new toolchain effectively sets this to 1 by default if it is not explicitly set.

Aside from the ABI flag change, the toolchain upgrade by itself (from devtoolset-7 to devtoolset-9) is not likely to cause breakages if your package does not already build with TensorFlow’s toolchains. If you do use TensorFlow’s toolchains, you should upgrade to the new manylinux2014 crosstool.
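To double-check which ABI setting your installed TensorFlow was compiled with, you can inspect tf.sysconfig.get_compile_flags(), which includes the -D_GLIBCXX_USE_CXX11_ABI define. Since TensorFlow may not be installed where you read this, the sketch below parses a flags list of that shape; the sample list is illustrative, not taken from a real wheel:

```python
def cxx11_abi_setting(compile_flags):
    """Return the 0/1 value of -D_GLIBCXX_USE_CXX11_ABI, or None if absent."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    for flag in compile_flags:
        if flag.startswith(prefix):
            return int(flag[len(prefix):])
    return None

# In a real environment the flags come from TensorFlow itself:
#   import tensorflow as tf
#   flags = tf.sysconfig.get_compile_flags()
# Illustrative sample matching the shape of those flags (new wheels set ABI=1):
sample_flags = [
    "-I/usr/lib/python3/site-packages/tensorflow/include",
    "-D_GLIBCXX_USE_CXX11_ABI=1",
]
print(cxx11_abi_setting(sample_flags))  # -> 1
```

If this returns 1, your own C++ code must also be built with -D_GLIBCXX_USE_CXX11_ABI=1 (or simply leave the flag unset on a modern GCC, where 1 is the default).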

Q2. What kinds of breakages during the build process are most likely related to these changes?

  1. Linker errors or undefined-reference errors, usually involving __cxx11 symbols
  2. RuntimeError: random_device could not be read
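A quick way to tell which ABI a prebuilt library was compiled against is to look for std::__cxx11 names in its symbol table (e.g. in the output of nm -DC libfoo.so); their presence implies the new ABI. Here is a rough sketch of that heuristic over sample nm output — the symbol lines below are illustrative, not from a real TensorFlow binary:

```python
def uses_cxx11_abi(nm_lines):
    """Heuristic: any demangled symbol mentioning std::__cxx11 implies ABI=1."""
    return any("std::__cxx11::" in line for line in nm_lines)

# Illustrative lines in the style of `nm -DC` output:
sample = [
    " U std::__cxx11::basic_string<char, std::char_traits<char>, "
    "std::allocator<char> >::basic_string()",
    " U malloc",
]
print(uses_cxx11_abi(sample))  # -> True
```

If a library you link against shows no __cxx11 symbols while the new TF wheels do (or vice versa), that mismatch is the likely source of the undefined-reference errors above.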

Thank you!


FYI @bhack @seanpmorgan

Thanks very much @angerson and team!

The wheels for tf-nightly and tf-nightly-gpu as of version dev20220322 (today, March 22) are now manylinux2014-compliant (a.k.a. manylinux_2_17) and have been built with --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=1. We also started building libtensorflow nightlies with the new toolchains, switched TensorFlow’s bazelrc to use the new toolchains for our release build configurations, and updated the SIG Build Dockerfiles.
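Since PEP 600 defines manylinux_2_17 as the perennial spelling of manylinux2014, a compliant wheel may carry either tag (or both) in its filename. A small sketch for checking the platform tag of a downloaded wheel; the filename below is hypothetical, written in the nightly naming style:

```python
def is_manylinux2014(wheel_filename):
    """Check the platform tag of a wheel filename (name-ver-py-abi-plat.whl)."""
    platform_tag = wheel_filename[:-len(".whl")].split("-")[-1]
    # PEP 600: manylinux_2_17 is the perennial alias of manylinux2014;
    # a wheel may carry several dot-separated platform tags.
    return any(t in ("manylinux2014_x86_64", "manylinux_2_17_x86_64")
               for t in platform_tag.split("."))

# Hypothetical filename for illustration only:
name = "tf_nightly-2.9.0.dev20220322-cp39-cp39-manylinux_2_17_x86_64.whl"
print(is_manylinux2014(name))  # -> True
```

For a more thorough compliance check of a wheel's actual contents, the auditwheel tool is the standard choice.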

Please try out the new wheels, the new build toolchains, and the upgraded Dockerfiles to build TensorFlow, and let us know in this thread if you run into any new kinds of problems.


Hi
Please note that setting -D_GLIBCXX_USE_CXX11_ABI=1 is ignored by the toolchain offered by the official manylinux2014 Docker image. This apparently has to do with a bug in CentOS 7. Nevertheless, as specified by PEP 599, images produced with the manylinux2014 Docker image get the manylinux2014 tag. So switching to the C++11 ABI actually seems to make TF non-compliant with manylinux2014.

This is a problem for downstream projects that build using the official PyPA manylinux Docker images.

Please comment

@perfinion Any thoughts on this?

If I understand the situation correctly, this only affects packages that talk to TF at the C++ level. We haven’t heard from any such groups yet, and I don’t know how many are likely to be affected. How is this affecting you? The easy solution is to use the SIG Build images linked above, which many downstream teams already use.

We are in exactly this situation: linking with TF at the C++ level and using the PyPA Docker images for building. We are currently investigating two solutions: migrating to the SIG Build images or migrating to manylinux_2_24. ATM both seem to work.
Fortunately, we neither link against any other PyPI packages nor do other PyPI packages link against us, i.e. we are not affected by incompatibilities with the broader PyPI ecosystem, so we'll just follow TF's ABI switch.

@Pawel_Piskorski Which downstream project is this? Sorry if you mentioned it and I missed it. Do you have some links to your release workflow/scripts?

I’ve heard of that CentOS bug, but there isn’t really any good solution in general to the whole CXX11_ABI=1 situation. All Linux distros did full rebuilds many years ago, but Python is in a weird position right now. None of the manylinux specs mention CXX11_ABI at all :frowning: . CXX11_ABI=0/1 is orthogonal to manylinux for the most part, and we definitely need to move to =1 eventually, so doing the CXX11_ABI=1 switch together with the manylinux upgrade all at once seemed the least-painful option :confused: .

As for moving to new images, both our SIG Build and the manylinux_2_24 containers are reasonable. The biggest difference most likely comes down to the SIG Build ones having the GPU tooling ready to go, but I’m not sure whether that matters to your project.

Again, sorry for the trouble caused, but hopefully things will be better once we get this transition over with :smiley:


@perfinion thanks, but I can’t disclose the project and the release scripts are not public anyway.
Agree that taking that leap is the best way forward. C++11 is over a decade old, so yeah, high time :slight_smile: