What's New in TensorFlow 2.5

Following version 2.4, the Google Brain team has released TensorFlow 2.5.0. The release brings several new and improved features: it adds support for Python 3.9, and the TensorFlow pip packages are now built with CUDA 11.2 and cuDNN 8.1.0. This article covers the major updates and features of TensorFlow 2.5.0.

Major Features and Improvements

Support for Python 3.9 has been added.
tf.data:
- tf.data service now supports strict round-robin reads, which is useful for synchronous training workloads where example sizes vary. With strict round robin reads, users can guarantee that consumers get similar-sized examples in the same step.
- tf.data service now supports optional compression. Previously data would always be compressed, but now you can disable compression by passing compression=None to tf.data.experimental.service.distribute(...).
- tf.data.Dataset.batch() now supports num_parallel_calls and deterministic arguments. num_parallel_calls indicates that multiple input batches should be computed in parallel. With num_parallel_calls set, the deterministic argument controls whether batches may be produced in non-deterministic order (deterministic=False).
- Options returned by tf.data.Dataset.options() are no longer mutable.
- tf.data input pipelines can now be executed in debug mode, which disables any asynchrony, parallelism, or non-determinism and forces Python execution (as opposed to trace-compiled graph execution) of user-defined functions passed into transformations such as map. The debug mode can be enabled through tf.data.experimental.enable_debug_mode().
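As a minimal sketch of the new batch() arguments described above, the snippet below batches a small range dataset in parallel while keeping the output order deterministic. The dataset contents and batch size are illustrative choices, not taken from the release notes.

```python
import tensorflow as tf

# A toy dataset of the integers 0..9 (illustrative only).
dataset = tf.data.Dataset.range(10)

# Compute batches in parallel; deterministic=True keeps them in input order.
batched = dataset.batch(
    4,
    num_parallel_calls=tf.data.AUTOTUNE,
    deterministic=True,
)

batches = [batch.numpy().tolist() for batch in batched]
print(batches)
```

Passing deterministic=False instead would allow batches to arrive out of order, which may help throughput when ordering does not matter.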
tf.lite
- Enabled the new MLIR-based quantization backend by default
- The new backend is used for 8-bit full-integer post-training quantization
- The new backend removes the redundant rescales and fixes some bugs (shared weight/bias, extremely small scales, etc)
- Set experimental_new_quantizer in tf.lite.TFLiteConverter to False to disable this change
tf.keras
- tf.keras.metrics.AUC now supports logit predictions.
- Enabled a new supported input type in Model.fit, tf.keras.utils.experimental.DatasetCreator, which takes a callable, dataset_fn. DatasetCreator is intended to work across all tf.distribute strategies, and is the only input type supported for Parameter Server strategy.
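The logit support in tf.keras.metrics.AUC can be illustrated with a short sketch. It assumes the from_logits constructor argument is how logit inputs are enabled; the labels and logit values are made up for illustration.

```python
import tensorflow as tf

# from_logits=True tells the metric to treat predictions as raw logits.
auc = tf.keras.metrics.AUC(from_logits=True)

# Illustrative data: negatives have negative logits, positives have positive ones.
y_true = [0, 0, 1, 1]
logits = [-2.0, -0.5, 0.5, 3.0]

auc.update_state(y_true, logits)
value = float(auc.result())
print(value)
```

Because the two classes are perfectly separated in this toy data, the reported AUC should be close to 1.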
tf.distribute
- tf.distribute.experimental.ParameterServerStrategy now supports training with Keras Model.fit when used with DatasetCreator.
- Creating tf.random.Generator under tf.distribute.Strategy scopes is now allowed (except for tf.distribute.experimental.CentralStorageStrategy and tf.distribute.experimental.ParameterServerStrategy). Different replicas will get different random-number streams.
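A rough sketch of creating a tf.random.Generator under a strategy scope follows. It assumes MirroredStrategy as the strategy (one of those not excluded above) and uses an arbitrary seed and sample shape; on a machine with a single device there is only one replica, so only one stream is drawn.

```python
import tensorflow as tf

# MirroredStrategy falls back to a single replica when only one device exists.
strategy = tf.distribute.MirroredStrategy()

# Creating the generator inside the scope is what TensorFlow 2.5 newly allows.
with strategy.scope():
    generator = tf.random.Generator.from_seed(1234)

def draw():
    # Each replica draws from its own random-number stream.
    return generator.normal([3])

per_replica = strategy.run(draw)
samples = strategy.experimental_local_results(per_replica)
num_replicas = strategy.num_replicas_in_sync
print(num_replicas, [s.numpy() for s in samples])
```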
TPU embedding support
- Added profile_data_directory to EmbeddingConfigSpec in _tpu_estimator_embedding.py. This allows embedding lookup statistics gathered at runtime to be used in embedding layer partitioning decisions.
PluggableDevice
- Third-party devices can now connect to TensorFlow as plug-ins through the StreamExecutor C API and the PluggableDevice interface.
- Add custom ops and kernels through kernel and op registration C API.
- Register custom graph optimization passes with graph optimization C API.
oneAPI Deep Neural Network Library (oneDNN) CPU performance optimizations from Intel-optimized TensorFlow are now available in the official x86-64 Linux and Windows builds.
- They are off by default. Enable them by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1.
- We do not recommend using them in GPU systems, as they have not been sufficiently tested with GPUs yet.
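One way to opt in to the oneDNN optimizations from Python is sketched below. It assumes the environment variable has to be set before TensorFlow is imported so that the runtime sees it at load time; setting it in the shell before launching Python is an equivalent alternative.

```python
import os

# Set the flag before importing TensorFlow (assumed necessary so the
# runtime picks it up when the library is loaded).
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

import tensorflow as tf  # noqa: E402  (import intentionally follows the env setup)

onednn_flag = os.environ.get("TF_ENABLE_ONEDNN_OPTS")
print("TF_ENABLE_ONEDNN_OPTS =", onednn_flag, "| TensorFlow", tf.__version__)
```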
TensorFlow pip packages are now built with CUDA 11.2 and cuDNN 8.1.0.

Breaking Changes

The TF_CPP_MIN_VLOG_LEVEL environment variable has been renamed to TF_CPP_MAX_VLOG_LEVEL, which correctly describes its effect.

Bug Fixes and Other Changes

tf.keras:
- Preprocessing layers API consistency changes:
  - StringLookup added output_mode, sparse, and pad_to_max_tokens arguments with same semantics as TextVectorization.
  - IntegerLookup added output_mode, sparse, and pad_to_max_tokens arguments with same semantics as TextVectorization. Renamed max_values, oov_value and mask_value to max_tokens, oov_token and mask_token to align with StringLookup and TextVectorization.
  - TextVectorization default for pad_to_max_tokens switched to False.
  - CategoryEncoding no longer supports adapt; IntegerLookup now supports equivalent functionality. max_tokens argument renamed to num_tokens.
  - Discretization added num_bins argument for learning bins boundaries through calling adapt on a dataset. Renamed bins argument to bin_boundaries for specifying bins without adapt.
- Improvements to model saving/loading:
  - model.load_weights now accepts paths to saved models.
- Keras inputs can now be created directly from arbitrary tf.TypeSpecs.
- Two new learning rate schedules added: tf.keras.optimizers.schedules.CosineDecay and tf.keras.optimizers.schedules.CosineDecayRestarts.
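To give a feel for the CosineDecay schedule mentioned above, here is a small sketch that evaluates it at a few steps. The initial learning rate and number of decay steps are arbitrary illustrative values.

```python
import tensorflow as tf

# Decay from 0.1 towards 0 over 100 steps along a half cosine wave.
schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.1,
    decay_steps=100,
)

lr_start = float(schedule(0))    # beginning of the schedule
lr_mid = float(schedule(50))     # halfway through
lr_end = float(schedule(100))    # end of the schedule
print(lr_start, lr_mid, lr_end)
```

The schedule object can be passed as the learning_rate argument of a Keras optimizer in place of a fixed float.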
tf.data:
- Exposing tf.data.experimental.ExternalStatePolicy, which can be used to control how external state should be handled during dataset serialization or iterator checkpointing.
- Changing tf.data.experimental.save to store the type specification of the dataset elements. This avoids the need for explicitly specifying the element_spec argument of tf.data.experimental.load when loading the previously saved dataset.
- Add .element_spec property to tf.data.DatasetSpec to access the inner spec. This can be used to extract the structure of nested datasets.
- Add tf.data.experimental.AutoShardingPolicy.HINT which can be used to provide hints to tf.distribute-based auto-sharding as to where in the input pipeline to insert sharding transformations.
- Make tf.data.Options persistent across tf.function and GraphDef boundaries.
XLA compilation:
- tf.function(experimental_compile=True) has become a stable API, renamed tf.function(jit_compile=True).
- XLA can now compile MirroredStrategy: the step function passed to strategy.run can now be annotated with jit_compile=True.
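The renamed jit_compile argument can be sketched as follows. The function and its inputs are made up for illustration, and the example assumes an installation where XLA is available on the current device.

```python
import tensorflow as tf

# jit_compile=True asks TensorFlow to compile this function with XLA.
@tf.function(jit_compile=True)
def scaled_sum(x, y):
    return tf.reduce_sum(x * 2.0 + y)

result = float(scaled_sum(tf.constant([1.0, 2.0]), tf.constant([3.0, 4.0])))
print(result)
```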
tf.distribute:
- Rename experimental_prefetch_to_device in tf.distribute.InputOptions to experimental_fetch_to_device to better reflect the purpose.
tf.lite:
- class tflite::Subgraph:
  - Removed the tensors() method and the non-const overload of the nodes_and_registration() method, both of which were previously documented as temporary and to be removed.
    - Uses of tensors() can be replaced by calling the existing methods tensors_size() and tensor(int).
    - Uses of the non-const overload of nodes_and_registration can be replaced by calling the existing methods nodes_size() and context(), and then calling the GetNodeAndRegistration method in the TfLiteContext returned by context().
- NNAPI
  - Removed deprecated Interpreter::UseNNAPI(bool) C++ API.
    - Use NnApiDelegate() and related delegate configuration methods directly.
  - Replaced the algorithm that computes the model cache key with one guaranteed to be stable across runs.
- 16-bit quantization
  - Added int16x8 support for ABS, REDUCE_MAX and REDUCE_MIN operators.
  - Additional tests and fixes for ADD and SUB operators.
- Added support for saved model’s session initializer through TFLiteConverter.from_saved_model.
- Added DEPTH_TO_SPACE support in post-training quantization.
- Added dynamic range quantization support for the BatchMatMul op.
  - Both symmetric and asymmetric quantized input tensors are supported.
- Add RFFT2D as builtin op. (RFFT2D also supports RFFTD.) Currently only supports float32 input.
- Add 5D support to SLICE op.
- TFLite supports SignatureDef:
  - TFLiteConverter exports models with SignatureDef.
  - Interpreter supports getting a list of signatures and getting a callable function for a given SignatureDef.
- Add int8 support for ReshapeV2.
- Add experimental support for optimization with sparsity.
- Add nominal support for unsigned 32-bit integer tensor types. Note that very few TFLite kernels support this type natively, so its use in mobile ML authoring is generally discouraged.
- Add support for static hash tables through TFLiteConverter.from_saved_model.
- The Python TF Lite Interpreter bindings now have an option experimental_preserve_all_tensors to aid in debugging conversion.
- Quantized x86 execution defaults to Ruy GEMM library for platforms with AVX support.
- Deprecate tf.compat.v1.lite.experimental.get_potentially_supported_ops. Use tf.lite.TFLiteConverter directly to check whether a model is convertible.
- Add support to select one of three different built-in op resolvers.
- Enabled post training with calibrations for models that require user-provided TensorFlow Lite custom op libraries via converter.target_spec._experimental_custom_op_registerers, used in the Python Interpreter API.
TF Core:
- Corrected higher-order gradients of control flow constructs (tf.cond, tf.while_loop, and compositions like tf.foldl) computed with tf.GradientTape inside a tf.function.
- Changed the default step size in gradient_checker_v2.compute_gradients to be exactly representable as a binary floating point number. This avoids polluting gradient approximations needlessly, which in some cases leads to false negatives in op gradient tests.
- Added tf.config.experimental.get_memory_info, returning a dict with the current and peak memory usage. Deprecated tf.config.experimental.get_memory_usage in favor of this new function.
- Extended tf.config.experimental.enable_tensor_float_32_execution to control Tensor-Float-32 evaluation in RNNs.
- Added an ‘experimental_payloads’ field to tf.errors.OpError and its subclasses to support more detailed error reporting. This is inspired by Abseil Status payloads: https://github.com/abseil/abseil-cpp/blob/master/absl/status/status.h
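The new memory query listed above can be tried with a sketch like the one below. It assumes a device named "GPU:0" and guards the call so that it is skipped when no GPU is visible, since memory statistics may not be available for other devices.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    # Returns a dict with the current and peak memory usage in bytes.
    memory_info = tf.config.experimental.get_memory_info("GPU:0")
else:
    # No GPU visible; skip the query in this sketch.
    memory_info = None

print(memory_info)
```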
tf.summary:
- New tf.summary.graph allows manual write of TensorFlow graph (tf.Graph or tf.compat.v1.GraphDef) as a summary. This is not a replacement for the trace-based API.
Set /d2ReducedOptimizeHugeFunctions by default for Windows builds. This provides a big compile-time speedup, and effectively raises the minimum supported MSVC version to 16.4 (current: 16.8).
- See: https://groups.google.com/a/tensorflow.org/d/topic/build/SsW98Eo7l3o/discussion
TensorRT
- Removed the deprecated session_config parameter for the TF1-TRT converter TrtGraphConverter. Previously, we issued a warning when the value of the parameter is not None.
- The TF2-TRT converter TrtGraphConverterV2 takes an object of class TrtConversionParams as a parameter. Removed three deprecated fields from this class: rewriter_config_template, is_dynamic_op, and max_batch_size. Previously, we issued a warning when the value of rewriter_config_template was not None, issued an error when the value of is_dynamic_op was not True, and did not use the value of max_batch_size for building TensorRT engines. Added the parameter use_dynamic_shape to enable dynamic shape support; the default is to disable dynamic shape support. Added dynamic_shape_profile_strategy for selecting a dynamic shape profile strategy; the default profile strategy is Range.
- Issue a warning when function get_tensorrt_rewriter_config is used.
TF XLA
- Add new enum value MLIR_BRIDGE_ROLLOUT_SAFE_MODE_ENABLED to tf.config.experimental.mlir_bridge_rollout to enable a “safe” mode. This runs the MLIR bridge only when an analysis of the graph determines that it is safe to run.
- Add new enum value MLIR_BRIDGE_ROLLOUT_SAFE_MODE_FALLBACK_ENABLED to tf.config.experimental.mlir_bridge_rollout to enable a fallback for the MLIR bridge in a “safe” mode. This runs the MLIR bridge in a FallbackEnabled mode when an analysis of the graph determines that the graph does not have unsupported features.
Deterministic Op Functionality:
- Add determinism-unimplemented exception-throwing to the segment-sum ops. When the environment variable TF_DETERMINISTIC_OPS is set to "true" or "1" (when op-determinism is expected), an attempt to run the following ops on a GPU will throw tf.errors.UnimplementedError (with an understandable message) when data is a floating-point type, including complex types (if supported): tf.math.segment_prod, tf.math.segment_sum, tf.math.unsorted_segment_mean, tf.math.unsorted_segment_sqrt_n, tf.math.unsorted_segment_prod, tf.math.unsorted_segment_sum, and therefore also tf.convert_to_tensor when value is of type tf.IndexedSlices (such as in the back prop through tf.gather into a dense embedding). See issue 39751 which this change addresses, but does not solve. This exception-throwing behavior can be disabled by setting the environment variable TF_DISABLE_SEGMENT_REDUCTION_OP_DETERMINISM_EXCEPTIONS to "true" or "1". For more information about these changes, see the description in pull request 47772.
- In previous versions of TensorFlow, when a GPU was available, tf.sparse.sparse_dense_matmul introduced truly random noise in the forward path for data of type tf.float32 but not for data of type tf.float64 (for which there was no GPU implementation). In this current release, GPU support for other floating-point types (tf.float16, tf.float64, tf.complex64, and tf.complex128) has been added for this op. If you were relying on the determinism of the tf.float64 CPU implementation being automatically selected because of the absence of the tf.float64 GPU implementation, you will either need to force the op to run on the CPU or use a different data type.
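Forcing tf.sparse.sparse_dense_matmul onto the CPU, as suggested above for tf.float64 data, might look like the following sketch. The matrices are small illustrative values, and tf.device is used here as one possible way to pin the op.

```python
import tensorflow as tf

# A 2x2 sparse diagonal matrix with float64 values (illustrative data).
sparse = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 1]],
    values=tf.constant([1.0, 2.0], dtype=tf.float64),
    dense_shape=[2, 2],
)
dense = tf.constant([[1.0, 2.0], [3.0, 4.0]], dtype=tf.float64)

# Pin the op to the CPU so the CPU implementation is used even when a GPU exists.
with tf.device("/CPU:0"):
    product = tf.sparse.sparse_dense_matmul(sparse, dense)

print(product.numpy())
```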
Security
- Fixes a heap buffer overflow in RaggedBinCount (CVE-2021-29512)
- Fixes a heap out of bounds write in RaggedBinCount (CVE-2021-29514)
- Fixes a type confusion during tensor casts which leads to dereferencing null pointers (CVE-2021-29513)
- Fixes a reference binding to null pointer in MatrixDiag* ops (CVE-2021-29515)
- Fixes a null pointer dereference via invalid Ragged Tensors (CVE-2021-29516)
- Fixes a division by zero in Conv3D (CVE-2021-29517)
- Fixes vulnerabilities where session operations in eager mode lead to null pointer dereferences (CVE-2021-29518)
- Fixes a CHECK-fail in SparseCross caused by type confusion (CVE-2021-29519)
- Fixes a segfault in SparseCountSparseOutput (CVE-2021-29521)
- Fixes a heap buffer overflow in Conv3DBackprop* (CVE-2021-29520)
- Fixes a division by 0 in Conv3DBackprop* (CVE-2021-29522)
- Fixes a CHECK-fail in AddManySparseToTensorsMap (CVE-2021-29523)
- Fixes a division by 0 in Conv2DBackpropFilter (CVE-2021-29524)
- Fixes a division by 0 in Conv2DBackpropInput (CVE-2021-29525)
- Fixes a division by 0 in Conv2D (CVE-2021-29526)
- Fixes a division by 0 in QuantizedConv2D (CVE-2021-29527)
- Fixes a division by 0 in QuantizedMul (CVE-2021-29528)
- Fixes vulnerabilities caused by invalid validation in SparseMatrixSparseCholesky (CVE-2021-29530)
- Fixes a heap buffer overflow caused by rounding (CVE-2021-29529)
- Fixes a CHECK-fail in tf.raw_ops.EncodePng (CVE-2021-29531)
- Fixes a heap out of bounds read in RaggedCross (CVE-2021-29532)
- Fixes a CHECK-fail in DrawBoundingBoxes (CVE-2021-29533)
- Fixes a heap buffer overflow in QuantizedMul (CVE-2021-29535)
- Fixes a CHECK-fail in SparseConcat (CVE-2021-29534)
- Fixes a heap buffer overflow in QuantizedResizeBilinear (CVE-2021-29537)
- Fixes a heap buffer overflow in QuantizedReshape (CVE-2021-29536)
- Fixes a division by zero in Conv2DBackpropFilter (CVE-2021-29538)
- Fixes a heap buffer overflow in Conv2DBackpropFilter (CVE-2021-29540)
- Fixes a heap buffer overflow in StringNGrams (CVE-2021-29542)
- Fixes a null pointer dereference in StringNGrams (CVE-2021-29541)
- Fixes a CHECK-fail in QuantizeAndDequantizeV4Grad (CVE-2021-29544)
- Fixes a CHECK-fail in CTCGreedyDecoder (CVE-2021-29543)
- Fixes a heap buffer overflow in SparseTensorToCSRSparseMatrix (CVE-2021-29545)
- Fixes a division by 0 in QuantizedBiasAdd (CVE-2021-29546)
- Fixes a heap out of bounds in QuantizedBatchNormWithGlobalNormalization (CVE-2021-29547)
- Fixes a division by 0 in QuantizedBatchNormWithGlobalNormalization (CVE-2021-29548)
- Fixes a division by 0 in QuantizedAdd (CVE-2021-29549)
- Fixes a division by 0 in FractionalAvgPool (CVE-2021-29550)
- Fixes an OOB read in MatrixTriangularSolve (CVE-2021-29551)
- Fixes a heap OOB in QuantizeAndDequantizeV3 (CVE-2021-29553)
- Fixes a CHECK-failure in UnsortedSegmentJoin (CVE-2021-29552)
- Fixes a division by 0 in DenseCountSparseOutput (CVE-2021-29554)
- Fixes a division by 0 in FusedBatchNorm (CVE-2021-29555)
- Fixes a division by 0 in SparseMatMul (CVE-2021-29557)
- Fixes a division by 0 in Reverse (CVE-2021-29556)
- Fixes a heap buffer overflow in SparseSplit (CVE-2021-29558)
- Fixes a heap OOB access in unicode ops (CVE-2021-29559)
- Fixes a heap buffer overflow in RaggedTensorToTensor (CVE-2021-29560)
- Fixes a CHECK-fail in LoadAndRemapMatrix (CVE-2021-29561)
- Fixes a CHECK-fail in tf.raw_ops.IRFFT (CVE-2021-29562)
- Fixes a CHECK-fail in tf.raw_ops.RFFT (CVE-2021-29563)
- Fixes a null pointer dereference in EditDistance (CVE-2021-29564)
- Fixes a null pointer dereference in SparseFillEmptyRows (CVE-2021-29565)
- Fixes a heap OOB access in Dilation2DBackpropInput (CVE-2021-29566)
- Fixes a reference binding to null in ParameterizedTruncatedNormal (CVE-2021-29568)
- Fixes a set of vulnerabilities caused by lack of validation in SparseDenseCwiseMul (CVE-2021-29567)
- Fixes a heap out of bounds read in MaxPoolGradWithArgmax (CVE-2021-29570)
- Fixes a heap out of bounds read in RequantizationRange (CVE-2021-29569)
- Fixes a memory corruption in DrawBoundingBoxesV2 (CVE-2021-29571)
- Fixes a reference binding to nullptr in SdcaOptimizer (CVE-2021-29572)
- Fixes an overflow and a denial of service in tf.raw_ops.ReverseSequence (CVE-2021-29575)
- Fixes a division by 0 in MaxPoolGradWithArgmax (CVE-2021-29573)
- Fixes an undefined behavior in MaxPool3DGradGrad (CVE-2021-29574)
- Fixes a heap buffer overflow in MaxPool3DGradGrad (CVE-2021-29576)
- Fixes a heap buffer overflow in AvgPool3DGrad (CVE-2021-29577)
- Fixes an undefined behavior and a CHECK-fail in FractionalMaxPoolGrad (CVE-2021-29580)
- Fixes a heap buffer overflow in FractionalAvgPoolGrad (CVE-2021-29578)
- Fixes a heap buffer overflow in MaxPoolGrad (CVE-2021-29579)
- Fixes a segfault in CTCBeamSearchDecoder (CVE-2021-29581)
- Fixes a heap OOB read in tf.raw_ops.Dequantize (CVE-2021-29582)
- Fixes a CHECK-fail due to integer overflow (CVE-2021-29584)
- Fixes a heap buffer overflow and undefined behavior in FusedBatchNorm (CVE-2021-29583)
- Fixes a division by zero in padding computation in TFLite (CVE-2021-29585)
- Fixes a division by zero in optimized pooling implementations in TFLite (CVE-2021-29586)
- Fixes a division by zero in TFLite’s implementation of SpaceToDepth (CVE-2021-29587)
- Fixes a division by zero in TFLite’s implementation of GatherNd (CVE-2021-29589)
- Fixes a division by zero in TFLite’s implementation of TransposeConv (CVE-2021-29588)
- Fixes a heap OOB read in TFLite’s implementation of Minimum or Maximum (CVE-2021-29590)
- Fixes a null pointer dereference in TFLite’s Reshape operator (CVE-2021-29592)
- Fixes a stack overflow due to looping TFLite subgraph (CVE-2021-29591)
- Fixes a division by zero in TFLite’s implementation of DepthToSpace (CVE-2021-29595)
- Fixes a division by zero in TFLite’s convolution code (CVE-2021-29594)
- Fixes a division by zero in TFLite’s implementation of EmbeddingLookup (CVE-2021-29596)
- Fixes a division by zero in TFLite’s implementation of BatchToSpaceNd (CVE-2021-29593)
- Fixes a division by zero in TFLite’s implementation of SpaceToBatchNd (CVE-2021-29597)
- Fixes a division by zero in TFLite’s implementation of SVDF (CVE-2021-29598)
- Fixes a division by zero in TFLite’s implementation of Split (CVE-2021-29599)
- Fixes a division by zero in TFLite’s implementation of OneHot (CVE-2021-29600)
- Fixes a division by zero in TFLite’s implementation of DepthwiseConv (CVE-2021-29602)
- Fixes a division by zero in TFLite’s implementation of hashtable lookup (CVE-2021-29604)
- Fixes an integer overflow in TFLite concatenation (CVE-2021-29601)
- Fixes an integer overflow in TFLite memory allocation (CVE-2021-29605)
- Fixes a heap OOB write in TFLite (CVE-2021-29603)
- Fixes a heap OOB read in TFLite (CVE-2021-29606)
- Fixes a heap OOB and null pointer dereference in RaggedTensorToTensor (CVE-2021-29608)
- Fixes vulnerabilities caused by incomplete validation in SparseAdd (CVE-2021-29609)
- Fixes vulnerabilities caused by incomplete validation in SparseSparseMinimum (CVE-2021-29607)
- Fixes vulnerabilities caused by incomplete validation in SparseReshape (CVE-2021-29611)
- Fixes vulnerabilities caused by invalid validation in QuantizeAndDequantizeV2 (CVE-2021-29610)
- Fixes a heap buffer overflow in BandedTriangularSolve (CVE-2021-29612)
- Fixes vulnerabilities caused by incomplete validation in tf.raw_ops.CTCLoss (CVE-2021-29613)
- Fixes an interpreter crash from vulnerabilities in tf.io.decode_raw (CVE-2021-29614)
- Fixes a stack overflow in ParseAttrValue with nested tensors (CVE-2021-29615)
- Fixes a null dereference in Grappler’s TrySimplify (CVE-2021-29616)
- Fixes a crash in tf.transpose with complex inputs (CVE-2021-29618)
- Fixes a crash in tf.strings.substr due to CHECK-fail (CVE-2021-29617)
- Fixes a segfault in tf.raw_ops.SparseCountSparseOutput (CVE-2021-29619)
- Fixes a segfault in tf.raw_ops.ImmutableConst (CVE-2021-29539)
- Updates curl to 7.76.0 to handle CVE-2020-8169, CVE-2020-8177, CVE-2020-8231, CVE-2020-8284, CVE-2020-8285 and CVE-2020-8286.
Other
- Added show_debug_info to mlir.convert_graph_def and mlir.convert_function.
- Added Arm Compute Library (ACL) support to --config=mkl_aarch64 build.
