Multimodal Data: AirIO Simplifying Data Handling

AirIO is a library for loading, processing, and feeding multimodal data into sequence models. It provides simple APIs for writing reusable specifications that encapsulate the data loading and transformation steps used in training, inference, and evaluation. AirIO supports a variety of storage formats (e.g. SSTable), services (e.g. TFDS), and data loaders (e.g. Grain and tf.data). It is fully compatible with frameworks such as JAX and TensorFlow.

The following are guiding principles for AirIO development:

  • Clear abstractions
    • Agnostic encapsulation over data loading and processing steps
    • Compatible with Grain, tf.data, etc.
  • Clear interfaces with other components
    • Clear boundary with evaluation libraries
    • Ability to combine a variety of data formats
    • Simple bridges to smooth decoupling
  • Verifiable data pipelines
    • Plug in inspection and visualization tools
    • Easy path to setting up tests
  • Good software design patterns
    • No global state
    • Composition over inheritance
    • Loose coupling with data, eval, and other API layers
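To make a few of these principles concrete (loader-agnostic encapsulation, no global state, composition over inheritance), here is a minimal sketch in plain Python. It does not use AirIO's actual API: `TaskSpec`, `lowercase_text`, `add_length`, and `in_memory_source` are hypothetical names invented for illustration, and the in-memory list stands in for a real loader such as Grain or tf.data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, Sequence

# Hypothetical types for illustration only; these are NOT AirIO's API.
Example = dict[str, Any]
Preprocessor = Callable[[Example], Example]


@dataclass(frozen=True)
class TaskSpec:
    """A reusable spec: a named source plus an ordered list of preprocessors."""

    name: str
    source: Callable[[], Iterable[Example]]  # any loader that yields examples
    preprocessors: Sequence[Preprocessor] = ()

    def examples(self) -> Iterator[Example]:
        # Apply each preprocessor in order; no state lives outside the spec.
        for example in self.source():
            for fn in self.preprocessors:
                example = fn(example)
            yield example


def lowercase_text(example: Example) -> Example:
    # Return a new dict rather than mutating the input.
    return {**example, "text": example["text"].lower()}


def add_length(example: Example) -> Example:
    # Attach a whitespace-token count as a derived feature.
    return {**example, "length": len(example["text"].split())}


def in_memory_source() -> Iterable[Example]:
    # Stand-in for a real data loader (assumption for this sketch).
    return [{"text": "Hello World"}, {"text": "AirIO loads DATA"}]


spec = TaskSpec(
    name="toy_task",
    source=in_memory_source,
    preprocessors=(lowercase_text, add_length),
)

processed = list(spec.examples())
```

Because the spec is composed from plain callables, swapping the source or reordering the preprocessors needs no subclassing, and each step can be unit-tested on its own.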

Installation

From source

git clone https://github.com/google/airio.git
cd airio
pip install -e .

Download Details:

Author: google

Official GitHub: https://github.com/google/airio

License: Apache-2.0

