
feat: StableHLO as a front end for HEIR #738

Open
asraa opened this issue Jun 17, 2024 · 2 comments

asraa commented Jun 17, 2024

I've recently learned about StableHLO, and I'm pretty convinced it should be a frontend to HEIR:

  1. StableHLO is an open-source-first project aiming to be a standard inside
    and outside of Google. It aims to be a portability layer between ML
    frameworks and ML compilers and is currently used by TensorFlow, JAX, and
    PyTorch, as well as XLA, IREE, and more.

  2. StableHLO has lowerings to standard MLIR without the use of bufferization,
    preserving the original tensors. This would create a pathway from high-level
    ML programs to RLWE schemes that use tensor-based types and passes.

  3. This would also enable a frontend for quantized PyTorch models (for example,
    Zama takes QAT PyTorch models and ingests them into concrete-ml through the
    ONNX format).

  4. Support for quantizing TensorFlow models using StableHLO quantization

  5. Support for qKeras models through a qKeras compilation to HLO. HLO has
    parity guarantees with StableHLO. qKeras offers full integer quantization,
    so it would enable us to quantize models more efficiently than the existing
    use of TensorFlow Lite quantization.

Will this replace TOSA?

Probably not, and it would be nice to keep both representations.

StableHLO to standard MLIR

I have some internal code that lowers StableHLO to standard MLIR (using the func, affine, tensor, and arith dialects - notably not memref). Some of it uses passes from TensorFlow's XLA compiler right now, so I'll attach a PR with the added dep, and perhaps create a standalone tool depending on feedback.
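
To make the "standard MLIR on tensors" idea concrete, here is a hand-written sketch (not output of the internal pipeline, and not verified against any tool) of what a tiny StableHLO module and a tensor-preserving standard-dialect lowering might look like:

```python
# Hand-written illustration only: a minimal StableHLO module (elementwise add)
# and a plausible lowering to func/affine/tensor/arith that keeps tensor
# semantics, i.e. no memref and no bufferization.

STABLEHLO_INPUT = """
func.func @main(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %0 = stablehlo.add %a, %b : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Possible standard-dialect lowering: an affine loop that threads the result
# tensor through iter_args instead of writing into a buffer.
STANDARD_LOWERING = """
func.func @main(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %init = tensor.empty() : tensor<4xf32>
  %out = affine.for %i = 0 to 4 iter_args(%acc = %init) -> (tensor<4xf32>) {
    %x = tensor.extract %a[%i] : tensor<4xf32>
    %y = tensor.extract %b[%i] : tensor<4xf32>
    %s = arith.addf %x, %y : f32
    %acc2 = tensor.insert %s into %acc[%i] : tensor<4xf32>
    affine.yield %acc2 : tensor<4xf32>
  }
  return %out : tensor<4xf32>
}
"""

# The property that matters for HEIR: no memref ops in the lowered form.
assert "memref" not in STANDARD_LOWERING
print("lowering stays on tensors")
```

The point of the sketch is the loop-carried tensor via `iter_args`, which is how value-semantic (unbufferized) loops are expressed in MLIR.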


asraa commented Jun 17, 2024

The internal code that runs the StableHLO to standard MLIR lowering does:

  1. stablehlo-legalize-to-hlo
  2. Create XLA HLO module
  3. Partition computation using XLA utils
  4. Convert subgraphs to MLIR functions using standard MLIR <- this is where the (internal) linalg-to-affine-loops-on-tensors lowerings live
  5. Simplify some bound checks and canonicalize.
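
The five steps above can be sketched as an ordered pipeline description. Note the hedging: only `stablehlo-legalize-to-hlo` and `canonicalize` are real pass names from this thread; the middle three stage labels are hypothetical placeholders for the XLA-utility steps, since those run through C++ helpers rather than named passes.

```python
# Ordered description of the five lowering stages listed above.
# Stages marked "hypothetical" are placeholder names for internal XLA-based
# steps, not real registered pass names.
PIPELINE = [
    "stablehlo-legalize-to-hlo",     # 1. real pass: StableHLO -> MHLO/HLO
    "build-xla-hlo-module",          # 2. hypothetical: create an XLA HLO module
    "partition-computation",         # 3. hypothetical: partition via XLA utils
    "subgraphs-to-standard-mlir",    # 4. hypothetical: linalg -> affine loops on tensors
    "canonicalize",                  # 5. real pass: simplify bound checks, canonicalize
]

def describe(pipeline):
    """Render the stages as a comma-joined list, in the style of an
    mlir-opt --pass-pipeline string."""
    return ",".join(pipeline)

print(describe(PIPELINE))
```

This is only a map of the flow; the actual steps 2-4 are driven from C++ against XLA's APIs, not from a textual pass pipeline.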

There is also stablehlo-legalize-to-tosa, although I don't know if it will be fully supported. The downside here is that our TOSA lowering path requires bufferization.

Then there is also stablehlo-legalize-to-linalg.
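
For comparison with the affine path, here is a hand-written sketch of the kind of IR the linalg route produces for an elementwise add: a `linalg.generic` on tensors. This is written from memory of linalg's syntax, not captured from real `stablehlo-legalize-to-linalg` output, so details like attribute ordering may differ.

```python
# Hand-written sketch of a linalg-on-tensors lowering of stablehlo.add.
# Still value-semantic (tensors in, tensors out); bufferization would only
# happen in a later, separate step.
LINALG_LOWERING = """
#map = affine_map<(d0) -> (d0)>
func.func @main(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %init = tensor.empty() : tensor<4xf32>
  %0 = linalg.generic
      {indexing_maps = [#map, #map, #map],
       iterator_types = ["parallel"]}
      ins(%a, %b : tensor<4xf32>, tensor<4xf32>)
      outs(%init : tensor<4xf32>) {
    ^bb0(%x: f32, %y: f32, %out: f32):
      %s = arith.addf %x, %y : f32
      linalg.yield %s : f32
  } -> tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Like the affine path, this stays on tensors; no memref appears.
assert "memref" not in LINALG_LOWERING
print("linalg lowering stays on tensors")
```

The contrast with the TOSA path mentioned above is that linalg-on-tensors keeps the choice of whether (and how) to bufferize in our hands, rather than forcing it early.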

IREE also has a number of passes...


johnmatter commented Jun 18, 2024

Hi Asra. OpenXLA has some good tutorials for lowering JAX/PyTorch/TensorFlow to StableHLO.

For anyone who prefers video to text, this video from April 2024 covers the same information.
