Language model assistance

For inference only, running a CLIP model with ONNX is an interesting option.
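
As a rough illustration of what local CLIP inference through ONNX could look like, here is a minimal sketch. It assumes the model has already been exported to an ONNX file and that the first output of the graph is a batch of embeddings; the helper names (`load_session`, `embed`, `rank_labels`), the input dictionary, and the `scale` value of 100 are illustrative assumptions, not part of any particular export. Only the `onnxruntime` calls `InferenceSession` and `run` are real library API.

```python
import math


def load_session(model_path):
    # Lazy import: the scoring helpers below work without onnxruntime installed.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def embed(session, inputs):
    # `inputs` maps the exported graph's input names to numpy arrays.
    # Assumption: the first output is a (batch, dim) embedding array.
    outputs = session.run(None, inputs)
    return outputs[0][0].tolist()


def cosine(a, b):
    # Cosine similarity between two equal-length, non-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_labels(image_emb, text_embs, scale=100.0):
    # Softmax over scaled cosine similarities, one text embedding per label.
    # `scale` is an assumed temperature; adjust it to match the exported model.
    sims = {label: cosine(image_emb, emb) for label, emb in text_embs.items()}
    top = max(sims.values())
    exps = {label: math.exp(scale * (s - top)) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}
```

With embeddings in hand, `rank_labels(image_emb, {"cat": cat_emb, "dog": dog_emb})` returns one probability per label; the preprocessing (image resizing, tokenization) depends on the specific export and is left out here.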

Otherwise, with an API model: