Text feature extraction

About the demo

This demo lets you try out models for text feature extraction. Two models are available: T5-Efficient-MINI (31M parameters) and T5-Efficient-TINY (16M parameters). Both models were trained on C4 (the Colossal Clean Crawled Corpus), a cleaned version of Common Crawl's web crawl corpus.

Both models are available in original and quantized variants. The models were exported to ONNX format using the fastT5 library. The exported models were then quantized using standard PyTorch functionality.

What happens under the hood: the model generates embedding vectors for both pieces of text and then calculates the cosine similarity between these vectors.
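
The similarity step described above can be sketched in a few lines of Python. This is only an illustration, not the demo's actual code: the function name `cosine_similarity` and the toy three-dimensional vectors are assumptions made for the example, and real embeddings from the T5 models would have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made-up values, not real model output).
emb_text_1 = [1.0, 2.0, 3.0]
emb_same_direction = [2.0, 4.0, 6.0]
emb_opposite = [-1.0, -2.0, -3.0]

print(cosine_similarity(emb_text_1, emb_same_direction))  # close to 1.0
print(cosine_similarity(emb_text_1, emb_opposite))        # close to -1.0
```

The score depends only on the direction of the vectors, not their length, which is why scaling a vector leaves the similarity unchanged.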

How to use the demo:
  1. Select the model and load it.
  2. Type in the first piece of text.
  3. Type in the second piece of text.
  4. Click "Calculate similarity".
  5. The output shows a similarity score between -1 and 1: a score of 1 means the two pieces of text are the same, and a score of -1 means they are completely different.