transformers
datasets
jsonlines
numpy
requests
scikit_learn
scipy
sentence_transformers
torch
tqdm
rich
InstructorEmbedding