requests beautifulsoup4 transformers torch huggingface_hub sentencepiece pymupdf nltk PyPDF2 tiktoken langchain-core langchain langchain-community chromadb openpyxl pypdf spacy sentence-transformers faiss-cpu scikit-learn feedparser