torch
transformers
huggingface_hub
gdown
pymupdf
unidecode
pdf2image
chardet
python-dateutil
datasets
underthesea
accelerate
pytorch-crf==0.7.2
sklearn-crfsuite
scikit-learn
numpy
pandas
install-jdk
seaborn