Error while using the model for zero-shot-audio-classification

#4 opened by mboushaba

Hi,
I'm new here. I'm trying to use this model with the "zero-shot-audio-classification" task as shown in the example (using pipeline), but I get this error:

"Pipeline cannot infer suitable model classes from laion/clap-htsat-unfused".

This is the code I use:

from transformers import pipeline

audio_classifier = pipeline(task="zero-shot-audio-classification", model="laion/clap-htsat-unfused")
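For reference, what I'm ultimately trying to do with that pipeline is something like the call below (the audio file path and the candidate labels are just placeholders, not my real data); the error above is raised before I even get this far:

# placeholder audio path and example labels, using the audio_classifier built above
result = audio_classifier("dog_bark.wav", candidate_labels=["Sound of a dog", "Sound of a vacuum cleaner"])
print(result)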

When I try to use the model with the "feature-extraction" task, I get a longer error:

"raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.clap.configuration_clap.ClapConfig'> for this kind of AutoModel: TFAutoModel.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, CamembertConfig, CLIPConfig, ConvBertConfig, ConvNextConfig, ConvNextV2Config, CTRLConfig, CvtConfig, Data2VecVisionConfig, DebertaConfig, DebertaV2Config, DeiTConfig, DistilBertConfig, DPRConfig, EfficientFormerConfig, ElectraConfig, EsmConfig, FlaubertConfig, FunnelConfig, GPT2Config, GPT2Config, GPTJConfig, GroupViTConfig, HubertConfig, IdeficsConfig, LayoutLMConfig, LayoutLMv3Config, LEDConfig, LongformerConfig, LxmertConfig, MarianConfig, MBartConfig, MistralConfig, MobileBertConfig, MobileViTConfig, MPNetConfig, MT5Config, OpenAIGPTConfig, OPTConfig, PegasusConfig, RegNetConfig, RemBertConfig, ResNetConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, SamConfig, SegformerConfig, Speech2TextConfig, SwiftFormerConfig, SwinConfig, T5Config, TapasConfig, TransfoXLConfig, VisionTextDualEncoderConfig, ViTConfig, ViTMAEConfig, Wav2Vec2Config, WhisperConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.

"

Can you help me out, please? :)
Thank you
