Tried SadTalker, but it takes too much time. D-ID is proprietary, and I'm looking for something open source. I also tried Wav2Lip and enhanced its output with GFPGAN; the output is good, but I want something faster.
Akhil B
hakunamatata1997
AI & ML interests
Gen AI, NLP, Computer Vision, XAI
Organizations
hakunamatata1997's activity
replied to their post 4 months ago
replied to their post 5 months ago
Yeah, I tried QwenVL; it's poor at understanding text. QwenVL-Plus and Max are good but not open source.
replied to their post 5 months ago
@merve More specifically, I mean something that understands text in images well enough for the VLM's responses to be accurate.
replied to their post 5 months ago
Has anyone researched frameworks or tools that are currently being used to build agents for production? I've been doing some research, but most of them are not suitable for production.