ytu-ce-cosmos
/

Turkish-LLaVA-v0.1

Image-Text-to-Text

Model card Files Files and versions Community

Turkish-LLaVA-v0.1 / README.md

erdem-erdem's picture

Create README.md

78c910e verified about 1 month ago

|

No virus

2.43 kB

	---
	license: mit
	language:
	- tr
	pipeline_tag: image-text-to-text
	tags:
	- Turkish
	- turkish
	- LLaVA
	datasets:
	- liuhaotian/LLaVA-CC3M-Pretrain-595K
	---

	<img src="./CosmosLLaVA.png"/>

	# Llava-CosmosLlama

	This is a Turkish visual language model designed for multi-modal visual instruction-following tasks. It utilizes the LLaVA (Large Language and Vision Assistant) architecture, integrating the `ytucosmos/Turkish-Llama-8b-Instruct-v0.1` language model. The model is capable of processing both visual (image) and textual inputs, allowing it to understand and execute instructions provided in Turkish.

	# Model Details
	The model was pretrained with a translated version of the [LLaVA-CC3M-Pretrain-595K](https://huggingface.co/datasets/liuhaotian/LLaVA-CC3M-Pretrain-595K) dataset.<br>
	It was further fine-tuned using subsets the following datasets to enhance its visual reasoning and understanding capabilities:
	- [Stanford GQA](https://cs.stanford.edu/people/dorarad/gqa/about.html)
	- [VisualGenome](https://homes.cs.washington.edu/~ranjay/visualgenome/index.html)
	- [COCO](https://cocodataset.org/#home)

	## Example Usage
	```python
	from lmdeploy import pipeline, ChatTemplateConfig
	from lmdeploy.vl import load_image

	pipe = pipeline("ytu-ce-cosmos/model_name",
	chat_template_config=ChatTemplateConfig(model_name='llama3'))

	url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/idefics-im-captioning.jpg"
	image = load_image(url)

	response = pipe(('Bu resimde öne çıkan ögeler nelerdir?', image))

	print(response)
	"""
	Resimde, çiçeklerle dolu bir bahçede yavru bir köpek ve arka planda bir ağaç yer alıyor. Köpek, çiçeklerin arasında otururken ve etrafını saran çiçeklerin arasından bakarken görülebiliyor. Bu sahne, köpeğin bahçede geçirdiği zamanın tadını çıkardığı ve çevresini keşfettiği sakin ve huzurlu bir atmosferi yansıtıyor.
	"""
	```

	# Acknowledgments
	- Computing resources used in this work were provided by the National Center for High Performance Computing of Turkey (UHeM).
	- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗


	# Citation
	```bibtex
	...
	```

	### Contact
	COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department <br>
	https://cosmos.yildiz.edu.tr/ <br>
	cosmos@yildiz.edu.tr <br>