MobileNetV4 pretrained weights Collection Weights for MobileNet-V4 pretrained in timm • 13 items • Updated 14 days ago • 9
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels Paper • 2406.09415 • Published 26 days ago • 47
The best SDXL models Collection The best diffusion models (checkpoints) based on SDXL • 7 items • Updated 29 days ago • 4
The best SD1.5 models Collection The best diffusion models (checkpoints) based on SD1.5 • 6 items • Updated Jun 6 • 3
An Image is Worth 32 Tokens for Reconstruction and Generation Paper • 2406.07550 • Published 28 days ago • 53
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29 • 43
view article Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 116
view article Article Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens tokens and 11 languages May 24 • 20
view article Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 16 days ago • 48
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated May 31 • 362
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare Apr 19 • 82
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies Paper • 2404.08197 • Published Apr 12 • 26
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text Paper • 2403.18421 • Published Mar 27 • 21
FeatUp: A Model-Agnostic Framework for Features at Any Resolution Paper • 2403.10516 • Published Mar 15 • 16
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training? Paper • 2402.01832 • Published Feb 2 • 4
Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Paper • 2403.07750 • Published Mar 12 • 19
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27 • 88
Sora Reference Papers Collection A collection of all papers referenced in OpenAI's "Video generation models as world simulators" technical report • openai.com/sora • 30 items • Updated Feb 20 • 51
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset Paper • 2402.05937 • Published Feb 8 • 8
CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Paper • 2401.12208 • Published Jan 22 • 20
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16 • 35
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 132
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs Paper • 2311.13600 • Published Nov 22, 2023 • 41
Towards Accurate Differential Diagnosis with Large Language Models Paper • 2312.00164 • Published Nov 30, 2023 • 8
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 19
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 44
Nougat: Neural Optical Understanding for Academic Documents Paper • 2308.13418 • Published Aug 25, 2023 • 33
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 30
Dynamic-Resolution Model Learning for Object Pile Manipulation Paper • 2306.16700 • Published Jun 29, 2023 • 5