Feynman Innovations

ajibawa-2023

AjinkyaBawase

AI & ML interests

LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.

ajibawa-2023's activity

upvoted a paper 3 days ago

Were RNNs All We Needed?

Paper • 2410.01201 • Published 5 days ago • 27

upvoted a paper 6 days ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published 11 days ago • 92

upvoted a collection 6 days ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated 11 days ago • 221

upvoted a collection 18 days ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 19 days ago • 202

upvoted a paper 24 days ago

MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications

Paper • 2409.07314 • Published 26 days ago • 50

upvoted a paper about 1 month ago

τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Paper • 2406.12045 • Published Jun 17 • 5

upvoted a paper about 2 months ago

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 33

upvoted a collection about 2 months ago

GLiNER bi-encoders

Bi-encoder and poly-encoder architectures of GLiNER • 5 items • Updated 27 days ago • 12

upvoted 8 papers about 2 months ago

Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28 • 94

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Paper • 2407.01370 • Published Jul 1 • 85

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7 • 54

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 85

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67

ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities

Paper • 2408.04682 • Published Aug 8 • 14

Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM

Paper • 2408.07246 • Published Aug 14 • 19

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12 • 62

upvoted an article 2 months ago

Introducing TextImage Augmentation for Document Images

Article • Aug 6 • 29

upvoted a collection 4 months ago

Granite Code Models

A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated Aug 30 • 164

upvoted a paper 6 months ago

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Paper • 2402.09844 • Published Feb 15 • 20

upvoted an article 6 months ago

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Article • Apr 22 • 78

upvoted a paper 6 months ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 251

upvoted 2 articles 6 months ago

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Article • Apr 15 • 161

DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive

Article • Apr 9 • 29

upvoted 4 papers 7 months ago

Uni-SMART: Universal Science Multimodal Analysis and Research Transformer

Paper • 2403.10301 • Published Mar 15 • 51

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14 • 54

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 124

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11 • 90

upvoted 3 papers 8 months ago

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 29

Lumos : Empowering Multimodal LLMs with Scene Text Recognition

Paper • 2402.08017 • Published Feb 12 • 24

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5 • 67

upvoted 2 papers 9 months ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 142

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Paper • 2312.17120 • Published Dec 28, 2023 • 25

upvoted 6 papers 10 months ago

Large Language Models for Mathematicians

Paper • 2312.04556 • Published Dec 7, 2023 • 11

Beyond Surface: Probing LLaMA Across Scales and Layers

Paper • 2312.04333 • Published Dec 7, 2023 • 18

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 29

Pearl: A Production-ready Reinforcement Learning Agent

Paper • 2312.03814 • Published Dec 6, 2023 • 14

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Paper • 2312.04724 • Published Dec 7, 2023 • 20

SparQ Attention: Bandwidth-Efficient LLM Inference

Paper • 2312.04985 • Published Dec 8, 2023 • 38

upvoted 2 papers about 1 year ago

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 18

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82