mohammedbriman (Mohammed Brıman)

upvoted a paper 2 days ago

Were RNNs All We Needed?

Paper • 2410.01201 • Published 5 days ago • 26

upvoted 2 papers 10 days ago

Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely

Paper • 2409.14924 • Published 13 days ago • 1

Automatic Metrics in Natural Language Generation: A Survey of Current Evaluation Practices

Paper • 2408.09169 • Published Aug 17 • 1

upvoted a paper 11 days ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published 11 days ago • 92

upvoted a paper 18 days ago

One missing piece in Vision and Language: A Survey on Comics Understanding

Paper • 2409.09502 • Published 22 days ago • 23

upvoted 2 papers 19 days ago

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published 19 days ago • 67

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published 20 days ago • 33

upvoted a paper 25 days ago

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

Paper • 2409.02889 • Published Sep 4 • 54

upvoted 2 papers 26 days ago

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

Paper • 2409.05591 • Published 27 days ago • 26

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Paper • 2409.05840 • Published 27 days ago • 45

upvoted 2 papers about 1 month ago

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published Sep 4 • 27

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3 • 77

upvoted a collection about 1 month ago

Papers I want to read

Collection

Papers in my to-read list • 240 items • Updated about 3 hours ago • 21

upvoted 2 papers about 1 month ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27 • 138

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 111

upvoted a collection about 2 months ago

Jamba-1.5

Collection

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22 • 75

upvoted 3 papers about 2 months ago

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 40

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15 • 51

Banishing LLM Hallucinations Requires Rethinking Generalization

Paper • 2406.17642 • Published Jun 25 • 1

upvoted a collection about 2 months ago

AI Paper of the Day

Collection

A collection of papers that I think are interesting, one added each day • 182 items • Updated 1 day ago • 24

upvoted a paper about 2 months ago

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8 • 154

upvoted 3 papers 2 months ago

upvoted an article 2 months ago

Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 👐 📚

Article • Jul 10 • 32

upvoted 3 papers 3 months ago

On the Limitations of Compute Thresholds as a Governance Strategy

Paper • 2407.05694 • Published Jul 8 • 2

NNsight and NDIF: Democratizing Access to Foundation Model Internals

Paper • 2407.14561 • Published Jul 18 • 34

Knowledge Mechanisms in Large Language Models: A Survey and Perspective

Paper • 2407.15017 • Published Jul 22 • 33

upvoted a collection 3 months ago

🪐 SmolLM

Collection

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 174

upvoted an article 3 months ago

SmolLM - blazingly fast and remarkably powerful

Article • Jul 16 • 244

upvoted 10 papers 3 months ago

The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism

Paper • 2407.10457 • Published Jul 15 • 22

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 154

FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation

Paper • 2407.07093 • Published Jul 9 • 1

Lost in the Middle: How Language Models Use Long Contexts

Paper • 2307.03172 • Published Jul 6, 2023 • 35

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Paper • 2407.04620 • Published Jul 5 • 27

How Does Quantization Affect Multilingual LLMs?

Paper • 2407.03211 • Published Jul 3 • 1

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

Paper • 2406.16838 • Published Jun 24 • 2

Preference Tuning For Toxicity Mitigation Generalizes Across Languages

Paper • 2406.16235 • Published Jun 23 • 12

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21 • 60

upvoted 12 papers 4 months ago

mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus

Paper • 2406.08707 • Published Jun 13 • 15

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Paper • 2406.07522 • Published Jun 11 • 36

Transformers meet Neural Algorithmic Reasoners

Paper • 2406.09308 • Published Jun 13 • 43

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Paper • 2406.08973 • Published Jun 13 • 85

Scalable MatMul-free Language Modeling

Paper • 2406.02528 • Published Jun 4 • 8

σ-GPTs: A New Approach to Autoregressive Models

Paper • 2404.09562 • Published Apr 15 • 4

Dense Connector for MLLMs

Paper • 2405.13800 • Published May 22 • 21

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 85

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Paper • 2404.19752 • Published Apr 30 • 22

Aya 23: Open Weight Releases to Further Multilingual Progress

Paper • 2405.15032 • Published May 23 • 26

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

Paper • 2405.15738 • Published May 24 • 43

Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24 • 53

upvoted 6 papers 5 months ago

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21 • 28

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 45

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 87

SUTRA: Scalable Multilingual Language Model Architecture

Paper • 2405.06694 • Published May 7 • 37

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3 • 98

You Only Cache Once: Decoder-Decoder Architectures for Language Models

Paper • 2405.05254 • Published May 8 • 8

upvoted an article 5 months ago

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Article • Jun 23 • 33

upvoted a paper 5 months ago

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29 • 118
