Daniel Huynh PRO

dhuynh95

Articles

Open-source embeddings and LLMs outperform Gemini and OpenAI for Web Navigation while being faster and cheaper

Jun 21

• 6

Automatic Hallucination detection with SelfCheckGPT NLI

Nov 27, 2023

• 4

StarCoder Memorization Experiment Highlights Privacy Risks of Fine-Tuning On Code

Nov 2, 2023

Introducing BlindChat, an open-source and privacy-by-design Conversational AI fully in-browser

Sep 22, 2023

• 1

AI Total Cost of Ownership Calculator: Evaluate the cost of in-house AI deployment vs AI APIs

Sep 20, 2023

• 1

dhuynh95's activity

upvoted a paper about 13 hours ago

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published 1 day ago • 21

upvoted a paper 2 days ago

Law of the Weakest Link: Cross Capabilities of Large Language Models

Paper • 2409.19951 • Published 5 days ago • 46

upvoted a paper 3 days ago

Can Models Learn Skill Composition from Examples?

Paper • 2409.19808 • Published 5 days ago • 6

upvoted a paper 8 days ago

Attention Prompting on Image for Large Vision-Language Models

Paper • 2409.17143 • Published 9 days ago • 5

upvoted 2 papers 16 days ago

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published 16 days ago • 35

A Controlled Study on Long Context Extension and Generalization in LLMs

Paper • 2409.12181 • Published 16 days ago • 43

upvoted a paper 24 days ago

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published 29 days ago • 85

upvoted 3 papers about 1 month ago

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Paper • 2408.13257 • Published Aug 23 • 25

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 110

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 40

upvoted 2 papers about 2 months ago

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models

Paper • 2408.06663 • Published Aug 13 • 15

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 33

upvoted 4 papers 2 months ago

OmniParser for Pure Vision Based GUI Agent

Paper • 2408.00203 • Published Aug 1 • 17

Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models

Paper • 2407.19474 • Published Jul 28 • 22

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Paper • 2407.20183 • Published Jul 29 • 37

AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?

Paper • 2407.15711 • Published Jul 22 • 9

upvoted 5 papers 3 months ago

InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct

Paper • 2407.05700 • Published Jul 8 • 9

Training Task Experts through Retrieval Based Distillation

Paper • 2407.05463 • Published Jul 7 • 6

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

Paper • 2407.03321 • Published Jul 3 • 15

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Paper • 2407.01370 • Published Jul 1 • 84

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale

Paper • 2406.19280 • Published Jun 27 • 59

upvoted 4 papers 4 months ago

upvoted an article 5 months ago

Article

Improving Prompt Consistency with Structured Generations

Apr 30

• 53

upvoted a paper 5 months ago

Flamingo: a Visual Language Model for Few-Shot Learning

Paper • 2204.14198 • Published Apr 29, 2022 • 14

upvoted 19 papers 6 months ago

Compression Represents Intelligence Linearly

Paper • 2404.09937 • Published Apr 15 • 27

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 53

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

Paper • 2404.02575 • Published Apr 3 • 47

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4 • 15

Training LLMs over Neurally Compressed Text

Paper • 2404.03626 • Published Apr 4 • 21

Long-context LLMs Struggle with Long In-context Learning

Paper • 2404.02060 • Published Apr 2 • 34

Poro 34B and the Blessing of Multilinguality

Paper • 2404.01856 • Published Apr 2 • 12

Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2 • 56

BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

Paper • 2403.18421 • Published Mar 27 • 21

Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27 • 23

The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26 • 77

Can large language models explore in-context?

Paper • 2403.15371 • Published Mar 22 • 31

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions

Paper • 2403.15246 • Published Mar 22 • 8

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21 • 50

When Do We Not Need Larger Vision Models?

Paper • 2403.13043 • Published Mar 19 • 25

Reverse Training to Nurse the Reversal Curse

Paper • 2403.13799 • Published Mar 20 • 13

RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15 • 66

Larimar: Large Language Models with Episodic Memory Control

Paper • 2403.11901 • Published Mar 18 • 31

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19 • 24

upvoted 6 papers 7 months ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182

When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

Paper • 2402.17193 • Published Feb 27 • 23

Do Large Language Models Latently Perform Multi-Hop Reasoning?

Paper • 2402.16837 • Published Feb 26 • 24

Watermarking Makes Language Models Radioactive

Paper • 2402.14904 • Published Feb 22 • 22

Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29 • 49

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22 • 82

upvoted 3 papers 8 months ago

Reformatted Alignment

Paper • 2402.12219 • Published Feb 19 • 15

FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models

Paper • 2402.10986 • Published Feb 16 • 76

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 98

upvoted a collection 8 months ago

LLM Hallucination Detection Papers

Collection

Collection of LLM hallucination and evaluation papers that I've been exploring and implementing. Some of them have my comments and annotated doodles. • 12 items • Updated Feb 20 • 12

upvoted a paper 8 months ago

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

Paper • 2303.08896 • Published Mar 15, 2023 • 4

upvoted 3 papers 11 months ago

Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 28

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Paper • 2311.12022 • Published Nov 20, 2023 • 25

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 70