Malikeh Ehghaghi

Malikeh1375

Malikeh5

AI & ML interests

Machine Learning for Healthcare, Responsible AI, Multimodal Learning, Natural Language Processing

Malikeh1375's activity

upvoted a collection 11 days ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 13 items • Updated 12 days ago • 45

upvoted a paper 11 days ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 17 days ago • 128

upvoted a collection 18 days ago

Qwen2.5

Qwen2.5 language models, with pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 18 days ago • 226

upvoted a collection 20 days ago

Korean Datasets I've released so far.

This is a collection of the Korean datasets I have uploaded so far. • 8 items • Updated May 24 • 16

upvoted a collection about 1 month ago

Arabic Light Benchmarks

10% sample of the original benchmarks for each dataset from lighteval • 7 items • Updated 26 days ago • 2

upvoted an article about 2 months ago

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

Aug 19

• 72

upvoted a collection about 2 months ago

Arabic ORPO-DPO Datasets

12 items • Updated Aug 17 • 2

upvoted 3 papers about 2 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 43

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 154

BOND: Aligning LLMs with Best-of-N Distillation

Paper • 2407.14622 • Published Jul 19 • 17

upvoted 2 collections about 2 months ago

Top 10% instruction tuning datasets

Collects datasets with 'instruction' in the name, more than 1 download, and a like count in the top 10% • 13 items • Updated Jul 3 • 7

Probably function calling datasets

Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17 • 35

upvoted 5 papers about 2 months ago

Better Alignment with Instruction Back-and-Forth Translation

Paper • 2408.04614 • Published Aug 8 • 14

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

Paper • 2408.03361 • Published Aug 6 • 85

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8 • 154

Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew

Paper • 2309.14568 • Published Sep 25, 2023 • 4

Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities

Paper • 2407.07080 • Published Jul 9 • 21

upvoted a paper 2 months ago

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Paper • 2408.02900 • Published Aug 6 • 25

upvoted an article 2 months ago

Article

Introducing the Open Leaderboard for Hebrew LLMs!

May 5

• 32

upvoted a paper 3 months ago

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Paper • 2309.03883 • Published Sep 7, 2023 • 33

upvoted a paper 4 months ago

MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

Paper • 2308.14089 • Published Aug 27, 2023 • 28

upvoted an article 4 months ago

Article

Introducing the Open Arabic LLM Leaderboard

May 14

• 64

upvoted a collection 6 months ago

Llama 3 Merges

Here is a collection of merged models based on Llama-3 variants to showcase the seamless compatibility of MergeKit with the Llama-3 architecture. • 6 items • Updated 16 days ago • 4

upvoted a paper 7 months ago

Arcee's MergeKit: A Toolkit for Merging Large Language Models

Paper • 2403.13257 • Published Mar 20 • 20

upvoted a collection 7 months ago

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 212