fdsqefsgergd

T-representer

AI & ML interests

None yet

Organizations

None yet

T-representer's activity

upvoted 18 papers about 12 hours ago

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16 • 29

Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16 • 20

Towards A Better Metric for Text-to-Video Generation

Paper • 2401.07781 • Published Jan 15 • 14

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11 • 39

TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10 • 63

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Paper • 2401.05566 • Published Jan 10 • 25

Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models

Paper • 2401.06102 • Published Jan 11 • 19

Towards Conversational Diagnostic AI

Paper • 2401.05654 • Published Jan 11 • 14

Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk

Paper • 2401.05033 • Published Jan 10 • 15

Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

Paper • 2401.02994 • Published Jan 4 • 46

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8 • 69

Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon

Paper • 2401.03462 • Published Jan 7 • 26

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

Paper • 2401.04092 • Published Jan 8 • 20

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5 • 39

Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Paper • 2401.02669 • Published Jan 5 • 14

Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Paper • 2401.02117 • Published Jan 4 • 26

LLM Augmented LLMs: Expanding Capabilities through Composition

Paper • 2401.02412 • Published Jan 4 • 36

LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4 • 14

upvoted 12 papers 2 days ago

GPT-4V(ision) is a Generalist Web Agent, if Grounded

Paper • 2401.01614 • Published Jan 3 • 20

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

Paper • 2401.01827 • Published Jan 3 • 14

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Paper • 2401.01854 • Published Jan 3 • 10

CoMoSVC: Consistency Model-based Singing Voice Conversion

Paper • 2401.01792 • Published Jan 3 • 8

LLaMA Beyond English: An Empirical Study on Language Capability Transfer

Paper • 2401.01055 • Published Jan 2 • 52

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 62

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2 • 26

VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM

Paper • 2401.01256 • Published Jan 2 • 19

Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

Paper • 2401.01117 • Published Jan 2 • 8

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws

Paper • 2401.00448 • Published Dec 31, 2023 • 27

Boosting Large Language Model for Speech Synthesis: An Empirical Study

Paper • 2401.00246 • Published Dec 30, 2023 • 9

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Paper • 2312.17681 • Published Dec 29, 2023 • 18

upvoted 14 papers 3 days ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published 4 days ago • 37

Scaling Retrieval-Based Language Models with a Trillion-Token Datastore

Paper • 2407.12854 • Published 13 days ago • 26

Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion

Paper • 2407.13759 • Published 4 days ago • 12

Understanding Reference Policies in Direct Preference Optimization

Paper • 2407.13709 • Published 4 days ago • 11

Shape of Motion: 4D Reconstruction from a Single Video

Paper • 2407.13764 • Published 3 days ago • 14

Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections

Paper • 2407.12306 • Published 5 days ago • 5

Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models

Paper • 2407.12327 • Published 5 days ago • 61

AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases

Paper • 2407.12784 • Published 4 days ago • 43

E5-V: Universal Embeddings with Multimodal Large Language Models

Paper • 2407.12580 • Published 5 days ago • 31

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control

Paper • 2407.12781 • Published 4 days ago • 10

IMAGDressing-v1: Customizable Virtual Dressing

Paper • 2407.12705 • Published 5 days ago • 6

Audio Conditioning for Music Generation via Discrete Bottleneck Features

Paper • 2407.12563 • Published 5 days ago • 5

AUITestAgent: Automatic Requirements Oriented GUI Function Testing

Paper • 2407.09018 • Published 10 days ago • 4

QuIP: 2-Bit Quantization of Large Language Models With Guarantees

Paper • 2307.13304 • Published Jul 25, 2023 • 2

upvoted 5 papers 4 days ago

Prompt Expansion for Adaptive Text-to-Image Generation

Paper • 2312.16720 • Published Dec 27, 2023 • 5

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

Paper • 2312.16272 • Published Dec 26, 2023 • 6

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

Paper • 2312.16256 • Published Dec 26, 2023 • 15

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Paper • 2312.16862 • Published Dec 28, 2023 • 29

DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation

Paper • 2407.11394 • Published 6 days ago • 10

upvoted 11 papers 5 days ago

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

Paper • 2407.10957 • Published 7 days ago • 23

Efficient Training with Denoised Neural Weights

Paper • 2407.11966 • Published 5 days ago • 7

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

Paper • 2407.11398 • Published 6 days ago • 7

OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces

Paper • 2407.11895 • Published 6 days ago • 7

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients

Paper • 2407.11239 • Published 6 days ago • 5

Scaling Diffusion Transformers to 16 Billion Parameters

Paper • 2407.11633 • Published 6 days ago • 21

Qwen2-Audio Technical Report

Paper • 2407.10759 • Published 7 days ago • 29

Exploiting Novel GPT-4 APIs

Paper • 2312.14302 • Published Dec 21, 2023 • 12

ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors

Paper • 2312.13324 • Published Dec 20, 2023 • 9

DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 24

Distilling System 2 into System 1

Paper • 2407.06023 • Published 14 days ago • 2