What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Noise-free Text-Image Corruption and Evaluation Paper • 2406.16320 • Published 15 days ago • 1
Article Financial Analysis with LangChain and CrewAI Agents By herooooooooo • 9 days ago • 4
The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models Paper • 2406.19999 • Published 11 days ago • 3
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting Paper • 2406.00053 • Published May 28 • 1
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs Paper • 2406.20086 • Published 11 days ago • 3
From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP Paper • 2406.12618 • Published 21 days ago • 5
Multi-property Steering of Large Language Models with Dynamic Activation Composition Paper • 2406.17563 • Published 14 days ago • 4
Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation Paper • 2406.13663 • Published 20 days ago • 7
Estimating Knowledge in Large Language Models Without Generating a Single Token Paper • 2406.12673 • Published 21 days ago • 7
Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Paper • 2406.09519 • Published 26 days ago • 1
Modello Italia - iGenius Collection An unofficial collection of Italian LLMs developed by iGenius. • 2 items • Updated Jun 7 • 6
IrokoBench Collection A human-translated benchmark dataset for 16 African languages covering three tasks: NLI, MMLU, and MGSM • 6 items • Updated May 31 • 15
Calibrating Reasoning in Language Models with Internal Consistency Paper • 2405.18711 • Published May 29 • 6
Emergence of a High-Dimensional Abstraction Phase in Language Transformers Paper • 2405.15471 • Published May 24 • 2
Learned feature representations are biased by complexity, learning order, position, and more Paper • 2405.05847 • Published May 9 • 2
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations Paper • 2402.17700 • Published Feb 27 • 1
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models Paper • 2405.12522 • Published May 21 • 2
Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning Paper • 2405.12241 • Published May 17 • 1
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks Paper • 2405.10928 • Published May 17 • 1
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability Paper • 2405.10927 • Published May 17 • 3
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control Paper • 2405.08366 • Published May 14 • 2
IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation Paper • 2203.03759 • Published Mar 7, 2022 • 5
Wikimedia Datasets Collection Wikimedia datasets across languages and modalities, from different Wikimedia projects, on the Hub. Not all tested. • 19 items • Updated May 16 • 9
A Primer on the Inner Workings of Transformer-based Language Models Paper • 2405.00208 • Published Apr 30 • 10
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models Paper • 2404.07004 • Published Apr 10 • 5
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 86
Zeroshot Classifiers Collection These are my current best zero-shot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 85
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1 • 14
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models Paper • 2403.19647 • Published Mar 28 • 3
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms Paper • 2403.17806 • Published Mar 26 • 3
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers Paper • 2310.03686 • Published Oct 5, 2023 • 3
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Paper • 2309.03883 • Published Sep 7, 2023 • 14
Information Flow Routes: Automatically Interpreting Language Models at Scale Paper • 2403.00824 • Published Feb 27 • 3
AtP*: An efficient and scalable method for localizing LLM behaviour to components Paper • 2403.00745 • Published Mar 1 • 8
LiT5 Collection Linguistically-Informed T5 models from the LREC-COLING paper "Linguistic Knowledge Can Enhance Encoder-Decoder Models (If You Let It)". • 6 items • Updated Feb 28 • 2
CausalGym: Benchmarking causal interpretability methods on linguistic tasks Paper • 2402.12560 • Published Feb 19 • 3
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking Paper • 2402.14811 • Published Feb 22 • 4
Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation Paper • 2402.13331 • Published Feb 20 • 2
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space Paper • 2402.12865 • Published Feb 20 • 1
In-Context Learning Demonstration Selection via Influence Analysis Paper • 2402.11750 • Published Feb 19 • 2
⛔️🔦 Provenance, Watermarking & Deepfake Detection Collection Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 37
Recovering the Pre-Fine-Tuning Weights of Generative Models Paper • 2402.10208 • Published Feb 15 • 7
SyntaxShap: Syntax-aware Explainability Method for Text Generation Paper • 2402.09259 • Published Feb 14 • 2