stefan-it (Stefan)

upvoted a paper 11 days ago

Learn it or Leave it: Module Composition and Pruning for Continual Learning

Paper • 2406.18708 • Published 12 days ago • 1

upvoted a paper 13 days ago

Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation

Paper • 2406.16678 • Published 15 days ago • 12

upvoted a paper 27 days ago

AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer Contracts

Paper • 2406.06809 • Published 28 days ago • 1

upvoted 2 papers about 1 month ago

NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition

Paper • 2405.07609 • Published May 13 • 1

Zyda: A 1.3T Dataset for Open Language Modeling

Paper • 2406.01981 • Published Jun 4 • 2

upvoted an article about 1 month ago

Announcing Occiglot-Fineweb

Article • Jun 4 • 5

upvoted 2 papers about 1 month ago

Joint Lemmatization and Morphological Tagging with LEMMING

Paper • 2405.18308 • Published May 28 • 1

GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction

Paper • 2405.15760 • Published May 24 • 1

upvoted 3 papers about 2 months ago

upvoted a paper 2 months ago

HistNERo: Historical Named Entity Recognition for the Romanian Language

Paper • 2405.00155 • Published Apr 30 • 4

upvoted 13 papers 3 months ago

SpaceByte: Towards Deleting Tokenization from Large Language Modeling

Paper • 2404.14408 • Published Apr 22 • 6

Investigating Gender Bias in Turkish Language Models

Paper • 2404.11726 • Published Apr 17 • 1

Fewer Truncations Improve Language Modeling

Paper • 2404.10830 • Published Apr 16 • 2

Token Dropping for Efficient BERT Pretraining

Paper • 2203.13240 • Published Mar 24, 2022 • 2

Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset

Paper • 2403.19559 • Published Mar 28 • 1

Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding

Paper • 2404.05694 • Published Apr 8 • 2

BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models

Paper • 2404.04113 • Published Apr 5 • 3

Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds

Paper • 2404.04031 • Published Apr 5 • 1

Tokenizer Choice For LLM Training: Negligible or Crucial?

Paper • 2310.08754 • Published Oct 12, 2023 • 2

Understanding Back-Translation at Scale

Paper • 1808.09381 • Published Aug 28, 2018 • 1

Revisiting subword tokenization: A case study on affixal negation in large language models

Paper • 2404.02421 • Published Apr 3 • 1

Cross-lingual Named Entity Corpus for Slavic Languages

Paper • 2404.00482 • Published Mar 30 • 3

Fundus: A Simple-to-Use News Scraper Optimized for High Quality Extractions

Paper • 2403.15279 • Published Mar 22 • 1

upvoted 5 papers 4 months ago

CO-Fun: A German Dataset on Company Outsourcing in Fund Prospectuses for Named Entity Recognition and Relation Extraction

Paper • 2403.15322 • Published Mar 22 • 1

MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank

Paper • 2403.10293 • Published Mar 15 • 1

Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages

Paper • 2403.08693 • Published Mar 13 • 1

MaiBaam Annotation Guidelines

Paper • 2403.05902 • Published Mar 9 • 1

Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models

Paper • 2402.18397 • Published Feb 28 • 1

upvoted a collection 4 months ago

LiT5

Collection

Linguistically-Informed T5 models from the LREC-COLING paper "Linguistic Knowledge Can Enhance Encoder-Decoder Models (If You Let It)". • 6 items • Updated Feb 28 • 2

upvoted 9 papers 5 months ago

SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages

Paper • 2402.08638 • Published Feb 13 • 1

Pixel Sentence Representation Learning

Paper • 2402.08183 • Published Feb 13 • 2

Fractal Patterns May Unravel the Intelligence in Next-Token Prediction

Paper • 2402.01825 • Published Feb 2 • 2

Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks

Paper • 2401.17396 • Published Jan 30 • 1

SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 23

ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks

Paper • 2401.16589 • Published Jan 29 • 1

DrBERT: Unveiling the Potential of Masked Language Modeling Decoder in BERT pretraining

Paper • 2401.15861 • Published Jan 29 • 1

Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation

Paper • 2305.18893 • Published May 30, 2023 • 2

TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation

Paper • 2401.14373 • Published Jan 25 • 11

upvoted 12 papers 6 months ago

SpacTor-T5: Pre-training T5 Models with Span Corruption and Replaced Token Detection

Paper • 2401.13160 • Published Jan 24 • 9

LangBridge: Multilingual Reasoning Without Multilingual Supervision

Paper • 2401.10695 • Published Jan 19 • 4

Headless Language Models: Learning without Predicting with Contrastive Weight Tying

Paper • 2309.08351 • Published Sep 15, 2023 • 3

Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions

Paper • 2207.14251 • Published Jul 28, 2022 • 1

Cross-lingual Editing in Multilingual Language Models

Paper • 2401.10521 • Published Jan 19 • 2

Mission: Impossible Language Models

Paper • 2401.06416 • Published Jan 12 • 3

RoBERTurk: Adjusting RoBERTa for Turkish

Paper • 2401.03515 • Published Jan 7 • 1

PIXAR: Auto-Regressive Language Modeling in Pixel Space

Paper • 2401.03321 • Published Jan 6 • 1

German Text Embedding Clustering Benchmark

Paper • 2401.02709 • Published Jan 5 • 5

MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining

Paper • 2312.17482 • Published Dec 29, 2023 • 1

Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers

Paper • 2312.16291 • Published Dec 26, 2023 • 1

Language Resources for Dutch Large Language Modelling

Paper • 2312.12852 • Published Dec 20, 2023 • 9

upvoted 8 papers 7 months ago

WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models

Paper • 2112.06598 • Published Dec 13, 2021 • 1

PromptBench: A Unified Library for Evaluation of Large Language Models

Paper • 2312.07910 • Published Dec 13, 2023 • 14

On Meta-Prompting

Paper • 2312.06562 • Published Dec 11, 2023 • 1

Aligner: One Global Token is Worth Millions of Parameters When Aligning Large Language Models

Paper • 2312.05503 • Published Dec 9, 2023 • 1

Gated Linear Attention Transformers with Hardware-Efficient Training

Paper • 2312.06635 • Published Dec 11, 2023 • 3

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

Paper • 2312.04032 • Published Dec 7, 2023 • 1

Advancing State of the Art in Language Modeling

Paper • 2312.03735 • Published Nov 28, 2023 • 1

SmoothQuant+: Accurate and Efficient 4-bit Post-Training WeightQuantization for LLM

Paper • 2312.03788 • Published Dec 6, 2023 • 1

Articles

Fine-tune Flair Models on NER Dataset with 🤗 AutoTrain SpaceRunner
