😎 KoChatBART

BART (Bidirectional and Auto-Regressive Transformers) is trained as an autoencoder: noise is added to part of the input text, and the model learns to reconstruct the original. Korean Chat BART (hereafter KoChatBART) is a Korean encoder-decoder language model trained on more than roughly 10GB of Korean dialogue text, using the Text Infilling noise function from the BART paper. We release the resulting model, KoChatBART-base, which is robust at dialogue generation.
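To make the Text Infilling objective concrete, here is a minimal sketch of the noise function on a list of tokens. It follows the description in the BART paper: span lengths are drawn from a Poisson distribution, each span is replaced by a single mask token, and a zero-length span inserts a mask. The 30% mask ratio and λ = 3 are the BART paper's settings; whether KoChatBART used the same values, and the `<mask>` token string, are assumptions, and the function names are illustrative only.

```python
import math
import random

MASK = "<mask>"  # assumed mask token string, for illustration only


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def text_infilling(tokens, mask_ratio=0.3, poisson_lambda=3.0, seed=0):
    """Replace random spans with a single MASK each (sketch of BART-style infilling)."""
    rng = random.Random(seed)
    n = len(tokens)
    budget = round(n * mask_ratio)  # how many original tokens to hide
    span_at = {}                    # span start index -> span length
    covered = set()                 # indices already hidden by some span
    attempts = 0
    while budget > 0 and attempts < 100 * n:
        attempts += 1
        length = min(sample_poisson(poisson_lambda, rng), budget)
        start = rng.randrange(n - max(length, 1) + 1)
        span = range(start, start + length)
        if start in span_at or start in covered or any(i in covered for i in span):
            continue  # overlaps an earlier span, so try again
        span_at[start] = length
        covered.update(span)
        budget -= length

    out, i = [], 0
    while i < n:
        length = span_at.get(i)
        if length is None:          # untouched token
            out.append(tokens[i])
            i += 1
        elif length == 0:           # zero-length span: insert a mask, keep the token
            out.extend([MASK, tokens[i]])
            i += 1
        else:                       # the whole span collapses into one mask
            out.append(MASK)
            i += length
    return out


if __name__ == "__main__":
    print(text_infilling("오늘 날씨 가 정말 좋아서 산책 을 하고 싶어 요".split()))
```

Because a whole span collapses into one mask, the model has to predict how many tokens are missing as well as what they are.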

Quick tour

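```python
from transformers import AutoTokenizer, BartForConditionalGeneration

# Load the pretrained tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("BM-K/KoChatBART")
model = BartForConditionalGeneration.from_pretrained("BM-K/KoChatBART")

# "안녕 세상아!" means "Hello, world!"
inputs = tokenizer("안녕 세상아!", return_tensors="pt")
# Forward pass; the output object exposes the decoder logits as outputs.logits.
outputs = model(**inputs)
```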

Pre-training data preprocessing

Datasets used

To train KoChatBART, we preprocessed several Korean dialogue datasets and merged them into a large Korean dialogue corpus.

  1. To reduce redundancy in the data, expressions repeated more than twice, such as 'ㅋㅋㅋㅋㅋㅋ', were shortened to two repetitions, such as 'ㅋㅋ'.
  2. Because very short examples can hinder training, only examples whose total token length exceeds 3 under the KoBART tokenizer were kept.
  3. Pseudonymized data was removed.
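The three preprocessing steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the regex collapses only single-character repeats, the default `tokenize` is a whitespace split standing in for the KoBART tokenizer, and `is_pseudonymized` is a hypothetical predicate, since the card does not say how pseudonymized examples were detected.

```python
import re
from typing import Callable, Iterable, List

# One character followed by two or more copies of itself (3+ repeats in total).
_REPEAT = re.compile(r"(.)\1{2,}")


def collapse_repeats(text: str) -> str:
    """Step 1: shorten runs such as 'ㅋㅋㅋㅋㅋㅋ' to two repetitions ('ㅋㅋ')."""
    return _REPEAT.sub(r"\1\1", text)


def preprocess(
    utterances: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,           # stand-in for the KoBART tokenizer
    is_pseudonymized: Callable[[str], bool] = lambda t: False,  # hypothetical detector
    min_tokens: int = 3,
) -> List[str]:
    kept = []
    for text in utterances:
        if is_pseudonymized(text):            # step 3: drop pseudonymized examples
            continue
        text = collapse_repeats(text)         # step 1: collapse repeated characters
        if len(tokenize(text)) > min_tokens:  # step 2: keep only length > 3 tokens
            kept.append(text)
    return kept


if __name__ == "__main__":
    samples = ["ㅋㅋㅋㅋㅋㅋ", "오늘 점심 뭐 먹을까 ㅋㅋㅋㅋ", "<NAME> 씨 오늘 어디 가요 ?"]
    # "<NAME>" is a made-up pseudonymization marker used only for this demo.
    print(preprocess(samples, is_pseudonymized=lambda t: "<NAME>" in t))
```

To apply the length filter as described in the card, pass the real tokenizer's `tokenize` method in place of the whitespace split.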

Model

| Model | # of params | vocab size | Type | # of layers | # of heads | ffn_dim | hidden_dims |
|---|---|---|---|---|---|---|---|
| KoChatBART | 139M | 50265 | Encoder | 6 | 16 | 3072 | 768 |
| | | | Decoder | 6 | 16 | 3072 | 768 |
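As a sanity check on the table, the parameter count can be reproduced with a short calculation. The sketch below assumes the standard BART layout as implemented in Hugging Face Transformers: token embeddings shared between encoder, decoder, and output layer; learned position embeddings with 1024 positions plus an offset of 2; and a layer norm after each embedding block. None of these details appear in the table, so treat them as assumptions. The number of attention heads does not change the count.

```python
# Values taken from the model table.
VOCAB_SIZE = 50265
HIDDEN = 768
FFN_DIM = 3072
ENCODER_LAYERS = 6
DECODER_LAYERS = 6

# Assumptions not stated in the table (standard BART-base layout).
MAX_POSITIONS = 1024
POSITION_OFFSET = 2


def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Scale and shift vectors."""
    return 2 * dim


def attention(dim):
    """Query, key, value, and output projections."""
    return 4 * linear(dim, dim)


def feed_forward(dim, ffn_dim):
    return linear(dim, ffn_dim) + linear(ffn_dim, dim)


def encoder_layer(dim, ffn_dim):
    # Self-attention + feed-forward, each followed by a layer norm.
    return attention(dim) + feed_forward(dim, ffn_dim) + 2 * layer_norm(dim)


def decoder_layer(dim, ffn_dim):
    # Self-attention + cross-attention + feed-forward, each followed by a layer norm.
    return 2 * attention(dim) + feed_forward(dim, ffn_dim) + 3 * layer_norm(dim)


def bart_param_count():
    token_embeddings = VOCAB_SIZE * HIDDEN  # shared, so counted once
    position_embeddings = 2 * (MAX_POSITIONS + POSITION_OFFSET) * HIDDEN
    embedding_norms = 2 * layer_norm(HIDDEN)
    encoder = ENCODER_LAYERS * encoder_layer(HIDDEN, FFN_DIM)
    decoder = DECODER_LAYERS * decoder_layer(HIDDEN, FFN_DIM)
    return token_embeddings + position_embeddings + embedding_norms + encoder + decoder


if __name__ == "__main__":
    total = bart_param_count()
    print(f"{total:,} parameters (about {total / 1e6:.0f}M)")
```

Under these assumptions the total comes to roughly 139M, in line with the figure in the table.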

Dialogue generation performance evaluation

λ‹€μŒ μ½”λ“œ(Dialogue Generator)λ₯Ό 기반으둜 각 λͺ¨λΈμ„ fine-tuning ν•˜μ˜€μŠ΅λ‹ˆλ‹€. λŒ€ν™” 생성 μ„±λŠ₯ 츑정을 μœ„ν•΄ μΆ”λ‘  μ‹œ ν† ν¬λ‚˜μ΄μ§•λ˜μ–΄ μƒμ„±λœ 응닡을 λ³΅μ›ν•œ ν›„, BPE tokenizerλ₯Ό μ‚¬μš©ν•˜μ—¬ μ‹€μ œ 응닡과 μƒμ„±λœ 응닡 μ‚¬μ΄μ˜ overlap 및 distinctλ₯Ό μΈ‘μ •ν•˜μ˜€μŠ΅λ‹ˆλ‹€.

Warning
Because the model was pre-trained mostly on short dialogue data, it performs poorly on tasks that require processing long text, such as summarization.

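Experimental results

| Training | Validation | Test |
|---|---|---|
| 9,458 | 1,182 | 1,183 |

| Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
|---|---|---|---|---|---|
| KoBART | 124M | 8.73 | 7.12 | 16.85 | 34.89 |
| KoChatBART | 139M | 12.97 | 11.23 | 19.64 | 44.53 |
| KoT5-ETRI | 324M | 12.10 | 10.14 | 16.97 | 40.09 |

| Training | Validation | Test |
|---|---|---|
| 29,093 | 1,616 | 1,616 |

| Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
|---|---|---|---|---|---|
| KoBART | 124M | 10.04 | 7.24 | 13.76 | 42.09 |
| KoChatBART | 139M | 10.11 | 7.26 | 15.12 | 46.08 |
| KoT5-ETRI | 324M | 9.45 | 6.66 | 14.50 | 45.46 |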

Contributors

Reference
