
Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Model type: Mamba

  • Language(s) (NLP): Japanese

  • Tokenizer: ku-nlp/gpt2-large-japanese-char (a character-level tokenizer; see the sketch after this list)

  • License: Model: Apache 2.0, Tokenizer: CC-BY-SA
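Since the tokenizer is character-level, a quick sanity check (a minimal sketch; the exact token list shown is an assumption based on the tokenizer's name and may differ slightly) confirms that Japanese text is split into individual characters:

from transformers import AutoTokenizer

# Character-level tokenization: each Japanese character should map to its own token.
tokenizer = AutoTokenizer.from_pretrained("ku-nlp/gpt2-large-japanese-char")
print(tokenizer.tokenize("夏目漱石"))  # expected (assumption): ['夏', '目', '漱', '石']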

Run the model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "misdelivery/mamba-char-japanese-790m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_prompt = "夏目漱石について教えてください。"

with torch.no_grad():
    # Alpaca-style Japanese instruction template: "Below is an instruction that
    # describes a task. Write a response that appropriately completes the request."
    input_ids = tokenizer.encode(
        f"以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。\n\n### 指示:\n{input_prompt}\n\n### 応答:\n",
        add_special_tokens=False,
        return_tensors="pt",
    )
    output_ids = model.generate(
        input_ids.to(model.device),
        max_length=512,
        do_sample=True,
        temperature=0.6,
        repetition_penalty=1.2,
    )

result = tokenizer.decode(output_ids.tolist()[0], skip_special_tokens=True)
print(result)
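Because model.generate returns the prompt tokens along with the continuation, the decoded result above repeats the instruction template. A minimal sketch (reusing input_ids and output_ids from the snippet above) that keeps only the generated response:

# Slice off the prompt tokens so only the model's response is decoded.
response_ids = output_ids[0][input_ids.shape[-1]:]
print(tokenizer.decode(response_ids, skip_special_tokens=True))

If a GPU is available, calling model.to("cuda") before generation should speed things up considerably.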

Acknowledgments

GPUs were kindly lent to us for the 240-hour【LOCAL AI HACKATHON #001】.

We would like to express our deep gratitude to everyone involved:

  • メタデータラボ株式会社 (Metadata Lab, Inc.)

  • 【AI声づくり技術研究会】 server owner: Yanagi (やなぎ) (@Yanagi_1112)

  • 【ローカルLLMに向き合う会】 server owner: saldra (サルドラ) (@sald_ra)

  • Witness (@i_witnessed_it)

Team members: hayashi, kzms, chatblanc, Ryunosuke Ikeda
