metadata

language: zh
datasets: c2m
inference:
  parameters:
    max_length: 108
    num_return_sequences: 1
    do_sample: true
widget:
  - text: 往者不可谏,来者犹可追。
    example_title: 往来
  - text: 逝者如斯夫！不舍昼夜。
    example_title: 誓不

Classical Chinese (文言文) to Modern Chinese (现代文)

Model description

How to use

Call the model with a pipeline:

>>> from transformers import pipeline
>>> model_checkpoint = "supermy/c2m"
>>> translator = pipeline("translation",
...         model=model_checkpoint,
...         num_return_sequences=1,
...         max_length=52,
...         truncation=True)

>>> translator("往者不可谏,来者犹可追。")
[{'translation_text': '过 去 的 事 情 不能 劝 谏 ， 未来 的 事 情 还 可以 追 回 来 。 如 果 过 去 的 事 情 不能 劝 谏 ， 那 么 ， 未来 的 事 情 还 可以 追 回 来 。 如 果 过 去 的 事 情'}]

>>> translator("福兮祸所伏，祸兮福所倚。",do_sample=True)
[{'translation_text': '幸 福 是 祸 患 所 隐 藏 的 ， 灾 祸 是 福 祸 所 依 托 的 。 这 些 都 是 幸 福 所 依 托 的 。 这 些 都 是 幸 福 所 带 来 的 。 幸 福 啊 ， 也 是 幸 福'}]

>>> translator("成事不说，遂事不谏，既往不咎。", num_return_sequences=1,do_sample=True)
[{'translation_text': '事 情 不 高 兴 ， 事 情 不 劝 谏 ， 过 去 的 事 就 不 会 责 怪 。 事 情 没 有 多 久 了 ， 事 情 没 有 多 久 ， 事 情 没 有 多 久 了 ， 事 情 没 有 多'}]

>>> translator("逝者如斯夫！不舍昼夜。",num_return_sequences=1,max_length=30)
[{'translation_text': '逝 去 的 人 就 像 这 样 啊 ， 不分 昼夜 地 去 追 赶 它 们 。 这 样 的 人 就 不 会 忘 记'}]
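
The outputs above contain spaces between tokens and often end in an unfinished fragment once the length limit is reached. The following is a minimal post-processing sketch, not part of the model or of transformers. The helper names (`join_tokens`, `drop_unfinished_tail`, `clean_translation`) and the set of sentence terminators are assumptions made for illustration, and removing all whitespace assumes the output is purely Chinese text.

```python
import re

# Sentence-ending punctuation assumed for this sketch (full-width and ASCII).
_TERMINATORS = "。！？!?"


def join_tokens(text: str) -> str:
    """Remove all whitespace, including the spaces between output tokens."""
    return re.sub(r"\s+", "", text)


def drop_unfinished_tail(text: str) -> str:
    """Cut everything after the last sentence terminator.

    If the text contains no terminator at all, it is returned unchanged.
    """
    last = max(text.rfind(ch) for ch in _TERMINATORS)
    return text[: last + 1] if last >= 0 else text


def clean_translation(text: str) -> str:
    """Hypothetical helper: join tokens, then drop the trailing fragment."""
    return drop_unfinished_tail(join_tokens(text))


raw = "逝 去 的 人 就 像 这 样 啊 ， 不分 昼夜 地 去 追 赶 它 们 。 这 样 的 人 就 不 会 忘 记"
print(clean_translation(raw))
```

This only tidies the text; it does not remove repeated sentences, which would need changes on the generation side.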

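Here is how to use this model to translate a given text in PyTorch:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("supermy/c2m")
model = AutoModelForSeq2SeqLM.from_pretrained("supermy/c2m")
text = "用你喜欢的任何文本替换我。"  # placeholder; replace with any Classical Chinese text
encoded_input = tokenizer(text, return_tensors='pt')
# A plain forward pass on a seq2seq model typically also needs decoder inputs,
# so call generate() and decode the resulting token ids instead.
output_ids = model.generate(**encoded_input, max_length=52)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))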

Training data

Training procedure

Trained on a Classical Chinese (文言文) dataset, starting from a Helsinki-NLP model:


Entry and citation info