Table of Contents

  1. TL;DR
  2. Model Details
  3. Usage
  4. Training Details
  5. Citation

TL;DR

This is a FLAN-T5 base model fine-tuned on ArtifactAI/arxiv-cs-ml-instruct-tune-50k for question answering. It is intended for research purposes only and should not be used in production settings: its output is highly unreliable.

Model Details

Model Description

  • Model type: Language model
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Related Models: All FLAN-T5 Checkpoints

Usage

The example scripts below show how to use the model with the transformers library.

Using the PyTorch model

Running the model on a CPU


from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
model = T5ForConditionalGeneration.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")

input_text = "What is an LSTM?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
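The script above calls `model.generate` with default settings and decodes without stripping special tokens, so the printed answer may be cut short and may include markers such as `<pad>` and `</s>`. The following is a minimal sketch of a wrapper that sets the decoding options explicitly. The helper name `answer` and the values `max_new_tokens=128` and `num_beams=4` are illustrative assumptions, not settings published with this model.

```python
# Hypothetical helper; not part of the original model card scripts.
# The default values below are assumptions chosen for illustration.
DEFAULT_GENERATION_KWARGS = {
    "max_new_tokens": 128,  # assumed cap on the answer length
    "num_beams": 4,         # assumed beam width for beam search
}


def answer(question, model, tokenizer, **overrides):
    """Generate an answer for `question` with explicit decoding settings."""
    # Caller-supplied keyword arguments take precedence over the defaults.
    kwargs = {**DEFAULT_GENERATION_KWARGS, **overrides}
    input_ids = tokenizer(question, return_tensors="pt").input_ids
    outputs = model.generate(input_ids, **kwargs)
    # skip_special_tokens=True drops markers such as <pad> and </s>.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as in the script above, `print(answer("What is an LSTM?", model, tokenizer))` prints the decoded answer, and a call such as `answer(question, model, tokenizer, num_beams=1)` overrides a single setting.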

Running the model on a GPU

# pip install accelerate
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")
model = T5ForConditionalGeneration.from_pretrained("ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering", device_map="auto")

input_text = "What is an LSTM?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
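The GPU script hard-codes `"cuda"` for the inputs, while `device_map="auto"` lets the library decide where the model is placed. On a machine without a GPU the two can disagree. The sketch below moves the inputs to whatever device the model reports through its `device` attribute instead. The function name `generate_on_model_device` and the `max_new_tokens=128` default are assumptions made for illustration.

```python
# Hypothetical helper; not part of the original model card scripts.
def generate_on_model_device(question, model, tokenizer, max_new_tokens=128):
    """Generate an answer, placing the inputs on the model's own device."""
    input_ids = tokenizer(question, return_tensors="pt").input_ids
    # model.device is "cuda:0", "cpu", etc., depending on where the weights
    # were loaded, so the inputs follow the model rather than a fixed string.
    input_ids = input_ids.to(model.device)
    # max_new_tokens=128 is an assumed limit, not a published setting.
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as in the script above, `generate_on_model_device("What is an LSTM?", model, tokenizer)` should behave the same whether the model was placed on a GPU or on the CPU.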

Running the model in an HF pipeline

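from transformers import pipeline

# load model and tokenizer from the Hugging Face Hub with pipeline
# the "summarization" task serves here as a generic text-to-text wrapper,
# so the generated answer is returned under the 'summary_text' key
qa = pipeline("summarization", model="ArtifactAI/flan-t5-base-arxiv-cs-ml-question-answering")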


query = "What is an LSTM?"
print(f"query: {query}")
res = qa("answer: " + query)

print(f"{res[0]['summary_text']}")
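The pipeline example above reads the answer from the `summary_text` key because it uses the "summarization" task. A "text2text-generation" pipeline, where the installed transformers version provides it, returns the text under `generated_text` instead. The sketch below is a small, hypothetical helper (`ask` is not part of transformers) that adds the `answer: ` prefix from the example above and accepts either output key.

```python
# Hypothetical helper; not part of the original model card scripts.
PROMPT_PREFIX = "answer: "  # prefix used in the pipeline example above


def ask(pipe, query, **generate_kwargs):
    """Send one query through a text-to-text pipeline and return the text."""
    # Pipelines return a list with one dict per input; take the first entry.
    result = pipe(PROMPT_PREFIX + query, **generate_kwargs)[0]
    # "summarization" pipelines use 'summary_text';
    # "text2text-generation" pipelines use 'generated_text'.
    for key in ("summary_text", "generated_text"):
        if key in result:
            return result[key]
    raise KeyError(f"no text field found in pipeline output: {list(result)}")
```

With `qa` created as in the example above, `ask(qa, "What is an LSTM?")` returns the answer string. Extra keyword arguments such as `max_new_tokens=128` (an assumed value) are passed through to the pipeline call.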

Training Details

Training Data

The model was trained on ArtifactAI/arxiv-cs-ml-instruct-tune-50k, a dataset of question/answer pairs. Questions are generated using the t5-base model, while the answers are generated using the GPT-3.5-turbo model.

Citation

@misc{flan-t5-base-arxiv-cs-ml-question-answering,
    title={flan-t5-base-arxiv-cs-ml-question-answering},
    author={Matthew Kenney},
    year={2023}
}