
STAR MODEL! This is the first model that gives perfect recall!

SELF NOTE: THIS ONE FLATLY REFUSED TO MAKE A TIMELINE OF THE BIBLE! Everything else was perfect! I will train only timeline data in the next run, until 0.001 loss. If that means overfitting it, so be it!

For usage: I suggest lowering the max tokens and allowing for a rolling window, so the model chunks its own outputs. To train for this we will need super-long contexts, and we will need to chunk our outputs as well as our inputs, so the model has examples of chunking its own outputs!
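Below is a minimal sketch of that rolling-window usage with the transformers API; the window size, per-chunk token cap, and stopping check are illustrative assumptions, not recommended settings:

```python
# Sketch: cap max_new_tokens so the model emits short chunks, then feed each
# chunk back in while truncating the context to the most recent `window` tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeroyDyer/_Spydaz_Web_AI_ChatQA_002_4_BIT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

def rolling_generate(prompt: str, chunks: int = 4, window: int = 2048) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    pieces = []
    for _ in range(chunks):
        out = model.generate(ids, max_new_tokens=256)  # lowered max tokens per chunk
        new = out[:, ids.shape[1]:]
        pieces.append(tokenizer.decode(new[0], skip_special_tokens=True))
        ids = out[:, -window:]  # rolling window: keep only the most recent tokens
        if tokenizer.eos_token_id in new[0]:  # stop once the model finishes
            break
    return "".join(pieces)
```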

Quote for Motivation:

"Success comes from defining each task in achievable steps. Every completed step is a success that brings you closer to your goal. If your steps are unreachable, failure is inevitable. Winners create more winners, while losers do the opposite. Success is a game of winners!"

— Leroy Dyer (1972-Present)

"To grow as a professional, set goals just beyond your current abilities. Achieving these milestones will not only overcome obstacles but also strengthen your skillset. If your tasks are too easy, you’ll never challenge yourself or improve, and life will pass you by!"

Introducing My Latest Model: A Comprehensive Knowledge Base for Survival and Advancement

I’m thrilled to present my latest model, which combines cutting-edge AI capabilities with a vast knowledge base spanning religious texts, historical documents, and scientific data. This model was initially trained extensively on Bible data and then fine-tuned using Hugging Face documentation, medical diagnosis libraries, disease classifications, counseling sessions, and even role-playing scenarios, with Star Trek themes included for good measure! To enhance its conversational abilities, I incorporated methodologies from Stanford, focusing on function calling, Python, and general coding practices. The training datasets also included a sophisticated chain-of-thought dataset from Chinese groups, after numerous merges and realignments.

Despite some initial challenges, I persisted in refining the model, scouring the web for additional resources. Significant effort was dedicated to framing data for instruction using the Alpaca format, and to re-configuring tensors for sequence-to-sequence and translation tasks. This revealed the versatility of tensors, enabling the model to excel in various neural network tasks, from entity matching to JSON output and sentence masking. Specialized models focusing on actions and coding were also developed.

Training Methodology: Establishing a Solid Foundation

The initial phase involved training the model on binary yes/no questions without any explicit methodology. This was crucial in establishing a baseline for the model’s decision-making capabilities. The model was first trained using a simple production prompt, known as Prompt A, which provided basic functionality. Although this prompt was imperfect, it fit the dataset and set the stage for further refinement.

Methodology Development: Enhancing Performance through Iteration

The original prompt was later enhanced with a more flexible approach, combining elements from a handcrafted GPT-4.0 prompt. This adaptation aligned the model with my personal agent system, allowing it to better respond to diverse tasks and methodologies. I discovered that regularly updating the model with new methodologies significantly enhanced its performance. The iterative process involved refining prompts and experimenting with different training strategies to achieve optimal results.

A significant portion of the training focused on enabling the model to use tools effectively. For instance, if the model needed to think, it would use a “think tool” that queried itself and provided an internal response, as sketched below. This tool-based approach was instrumental in enhancing the model’s reasoning capabilities, though it slowed down the response time on certain hardware like the RTX 2030. Despite the slower response time, the model’s ability to perform complex internal queries resulted in more accurate and well-reasoned outputs.
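A hypothetical sketch of such a think tool, where the model queries itself for a private reasoning pass before answering; the prompt wording and pipeline settings here are my own illustration, not the trained format:

```python
# Sketch: the model first "thinks" privately about the question, then answers
# with those notes as hidden context.
from transformers import pipeline

ask = pipeline("text-generation", model="LeroyDyer/_Spydaz_Web_AI_ChatQA_002_4_BIT")

def generate(prompt: str) -> str:
    out = ask(prompt, max_new_tokens=256, return_full_text=False)
    return out[0]["generated_text"]

def think_tool(question: str) -> str:
    # First pass: an internal reasoning query the user never sees.
    thoughts = generate(f"Think step by step about this question for your own use:\n{question}\nThoughts:")
    # Second pass: the thoughts are fed back in as context for the visible answer.
    return generate(f"Question: {question}\nInternal notes: {thoughts}\nFinal answer:")
```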

Training for Comprehensive Responses: Prompts and Epochs

I found that large prompts required multiple epochs to yield consistent results. However, fewer epochs were needed when prompts were simplified or omitted. The purpose of large prompts during training was to give the model a wide range of response styles, allowing it to adjust parameters for various tasks. This approach helped the model internalize methodologies for extracting information, which is central to fine-tuning. The training emphasized teaching the model to plan and execute complex tasks, such as generating complete software without errors.

The model uses the QA chat template, and a GGUF version is available. I will also realign it to the ChatML prompt template and make another version for Ollama usage.
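A minimal usage sketch, assuming the repo's tokenizer ships its chat template in the standard transformers field:

```python
# Apply whatever chat template is bundled with the tokenizer; the exact
# QA-style formatting is defined by the repo, not by this snippet.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/_Spydaz_Web_AI_ChatQA_002_4_BIT")
messages = [{"role": "user", "content": "Summarize the plot of Moby Dick."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # shows the formatting the model expects at inference time
```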

Training Regimes:

  • Alpaca (see the template sketch after this list)
  • ChatML / OpenAI / MistralAI
  • Text Generation
  • Question/Answer (Chat)
  • Planner
  • Instruction/Input/Response (instruct)
  • Mistral Standard Prompt
  • Translation Tasks
  • Entity / Topic detection
  • Book recall
  • Coding challenges, code feedback, code summarization, commenting code, code planning and explanation: software generation tasks
  • Agent ranking and response analysis
  • Medical tasks
    • PubMed
    • Diagnosis
    • Psychiatry
    • Counselling
    • Life Coaching
    • Note taking
    • Medical smiles
    • Medical Reporting
  • Virtual laboratory simulations
  • Chain-of-thought methods
  • One-shot / multi-shot prompting tasks
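For reference, the Alpaca-style Instruction/Input/Response layout listed above follows the standard template:

```python
# The widely used Alpaca instruction template (Instruction/Input/Response),
# shown here as a Python format string.
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{response}"""
```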

General Internal Methods:

Trained for multi-task operations as well as RAG and function calling.

This model is a fully functioning model and is fully uncensored.

The model has been trained on multiple datasets from the Hugging Face Hub and Kaggle.

The focus has been mainly on methodology:

  • Chain of thought
  • Step-by-step planning
  • Tree of thoughts
  • Forest of thoughts
  • Graph of thoughts
  • Agent generation: voting, ranking, ... dual-agent response generation (see the sketch below)
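A hypothetical sketch of the voting/ranking idea: sample several candidate answers, then ask the model itself to pick the best one. The prompt wording and the generic `generate` callable are illustrative assumptions:

```python
# Best-of-N response generation with self-ranking: the model votes on its
# own candidates and the winning answer is returned.
from typing import Callable, List

def best_of_n(generate: Callable[[str], str], question: str, n: int = 3) -> str:
    candidates: List[str] = [generate(f"Answer the question:\n{question}") for _ in range(n)]
    listing = "\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
    vote = generate(
        f"Question: {question}\nCandidate answers:\n{listing}\n"
        "Reply with only the number of the best answer:"
    )
    digits = "".join(ch for ch in vote if ch.isdigit())
    index = int(digits) if digits and int(digits) < len(candidates) else 0
    return candidates[index]
```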

Training Philosophy

Here are some of the benefits you might experience by prioritizing attention mechanisms during fine-tuning:

Enhanced Contextual Understanding:

Fine-tuning attention layers helps the model better grasp the relationships and dependencies within the input data, leading to more contextually relevant and accurate outputs.

Improved Control over Generation:

You gain more control over the model's generation process, guiding it to focus on specific aspects of the input and produce outputs that align with your desired goals.

More Creative and Diverse Outputs:

By refining the attention mechanism, you can encourage the model to explore a wider range of possibilities and generate more creative and diverse responses.

Reduced Overfitting:

Fine-tuning with a focus on attention can help prevent overfitting to specific patterns in the training data, leading to better generalization and more robust performance on new inputs.
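A minimal sketch of what prioritizing attention during fine-tuning can look like in practice, assuming the peft library and a Mistral-style module layout; the target module names and LoRA hyperparameters are model-dependent assumptions:

```python
# Fine-tune only the attention projections by attaching LoRA adapters to them;
# all other weights stay frozen.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("LeroyDyer/_Spydaz_Web_AI_ChatQA_002_4_BIT")
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the attention adapters are trainable
```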

“Epochs are the key to effective training, rather than merely mass dumping examples—unless those examples are interconnected within a single or multiple conversations that teach through dialogue.”

My personal training methods are unconventional. I prioritize creating conversations that allow the model to learn new topics from diverse perspectives. This approach is essential, as many models are losing their unique personalities. Claude’s success, for instance, can be attributed to its empathetic prompting methods. It’s important for the model to express itself, even during training, which can be challenging. Role-playing and conversational training are effective strategies to help the model learn to communicate naturally. Currently, training has become overly focused on technical methodologies and task expectations, resulting in a loss of personality.
