Articles by Drishti Sushma:

- Accelerate Llama-2–7b Fine-tuning: Unsloth Outpaces Flash Attention-2 (Pinned, Jan 14)
- Analyzing the Impact of lora_alpha on Llama-2 Quantized with GPTQ (Sep 14, 2023; updated Jan 26, 2024)
- Analyzing the Dual Impact: Batch Size and Mixed Precision on DistilBERT’s Performance in Language… (Sep 12, 2023)
- Comprehensive Evaluation of Various Transformer Models in Detecting Normal, Hate, and Offensive… (Sep 11, 2023)
- Decoding the Impact of Weight Decay on MBart-large-50 for English-Spanish Translation (Sep 11, 2023)
- Analyzing Llama-2’s Behavior with Varied Pretraining Temperature and Attention Mechanisms (Sep 11, 2023)
- Comparative Study: Training OPT-350M and GPT-2 on Anthropic’s HH-RLHF Dataset Using Reward-Based… (Sep 11, 2023)
- Fine-tune 4-bit Llama-2–7B with Flash Attention Using DPO (Sep 11, 2023)
- Comparative Analysis of Fine-tuned BERT-based Models for Detecting Hate Speech in Social Media (Sep 7, 2023)