Pinned · Michael Humor in GoPenAI · Mar 24
Build your own AI PC (Part I): setting up LLM daemons on Darwin (macOS)
Towards building your own AI PC, this tutorial shows how to set up LLMs as system-level daemons on Darwin (macOS).

Michael Humor in GoPenAI · Aug 7
Llama 3.1 vs Llama 3 Differences
It seems Llama 3.1 significantly outperforms Llama 3 in math and reasoning capabilities. For instance, according to Meta's…

Michael Humor in GoPenAI · Apr 26
What LLM quantization works best for you? Q4_K_S or Q4_K_M
If you are working with a giant LLM, quantization is your friend for optimizing performance and speed. There are so many different…

Michael Humor in Dev Genius · Apr 26
Llama-3 8B Model Stats
Llama-3 8B with 4-bit quantization needs only around 4 GB of RAM to run on a PC.

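The 4 GB figure above follows from a simple back-of-envelope calculation: weight memory ≈ parameter count × bits per weight. A minimal sketch (the function name is illustrative; this ignores KV cache, activations, and per-block quantization overhead, so real Q4 K-quant files average slightly more than 4 bits per weight):

```python
# Rough estimate of the memory needed just for a model's quantized weights.
# Ignores KV cache, activation buffers, and quantization block overhead.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Llama-3 8B at 4-bit quantization:
print(weight_memory_gb(8e9, 4))   # → 4.0 (GB)
```

The same arithmetic shows why the unquantized fp16 model needs roughly 16 GB: the same 8B parameters at 16 bits each.
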
Michael Humor in Dev Genius · Apr 26
A single script to install Docker on a Linux VM (Microsoft Azure)
Here it is:

Michael Humor in GoPenAI · Apr 12
How to build llama.cpp on Windows with an NVIDIA GPU?
If you have an RTX 3090/4090 GPU on your Windows machine and want to build llama.cpp to serve your own local model, this tutorial shows…

Michael Humor · Mar 31
Grok 1.0 Model Stats
xAI's Grok 1.0 model (see GitHub repo) has 64 layers, an 8K context length, and 314B parameters in total.

Michael Humor in Dev Genius · Mar 22
What’s a System Prompt for AI?
In short, a “system prompt” is a specialized type of prompt that sets the context for the AI’s interactions.

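As a concrete illustration of that definition (the message contents here are hypothetical, and the structure follows the common OpenAI-style chat schema rather than any specific article): the system prompt is simply the first message in the conversation, with role "system", and everything after it is interpreted in that context.

```python
# Illustrative chat payload: the "system" message sets the context/persona
# before any user turns. Structure follows the common OpenAI-style schema.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain quantization in one sentence."},
]

# The model sees the system message first, so it frames every later reply.
system_prompt = next(m["content"] for m in messages if m["role"] == "system")
print(system_prompt)   # → You are a concise technical assistant.
```
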
Michael Humor in Dev Genius · Mar 21
The TAO of Prompt Engineering (Part-2): writing an email assistant
In the previous article, we introduced TAO (Thought-Action-Observation), a method for LLM prompt engineering. In this article, we focus on…

Michael Humor in Dev Genius · Mar 21
The TAO of Prompt Engineering (Part-1): understanding the ReAct framework
In this article, we introduce a method for prompt engineering called TAO (Thought-Action-Observation), inspired by ReAct (Reason+Act) for…