Pinned | Haitham Bou Ammar in Becoming Human: Artificial Intelligence Magazine | Safe Reinforcement Learning — Part I | Many practitioners using reinforcement learning (RL) are often concerned about the safety of SOTA deep RL techniques. With that in mind… | Nov 14, 2022
Haitham Bou Ammar | From MCTS to Alpha-Zero with PyTorch — Part I (Building a Tic-Tac-Toe’r) | AlphaZero is a deep reinforcement learning algorithm developed by DeepMind that has achieved superhuman performance in games like Chess… | Sep 9
Haitham Bou Ammar | New Grounds in Theorem Proving with DeepSeek-Prover-V1.5 | DeepSeek-Prover-V1.5 represents a significant leap forward from its predecessor, DeepSeek-Prover-V1. This new iteration is designed for… | Aug 18
Haitham Bou Ammar | Deriving DPO’s Loss | Direct preference optimisation has become critical for aligning LLMs with human preferences. I have been talking to many people about it… | Aug 15
Haitham Bou Ammar | Pluralistic Alignment of LLMs: Fix your Algorithm not just your data | Interjection: Recent studies have found that large language models (LLMs) are biased, with many articles demonstrating these biases and… | Jul 22
Haitham Bou Ammar | A Leap Towards Human-Like AI: Recreating Human Memory in LLMs | In artificial intelligence, large language models (LLMs) have demonstrated remarkable capabilities in understanding and generating… | Jul 20
Haitham Bou Ammar | Transforming Robot Programming with Language AI | Intuitive robot programming enables non-experts to interact with and control robotic systems effectively. Our ROS-LLM framework allows you… | Jul 19
Haitham Bou Ammar | Optimal Control from Natural Language | Generating model predictive control without domain expertise via large language models! | Feb 13