Alon Agmonin | Towards Data Science | Scale Up Your RAG: A Rust-Powered Indexing Pipeline with LanceDB and Candle | Building a High-Performance Embedding and Indexing System for Large-Scale Document Processing and Retrieval | Jul 11
Alon Agmonin | Towards Data Science | Streamlining Serverless ML Inference: Unleashing Candle Framework’s Power in Rust | Building a lean and robust model serving layer for vector embedding and search with Hugging Face’s new Candle Framework | Dec 21, 2023
Alon Agmonin | Towards Data Science | Data Access API over Data Lake Tables Without the Complexity | Build a robust GraphQL API service on top of your S3 data lake files with DuckDB and Go. | Sep 28, 2023
Alon Agmonin | Towards Data Science | Concurrently Train Multiple Time Series Models Over Spark with XGBoost | Take advantage of the distributive power of Apache Spark and concurrently train thousands of auto-regressive time-series models on big data | Mar 17, 2023
Alon Agmonin | Towards Data Science | Boost Your Cloud Data Applications with DuckDB and Iceberg API | Use Iceberg API with DuckDB to optimize analytics queries on massive Iceberg tables in your cloud storage | Dec 23, 2022
Alon Agmonin | Towards Data Science | ML prediction on streaming data using Kafka Streams | Boost the performance of Python-trained ML models by serving them over Kafka streaming platform in a Scala application | Jul 11, 2022
Alon Agmonin | Towards Data Science | Hands-on Anomaly Detection with Variational Autoencoders | Detect anomalies in tabular data using Bayesian-style reconstruction methods | Jul 30, 2021
Alon Agmonin | Towards Data Science | Using Gaussian Mixture Models to Transform User-Item Embedding and Generate Better User Clusters | Improve clustering of user-item embedding by using GMM to generate new and tighter features | Apr 16, 2021
Alon Agmonin | Towards Data Science | Deep Clustering with Sparse Data | A rather “shallow” and simple approach to deep clustering of highly dimensional data using Keras and manifold learning in 3 simple steps | Nov 29, 2020
Alon Agmonin | Towards Data Science | Create a Recommendation System Based on Time-Series Data Using Latent Dirichlet Allocation | Use the classic NLP topic modelling technique to cluster users according to their habits and preferences | Sep 7, 2020