From Intel(R) Neural Compressor in Intel Analytics Software:

- Personalized Stable Diffusion with Few-Shot Fine-Tuning: Create Your Own Stable Diffusion on a Single CPU (Nov 1, 2022)
- Quantization on Intel Gaudi Series AI Accelerators: Intel Neural Compressor v3.0 Supports Quantization across Intel Hardware (Aug 16)
- Accelerating Qwen2 Models with Intel Extension for Transformers: High Performance WOQ INT4 Inference on Intel Xeon Processors (Jun 6)
- Accelerating GGUF Models with Transformers: Improving Performance and Memory Usage on Intel Platforms (May 31)
- Low-Bit Quantized Open LLM Leaderboard: A New Tool to Find High-Quality Models for a Given Client (May 11)
- The AutoRound Quantization Algorithm: Weight-Only Quantization for LLMs Across Hardware Platforms (Apr 2)
- Run LLMs on Intel GPUs Using llama.cpp: Taking Advantage of the New SYCL Backend (Mar 22)
- Efficient Quantization with Microscaling Data Types for Large Language Models: New Quantization Recipes Using Intel Neural Compressor (Mar 1)
- Efficient Natural Language Embedding Models with Intel Extension for Transformers: Making Retrieval-Augmented Generation More Efficient (Feb 8)
- Advancing Large Language Models on Intel Platforms: The Evolution of Intel NeuralChat-7B LLM (Dec 19, 2023)