- NVIDIA TensorRT™ is a high-performance neural network inference engine for production deployment of deep learning applications.
- Generate optimized, deployment-ready models for inference
- Define and implement unique functionality using the custom layer API
- Deploy neural networks in full (FP32) or reduced precision (INT8, FP16); a minimal build sketch follows this list
- TensorRT can be used to rapidly optimize, validate, and deploy trained neural networks for inference to hyperscale data centers, embedded, or automotive product platforms.
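
As a concrete illustration of the reduced-precision path, here is a minimal sketch of building an FP16 engine from an ONNX model with TensorRT's Python API. It assumes the TensorRT 8.x API; the model and output paths are hypothetical placeholders, and INT8 would additionally require a calibrator or a quantized model.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_fp16_engine(onnx_path):
    """Parse an ONNX model and return a serialized TensorRT engine."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    # Allow TensorRT to choose FP16 kernels where they are faster.
    config.set_flag(trt.BuilderFlag.FP16)

    plan = builder.build_serialized_network(network, config)
    if plan is None:
        raise RuntimeError("engine build failed")
    return plan

if __name__ == "__main__":
    plan = build_fp16_engine("model.onnx")  # hypothetical input path
    with open("model.plan", "wb") as f:    # hypothetical output path
        f.write(plan)
```

The serialized plan can later be deserialized with a `trt.Runtime` and executed, so the expensive optimization step happens once at build time rather than at every deployment.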