Build a production AI inference engine from scratch in C++. Covers operator implementation, memory layout, quantisation (INT8/FP16), batching strategies, CUDA kernel integration, and deployment as a shared library consumed by Python runtimes.
Duration: 14 hours
Students: 720+
Rating: ⭐ 4.9
“Practical, rigorous, and immediately applicable. Velmio courses are genuinely different.”
Mark T.
Head of Data, NHS Digital
Order Summary
C++ and AI Inference Engines