Build a production AI inference engine from scratch in C++. The course covers operator implementation, memory layout, quantisation (INT8/FP16), batching strategies, CUDA kernel integration, and deployment as a shared library consumed by Python runtimes.
Duration: 14 hours
Students: 180+
Rating: ⭐ 4.6 (14)
After enrolment you will receive access instructions by email within 24 hours. Course materials are delivered via our online learning portal.
“Practical, rigorous, and immediately applicable. Velmio courses are genuinely different.”
Mark T., Head of Data
Order Summary
C++ and AI Inference Engines