Deci.AI is on a mission to empower AI developers by providing them with robust tools to create innovative AI-based solutions. Our goal is to ensure that these AI models not only excel in production but also reach their full potential.
Join us in shaping the future of AI, where innovation meets empowerment at every level.
What you’ll do
As part of the Deci AI Inference Team, you will take part in the development of Deci’s core products - enabling graph compilation, runtime optimization, model deployment and more - all aimed at extracting maximum performance from our customers’ hardware. The inference team optimizes a wide range of deep learning models - from classic vision models on edge devices to LLMs running on cloud machines. You will work closely with our research scientists, software engineers, and product team to ensure efficient and accurate model deployment.
Requirements
- 3+ years of experience with hands-on development in a performance-oriented environment
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field (or equivalent professional experience)
- Strong knowledge of software development best practices, design patterns, and code quality
- Strong problem-solving skills and the ability to analyze and optimize complex systems using methodical approaches
- Team player with a positive attitude and a driven mindset
Preferred Qualifications
- Strong background in deep learning algorithms - from classic neural networks to CNNs, RNNs, and Transformers
- Hands-on experience with deep learning inference frameworks or runtimes (PyTorch, TensorFlow 2, TFLite, TensorRT, OpenVINO, etc.)
- Familiarity with generative AI models - from large language models to Stable Diffusion, Whisper, and more
- Experience with client-server infrastructures and ability to implement performant request-handling server logic
- Proficiency in CUDA programming for GPU acceleration, enhancing model performance, and optimizing computational efficiency
Responsibilities
- Develop the core logic behind Deci’s DL development platform. Contribute to inference libraries and internal model optimization tools used by Deci’s customers, researchers, and algorithm teams
- Develop strategies and infrastructure aimed at improving the reliability of Deci’s deep learning systems
- Develop and deliver production-grade, high-throughput real-time inference-enabling frameworks
- Adopt and integrate cutting-edge research - in the fields of deep learning model optimization and deployment - into Deci products and tools