Data First Jobs

TetraMem - Accelerate The World

Senior Machine Learning Engineer

Full Time · In Office · San Jose, California (USA)

Posted Jun 8, 2026

Responsibilities

  • Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.
  • Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions.
  • Work closely with hardware and software teams to integrate ML models into production systems.
  • Research and implement state-of-the-art ML techniques to improve model efficiency and reduce latency and power consumption in embedded AI applications.
  • Improve inference efficiency through model compression techniques, including quantization, pruning, and knowledge distillation.
  • Collaborate with cross-functional teams to drive innovation and contribute to the overall system architecture.
  • Provide technical leadership and mentorship to junior engineers.
  • Publish research findings, present at conferences, and contribute to open-source projects when applicable.

Requirements

  • 5+ years of relevant industry experience, or a PhD in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Must have prior experience managing a team, serving in a Team Lead role, or demonstrating strong technical leadership and cross-functional coordination capabilities.
  • Strong hands-on experience in machine learning, with a focus on edge AI, on-device inference, and deploying lightweight models on resource-constrained devices.
  • Expertise in modern ML frameworks such as PyTorch, TensorFlow (including TensorFlow Lite), and JAX.
  • Proficiency in Python and C/C++, with practical experience in ML model optimization and production deployment.
  • Deep experience with model quantization (PTQ/QAT), pruning, knowledge distillation, sparsity, and other compression techniques for efficient edge inference.
  • Hands-on experience developing for or integrating with AI chip SDKs, neural accelerators (NPUs/DSPs), or hardware-specific toolchains (e.g., NVIDIA TensorRT, Qualcomm Neural Processing SDK, ARM Ethos, or similar).
  • Familiarity with edge inference runtimes (ONNX Runtime, ExecuTorch, TVM) and optimizing models for hardware constraints (latency, memory footprint, power consumption).

Experience in one or more of the following areas is considered a strong plus:

  • Understanding of ML compiler and runtime design.
  • Experience working with tools such as Optimum, ONNX, TensorRT, TFLite/LiteRT, ncnn, or CoreML.
  • Familiarity with hardware acceleration techniques.
  • Experience in embedded system development.

Salary Range: $200,000 - $280,000 / year

