Breakthrough Efficiency
for Inference at Scale

Backed by top-tier venture and strategic investors, we are a stealth-mode startup pioneering system-level innovations for data center-scale inference, enabled by an entirely new category of SoC.

Join Us

CUDA & ROCm Developer

Full-time · Onsite · Santa Clara, CA.

Write and optimize high-performance GPU kernels for next-generation AI inference systems.

ML Engineer — Inference Frameworks

Full-time · Onsite · Santa Clara, CA.

Drive performance in vLLM, SGLang, and PyTorch. Own cluster scheduling for throughput and latency.

ML Compiler Developer

Full-time · Onsite · Santa Clara, CA.

Build compiler capabilities from PyTorch through Triton down to CUDA and machine IR on leading-edge accelerators.

Performance Modeling Engineer

Full-time · Onsite · Santa Clara, CA.

Build functional and performance models across abstraction levels to guide architecture and accelerate HW-SW co-design.

SoC Architect

Full-time · Onsite · Santa Clara, CA.

Define the compute blocks of a leading-edge SoC and the data movement algorithms across all of its interfaces and compute blocks.

SoC RTL Lead / Developer

Full-time · Onsite · Santa Clara, CA.

Implement RTL blocks in close collaboration with architects and micro-architects.

Contact

Location

2445 Augustine Dr Ste 150
Santa Clara, CA 95054