Software Engineer (Performance Optimization focus)

Company Overview

Calling the adventurers ready to join a company that's pushing the limits of nanotechnology to keep the digital revolution rolling. At KLA-Tencor, we're making technology advancements that are bigger—and tinier—than the world has ever seen.

Who are we?  We research, develop, and manufacture the world's most advanced inspection and measurement equipment for the semiconductor and nanoelectronics industries. We enable the digital age by pushing the boundaries of technology, creating tools capable of finding defects smaller than a wavelength of visible light. We create smarter processes so that technology leaders can manufacture high-performance chips—the kind in that phone in your pocket, the tablet on your desk and nearly every electronic device you own—faster and better. We're passionate about creating solutions that drive progress and help people do what wouldn't be possible without us.  The future is calling. Will you answer?


The Wafer Inspection Division (WIN) is the world leader in the design and manufacture of advanced optical inspection tools for inline monitoring of process defects in advanced semiconductor factories.

The AI Group within WIN is chartered with exploring and qualifying new hardware and software technologies to be used by future wafer inspector products.


This job involves optimizing code to run on GPUs.  Part of the work uses prebuilt libraries (e.g. cuBLAS, NPP), and part involves coding and performance tuning new functions in CUDA.  Full stack optimization is desired, so a willingness to explore and suggest alternative, more GPU-friendly algorithms for a given task is a plus, as is the ability to recognize when a given algorithm step should be left on the CPU.

We provide the hardware on which the code runs, so a 20% overall speedup can mean roughly 20% savings in cost.  Part of the job involves building spreadsheet models of the expected runtime (assuming good optimization) so that we know how close we are to entitlement.  The job will also involve cycle counting and memory usage analysis, covering things such as bandwidth and working set size.
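
As an illustration of the kind of runtime model described above, here is a minimal sketch in Python rather than a spreadsheet. It assumes a simple roofline-style bound, where a step can finish no faster than the slower of its compute time and its memory-transfer time, and it treats "entitlement" as that best-case runtime. The function names and all hardware and workload numbers are hypothetical and chosen only for illustration; they do not describe any actual product or GPU.

```python
def expected_runtime_s(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Roofline-style lower bound on runtime for one processing step.

    Assumes compute and memory traffic overlap perfectly, so the step is
    limited by whichever of the two takes longer.
    """
    compute_s = flops / peak_flops_per_s        # time if purely compute bound
    memory_s = bytes_moved / peak_bytes_per_s   # time if purely bandwidth bound
    return max(compute_s, memory_s)


def fraction_of_entitlement(measured_s, expected_s):
    """How close a measured runtime is to the modeled best case (1.0 = at the bound)."""
    return expected_s / measured_s


# Hypothetical step: 2 GFLOP of work touching 4 GB of memory, on a
# hypothetical device with 10 TFLOP/s peak compute and 500 GB/s bandwidth.
model_s = expected_runtime_s(
    flops=2e9,
    bytes_moved=4e9,
    peak_flops_per_s=1e13,
    peak_bytes_per_s=5e11,
)

# Hypothetical measured runtime of 10 ms for the same step.
ratio = fraction_of_entitlement(measured_s=1e-2, expected_s=model_s)
print(f"modeled best case: {model_s * 1e3:.2f} ms, fraction of entitlement: {ratio:.2f}")
```

In this made-up case the step is bandwidth bound (8 ms of memory time against 0.2 ms of compute time), so further tuning would target memory traffic and working set size rather than arithmetic.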

Preferred Qualifications

3+ years of programming experience (Assembly, C, or C++ is a plus)

1+ years of performance tuning/optimization experience

SIMD experience (e.g. AltiVec, SSE) or CUDA experience

Minimum Qualifications

Doctorate (Academic) with at least 3 years of experience.
Master's Level Degree with at least 5 years of experience.
Bachelor's Level Degree with at least 7 years of experience.

Equal Employment Opportunity

KLA-Tencor is an Equal Opportunity Employer. Applicants will be considered for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, or any other characteristics protected by applicable law.

