Senior AI Lead Engineer: Device Parallelism Research Specialist

Company Overview

Calling the adventurers ready to join a company that's pushing the limits of nanotechnology to keep the digital revolution rolling. At KLA, we're making technology advancements that are bigger—and tinier—than the world has ever seen.

Who are we?  We research, develop, and manufacture the world's most advanced inspection and measurement equipment for the semiconductor and nanoelectronics industries. We enable the digital age by pushing the boundaries of technology, creating tools capable of finding defects smaller than a wavelength of visible light. We create smarter processes so that technology leaders can manufacture high-performance chips—the kind in that phone in your pocket, the tablet on your desk and nearly every electronic device you own—faster and better. We're passionate about creating solutions that drive progress and help people do what wouldn't be possible without us.  The future is calling. Will you answer?


KLA has over 40 years of semiconductor process control experience, and chipmakers around the globe rely on us to ensure that their fabs ramp next-generation devices to volume production quickly and cost-effectively. KLA's Global Products Group (GPG), which is responsible for creating all of KLA’s metrology and inspection products, is looking for forward-thinking research scientists, software engineers, application development engineers, and senior product technology process engineers to join our team and enable the movement towards advanced chip design.

About KLA Advanced Computing Labs, India:
The mission of KLA Advanced Computing Labs (ACL) in India is to deliver advanced parallel computing research and software architectures for AI + HPC + Cloud solutions to accelerate KLA’s product performance. This team explores high-risk approaches, pioneering technologies, and novel methods to accelerate KLA’s algorithms and contribute to KLA’s HPC technology roadmap. We engage leading thinkers in academia, industry, and KLA’s business units to create innovative parallel computing methods to enable KLA’s business growth.


Position: Senior AI Lead Engineer: Device Parallelism Research Specialist
KLA is hiring engineers for its Advanced Computing Labs in Chennai, India. KLA ACL is located at our new research center in the IITM Research Park. The goal of the center is to conduct computational research in parallel and distributed sub-systems and deploy them to KLA’s advanced semiconductor platforms that are used for inspection and metrology tasks in leading fabs. These efforts are part of a larger global initiative at KLA to scale up its AI + HPC + cloud infrastructure.
What will you be responsible for?
Specifically, we are looking for an experienced engineer with a deep understanding of various parallel computing models (SIMD, SIMT) along with background knowledge of image processing and signal processing algorithms. The role is to enable our R&D groups to evaluate various parallel HW architectures (such as GPUs) along with their associated SW frameworks (CUDA, SYCL, SSE/AVX) to ensure that the most critical KLA AI algorithms can run optimally. In a nutshell, the position requires a strong and versatile computational architecture background in a range of areas from C/C++ to CUDA to parallel threading models. The position also requires a person with significant communication and technical leadership skills and the ability to navigate from relatively high-level requirements to low-level computational models.


What would we like to see?
3-5+ years of proven expertise in parallel computing optimization with technologies such as CUDA (preferred) or x86 SIMD (SSE, AVX, etc.). Besides deep knowledge of parallel computing, significant weight will be given to an understanding of core parallel computing loads in the areas of image and signal processing as well as computational graphs.
We expect the candidate to have a strong enough background to determine whether a computational load is bandwidth-limited or compute-limited (for example, using Roofline models).
An understanding of concepts such as thread parallelism and process parallelism, and the ability to successfully use various Linux profiling tools (such as NVIDIA Nsight or Intel VTune), will also be essential for success in this role.
Minimum Education: BS/B.Tech in Computer Science/Math/Physics. Dual-degree master's and Ph.D. preferred.
Finally, the candidate must be self-driven and curious, and must be able to work successfully with global teams, including interacting with senior technical SW + HPC architects.
Bonus skills:
Any prior experience in KLA domains such as wafer inspection, coupled with programming in CUDA or AVX, will be a major plus. Additionally, any experience in optimizing large-scale signal processing or computer vision algorithms would also be a major plus.
Experience in FPGA programming, while not essential, will also be a major plus.
Experience in large-scale distributed HPC systems, proven experience in Docker and container orchestration, and any expertise in AI frameworks (TensorFlow) will also be welcome.
Finally, a strong background in modern C++ concepts (C++11 through C++17) and the STL would also be a way to stand out from the crowd.