Edwin Mascarenhas is a Deep Learning Architect at NVIDIA, where he applies his expertise in high-performance computing systems to deep learning workloads. With a background in performance engineering, computer architecture, and compiler optimizations, he works on improving the efficiency and scalability of NVIDIA's GPU offerings. His current role centers on detailed performance analysis of deep learning workloads, with a focus on optimizing GEMM (General Matrix Multiply) operations, which are critical for training and inference in large language models (LLMs).
One of Edwin's key projects involves detailed ablation studies that quantify the impact of individual GPU features on end-to-end LLM inference performance. By analyzing tiny GEMM overheads and projecting performance for future datacenter GPUs, he helps shape the architecture that supports next-generation AI applications. His work correlating GEMM microbenchmarks with performance simulators improves the understanding of GPU capabilities and informs decisions about future hardware development.
Edwin's proficiency in programming languages such as C, C++, and Python, combined with his experience in GPGPU and parallel computing, enables him to write high-performance code that makes full use of NVIDIA's hardware. His passion for building fast computing systems shows in his deep learning performance work, which supports NVIDIA's mission of advancing AI and machine learning capabilities across industries.