Harry Thasarathan

Graduate Researcher

York University

I am a PhD student at York University and a member of CVIL, supervised by Kosta Derpanis. My research uses vision models as a testbed for developing interpretability methods that enable both scientific understanding and practical control of neural network representations.

Previously, I completed my Master’s with Kosta Derpanis and Marcus Brubaker, where I worked on long-term 3D human pose forecasting through keyframe-based motion representations. During my undergraduate studies at Ontario Tech, I explored structured generative models for 3D human locomotion with Faisal Qureshi and Ken Pu, and worked on temporal coherence in GANs for video colorization and super-resolution with Mehran Ebrahimi and Kamyar Nazeri.

Interests

  • Computer Vision
  • Interpretability
  • Generative Modelling
  • Scientific Discovery

Education

  • PhD Computer Science

    York University

  • MSc Computer Science, 2023

    York University

  • BSc Computer Science, 2021

    University of Ontario Institute of Technology

Publications

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment

We present a framework for discovering interpretable concepts that span multiple neural networks. We develop a single sparse autoencoder that processes activations from different models and reconstructs activations across them. Our method identifies semantically coherent and important universal concepts, ranging from low-level features such as colors and textures to higher-level structures such as parts and objects, enabling cross-model analysis.
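
A minimal sketch of the idea (an illustrative assumption, not the paper's implementation): per-model encoders map each network's activations into one shared sparse concept space, and per-model decoders reconstruct any model's activations from that space. All dimensions, the TopK sparsity level, and the model names below are hypothetical.

    # Hypothetical sketch of a universal sparse autoencoder: per-model encoders
    # and decoders around one shared sparse concept space. Dimensions, sparsity
    # level, and model names are illustrative assumptions.
    import torch
    import torch.nn as nn

    class UniversalSAE(nn.Module):
        def __init__(self, model_dims, n_concepts=4096, k=32):
            super().__init__()
            self.k = k  # number of active concepts kept per input (TopK sparsity)
            self.encoders = nn.ModuleDict(
                {name: nn.Linear(d, n_concepts) for name, d in model_dims.items()}
            )
            self.decoders = nn.ModuleDict(
                {name: nn.Linear(n_concepts, d) for name, d in model_dims.items()}
            )

        def encode(self, acts, source):
            z = torch.relu(self.encoders[source](acts))
            topk = torch.topk(z, self.k, dim=-1)
            # Zero out everything except the top-k concept activations.
            return torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)

        def forward(self, acts, source, target):
            # Encode with the source model's encoder, decode into the target
            # model's activation space (cross-model reconstruction).
            return self.decoders[target](self.encode(acts, source))

    # Example: align a 768-dim ViT layer with a 2048-dim ResNet layer.
    usae = UniversalSAE({"vit": 768, "resnet": 2048})
    vit_acts = torch.randn(8, 768)
    recon = usae(vit_acts, source="vit", target="resnet")  # shape (8, 2048)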

Artist-Guided Semiautomatic Animation Colorization

Continuing our work on automatic animation colorization, we extend our method to keep artists in the loop, simultaneously preserving artistic vision and reducing tedious animation workloads. By incorporating color hints and temporal information into an adversarial image-to-image framework, we show that artist input can strike a balance between automation and authenticity, generating colored frames with temporal consistency.
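
To illustrate the color-hint conditioning (a hedged sketch; the hint encoding below is an assumption rather than the paper's exact input format), artist clicks can be packaged as sparse RGB channels plus a binary mask and concatenated with the line art:

    # Hypothetical encoding of artist color hints as extra generator inputs:
    # RGB values at a few clicked pixels plus a binary mask channel.
    import torch

    def make_hint_channels(height, width, hints):
        # hints: list of (row, col, (r, g, b)) artist clicks with values in [0, 1].
        hint_rgb = torch.zeros(3, height, width)
        hint_mask = torch.zeros(1, height, width)
        for row, col, rgb in hints:
            hint_rgb[:, row, col] = torch.tensor(rgb)
            hint_mask[0, row, col] = 1.0
        return torch.cat([hint_rgb, hint_mask], dim=0)  # (4, H, W)

    # Line art (1 channel) plus hints (4 channels) form the generator input,
    # so the network can respect the artist's choices wherever they are given.
    line_art = torch.rand(1, 256, 256)
    hints = make_hint_channels(256, 256, [(40, 80, (0.9, 0.2, 0.2))])
    generator_input = torch.cat([line_art, hints], dim=0).unsqueeze(0)  # (1, 5, 256, 256)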

Automatic Temporally Coherent Video Colorization

Greyscale image colorization for applications in image restoration has seen significant improvements in recent years. Many of these learning-based techniques struggle to effectively colorize sparse inputs. With the consistent growth of the anime industry, the ability to colorize sparse inputs such as line art can eliminate the in-between frame colorization process, reducing significant cost and redundant work for production studios. Simply applying existing methods yields inconsistent colors between related frames, resulting in a flicker effect in the final video. To successfully automate key areas of large-scale anime production, the colorization of line art must be temporally consistent between frames. This paper proposes a method to colorize line art frames in an adversarial setting, producing temporally coherent anime video at scale by improving existing image-to-image translation methods. We show that by adding an extra condition to the generator and discriminator, we can effectively create temporally consistent video sequences from anime line art.
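
A rough sketch of the extra-condition idea (hypothetical layer sizes, not the paper's architecture): the generator and discriminator both receive the previous colored frame alongside the current line art, so temporally inconsistent outputs can be penalized.

    # Hypothetical sketch of conditioning a colorization GAN on the previous
    # frame for temporal consistency. Channel counts and layers are
    # illustrative assumptions, not the paper's exact networks.
    import torch
    import torch.nn as nn

    class ConditionedGenerator(nn.Module):
        def __init__(self):
            super().__init__()
            # Input: 1-channel line art + 3-channel previous colored frame.
            self.net = nn.Sequential(
                nn.Conv2d(1 + 3, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
            )

        def forward(self, line_art, prev_frame):
            return self.net(torch.cat([line_art, prev_frame], dim=1))

    class ConditionedDiscriminator(nn.Module):
        def __init__(self):
            super().__init__()
            # Judges (line art, previous frame, candidate frame) jointly, so
            # colors that drift between related frames are penalized.
            self.net = nn.Sequential(
                nn.Conv2d(1 + 3 + 3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 1, 4, stride=2, padding=1),
            )

        def forward(self, line_art, prev_frame, frame):
            return self.net(torch.cat([line_art, prev_frame, frame], dim=1))

    line_art = torch.randn(2, 1, 64, 64)
    prev_frame = torch.randn(2, 3, 64, 64)
    G, D = ConditionedGenerator(), ConditionedDiscriminator()
    fake = G(line_art, prev_frame)          # (2, 3, 64, 64)
    score = D(line_art, prev_frame, fake)   # patch-level real/fake scores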

Edge-Informed Single Image Super-Resolution

The widespread adoption of digital imaging technologies has brought with it a growing demand for higher-resolution images. We develop a novel edge-informed approach to single image super-resolution (SISR), reformulating the SISR problem as an image inpainting task. We use a two-stage inpainting model as a baseline for super-resolution and show its effectiveness for different scale factors (x2, x4, x8) compared to basic interpolation schemes. The model is trained with a joint optimization of image contents (texture and color) and structures (edges). We include quantitative and qualitative comparisons with current state-of-the-art techniques and show that decoupling structure and texture reconstruction improves the quality of the final reconstructed high-resolution image.
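
A hedged sketch of the two-stage, structure-then-texture pipeline (the modules and shapes below are placeholders, not the paper's networks): one model predicts a high-resolution edge map from a naively upsampled input, and a second model reconstructs color and texture conditioned on those edges.

    # Hypothetical sketch of an edge-informed two-stage SISR pipeline.
    # Architectures are placeholders; only the structure-then-texture staging
    # reflects the approach described above.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EdgeModel(nn.Module):
        """Stage 1: predict a high-resolution edge map from the upsampled input."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),
            )

        def forward(self, upsampled_lr):
            return self.net(upsampled_lr)

    class SRModel(nn.Module):
        """Stage 2: reconstruct texture and color, conditioned on the edge map."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3 + 1, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 3, 3, padding=1),
            )

        def forward(self, upsampled_lr, edges):
            return self.net(torch.cat([upsampled_lr, edges], dim=1))

    def super_resolve(lr, scale=4):
        # Treat SR as "inpainting" the missing high-frequency detail of a
        # naively upsampled image: structure (edges) first, then texture.
        up = F.interpolate(lr, scale_factor=scale, mode="bicubic", align_corners=False)
        edges = EdgeModel()(up)
        return SRModel()(up, edges)

    hr = super_resolve(torch.randn(1, 3, 32, 32), scale=4)  # (1, 3, 128, 128)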

Recent Posts

Stay Tuned!

Contact