Max Ehrlich

I am a Research Scientist at NVIDIA hardware engineering and an Adjunct Assistant Professor at the University of Maryland Computer Science Department and UMIACS.

My current research combines machine learning and computational imaging to solve real problems. My focus is on breaking down and understanding the first principles of the problem and then building these principles back up into a machine learning solution rather than treating the model as a black box.

In the past I have successfully applied this idea to image enhancement. The broader impact of this is to improve participation from underrepresented groups. For example, by creating better multimedia compression algorithms which incorporate simple deep-learning-based techniques, people operating in underinvested locations (e.g., rural areas, Native American reservations, 3rd world countries) are able to participate in an increasingly media-focused internet. I am grateful that many funding partners have recognized the importance of this work over the years, including government agencies (DARPA and IARPA) and private companies (Facebook AI, Adobe DIL, and NVIDIA ADLR, where I currently work).

I received my Ph.D. in Computer Science from the University of Maryland, where I was co-advised by Professor Larry Davis and Professor Abhinav Shrivastava. I received an M.S. in Computer Science from Stevens Institute of Technology, where I was advised by Professor Philippos Mordohai, and a B.S. in Computer Science from Rutgers University.


I am a Member of
Association for the Advancement of Artificial Intelligence (AAAI)
Institute of Electrical and Electronics Engineers (IEEE)
Computer Vision Foundation (CVF)

3/24 - I am co-organizing a CVPR 2024 workshop on Implicit Neural Representations for Vision. Please submit your papers!
2/24 - Our paper on explaining INRs, XINC, was accepted to CVPR 2024.
1/24 - A preprint of our paper on explaining INRs, XINC, is now available on arXiv.
10/23 - MetaBit, our algorithm for generalized correction of compressed videos, was accepted to WACV 2024.
8/23 - Our paper on frequency analysis of adversarial examples was accepted to BMVC 2023.
5/23 - Appointed as an Adjunct Assistant Professor at UMD.
2/23 - Our paper on neural compression with implicit representations was accepted to CVPR 2023.
11/22 - Awarded the Larry S. Davis Doctoral Dissertation Award.


Service

Conference Reviewer: AAAI 2020, ICLR 2020, ECCV {2020-2024}, IJCAI 2021, CVPR {2021-2024}, ICML 2021, ICCV {2021-2023}, WACV {2022-2024}

Journal Reviewer: Transactions on Image Processing (TIP), International Journal of Artificial Intelligence (IJAI), The Visual Computer (TVCJ), Transactions on Circuits and Systems for Video Technology (TCSVT), IEEE Access, Digital Signal Processing

Contact

Contact me by email at

mehrlich {at} nvidia {dot} com

or stop by my office: IRB 4248

                  -----BEGIN PGP PUBLIC KEY BLOCK-----

                  xjMEY0WaBxYJKwYBBAHaRw8BAQdAtvqQTAbvje5tZNvm+oPAwn5nXqFHmo9g
                  yqYE/b4dV9LNN3F1ZXVlY3VtYmVyQHByb3Rvbm1haWwuY29tIDxxdWV1ZWN1
                  bWJlckBwcm90b25tYWlsLmNvbT7CjAQQFgoAHQUCY0WaBwQLCQcIAxUICgQW
                  AAIBAhkBAhsDAh4BACEJECvGNJpqUJGvFiEErWWIAZf4JoLBVB79K8Y0mmpQ
                  ka/QAgD+KG1R6ELfNmn1MbD2slNBcBO/R/ZQEfpTTVEsnccRQFcA/j12pH2Z
                  Ys34JfbZL9k2r/epWw2VgsqcsPY+Gls6eWwCzjgEY0WaBxIKKwYBBAGXVQEF
                  AQEHQI5GnYGeCOCFL2i8ZZXeuv5fuvK7oY9Jo9EQm9K7FAM3AwEIB8J4BBgW
                  CAAJBQJjRZoHAhsMACEJECvGNJpqUJGvFiEErWWIAZf4JoLBVB79K8Y0mmpQ
                  ka/gIAD/XpQGOrfg2DMl998n/Y8Ak6FVJrOga8rHLc/Y1y8R1rcA/1TA4MbR
                  wY69LacJYVo91FCZyYFhqswPDOxxOZtzqYUI
                  =52WD
                  -----END PGP PUBLIC KEY BLOCK-----
                

Students

My Research

My research emphasizes broad impact and collaboration with outside agencies. Aside from these research programs, I have participated in many other published research projects. Please see my full list of papers and patents below for more information.

Video Compression

Video sharing is increasingly popular and quickly becoming the primary method for interaction on the internet. With the global pandemic, video conferencing has become mandatory for many people to work or attend school. This causes major problems for people who lack a broadband connection. In this ongoing paper series on video compression, I am developing ways to incorporate deep learning models which run on commodity hardware and can be used in the near term. This research is conducted in collaboration with NVIDIA.

JPEG Compression

JPEG compression is the most popular image compression algorithm and currently powers image sharing on the internet and mobile phones. In this paper series on JPEG compression, I advanced theoretical knowledge about the interaction between JPEG compression and deep learning and used these theoretical results to improve the fidelity of JPEG images both for human and machine consumption. This research was primarily funded by a three-year academic grant awarded to me by Facebook (Meta) AI, allowing me to work autonomously, and led to collaborations with Facebook.

Remote Sensing

In this program, we developed novel methods for improving land cover segmentation in satellite images. This is a challenging and important problem with wide application from national defense to planning and surveying. This research was funded by the IARPA Core3D program.

Video Compression

MetaBit: Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement - The current state of deep learning for video compression is far behind classical compression. While deep-learning-based codecs can generate beautiful images, they are slow and require significant hardware resources and software packages to run. In this work, we show that by simply using commodity H.264 compression, the most popular video compression algorithm, along with a lightweight deep model for restoration, we can match or outperform fully deep-learning-based codecs. Furthermore, we can leverage prior knowledge of classical video compression to make this process extremely efficient. This has the advantages of fast encoding with no custom hardware requirements and a fully decodable stream for consumers who lack the custom software package. This work was published in WACV 2024.
NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling - Implicit Neural Representations (INRs) have recently been shown to be a powerful tool for high-quality video compression. However, existing works are limited as they do not explicitly exploit the temporal redundancy in videos, leading to a long encoding time. Additionally, these methods have fixed architectures which do not scale to longer videos or higher resolutions. To address these issues, we propose NIRVANA, which treats videos as groups of frames and fits separate networks to each group, performing patch-wise prediction. The video representation is modeled autoregressively, with networks fit on a current group initialized using weights from the previous group's model. This method was published in CVPR 2023.

JPEG Compression

CRAB: Compression Robustness Analysis Benchmark - The most comprehensive study of the effect of JPEG compression to date! The CRAB system allows for fast, easy, and consistent benchmarking of deep learning methods when their inputs are JPEG compressed, and of how they behave under various JPEG mitigation techniques, including a new one we developed that is entirely self-supervised. We used CRAB to benchmark 20 commonly used models across three tasks: classification, detection, and segmentation (instance and semantic). Stay tuned for the CRAB code release, which will allow researchers to benchmark their own models and submit the results for inclusion in the study, as well as the study website detailing our findings. In the meantime, check out our preprint on arXiv (https://arxiv.org/abs/2011.08932), which contains details of the study as well as the complete results. This work was published in the MELEX workshop at the International Conference on Computer Vision.
Quantization Guided JPEG Artifact Correction - We develop a novel method for JPEG artifact correction that solves three major problems left open in prior works:
  1. Prior works train an ensemble of models, one for each JPEG quality. We use a single network parameterized by the JPEG quantization matrix.
  2. Prior works deal with grayscale images only, with the assumption that their models can be applied channel-wise. We show that single-channel networks have trouble generalizing and design a network for color correction.
  3. Prior works focus on CNN regression, which causes blurry and textureless results. We introduce a novel GAN loss that includes an explicit texture-restoring term, which yields a more realistic result.
Our method achieves state-of-the-art results on color artifact correction. The paper was published in the proceedings of the European Conference on Computer Vision. I strongly recommend reading the arXiv version which includes the appendices.
JPEG Domain Residual Networks - In this work we develop the popular Residual Network architecture in the JPEG domain. Our goal is to produce a formulation which gives a result that is as close as possible to the spatial domain network, but which can operate on compressed JPEG images. Our formulation is generic and has applicability outside of the classification objective that we show as an example. We show a notable performance increase by processing in the JPEG domain. This work was funded by Facebook and published in the proceedings of the International Conference on Computer Vision 2019.

Remote Sensing

Unsupervised Super-Resolution of Satellite Imagery for High Fidelity Material Label Transfer - One major outstanding problem for training deep networks for ground material segmentation is the lack of large amounts of high-quality data. In this paper we present a method for super-resolving low-resolution satellite images in a way which preserves important properties of their semantic material labels. This allows low-resolution images, which are plentiful, to be mixed in with high-resolution images to improve the performance of existing ground material segmentation methods. This paper was published in the Proceedings of the International Geoscience and Remote Sensing Symposium.
Stacked U-Nets for Ground Material Segmentation in Remote Sensing Imagery - We develop a novel method for ground material segmentation from satellite images. This method relies on the recent Stacked, Dilated U-Nets method, which has good results and is efficient on the large images that remote sensing generates. We also propose a novel frequency-domain post-processing step which reduces spurious artifacts generated by the deep model. Our method won 3rd place in the 2018 CVPR DeepGlobe challenge and was published in the CVPR proceedings.

Full List of Papers and Patents

Google Scholar dblp

2024

Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics
Shishira R Maiya, Anubhav Gupta, Matthew Gwilliam, Max Ehrlich, Abhinav Shrivastava
In ECCV
arXiv Cite It!
Wolf: Captioning Everything with a World Summarization Framework
Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich et al.
arXiv Preprint
arXiv Cite It!
Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions
Namitha Padmanabhan, Matthew Gwilliam, Pulkit Kumar, Shishira R Maiya, Max Ehrlich, Abhinav Shrivastava
In CVPR
arXiv Cite It!
Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement
Max Ehrlich, Jon Barker, Namitha Padmanabhan, Larry S. Davis, Andrew Tao, Bryan Catanzaro, Abhinav Shrivastava
In WACV
arXiv Cite It!

2023

Unifying the Harmonic Analysis of Adversarial Attacks and Robustness
Shishira R. Maiya, Max Ehrlich, Vatsal Agarwal, Ser-Nam Lim, Tom Goldstein, Abhinav Shrivastava
In BMVC
arXiv Cite It!
NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling
Shishira R Maiya*, Sharath Girish*, Max Ehrlich, Hanyu Wang, Kwot Sin Lee, Patrick Poirson, Pengxiang Wu, Chen Wang, Abhinav Shrivastava
In CVPR
arXiv Cite It!

2022

ReLaX: Retinal Layer Attribution for Guided Explanations of Automated Optical Coherence Tomography Classification
Evan Wen, Rebecca Sorenson, Max Ehrlich
In ECCV Medical Computer Vision Workshop
arXiv Cite It!
The First Principles of Deep Learning and Compression
Max Ehrlich
Doctoral Dissertation, University of Maryland College Park
arXiv Cite It!

2021

Analysing and Mitigating JPEG Compression Defects in Deep Learning
Max Ehrlich, Larry Davis, Ser-Nam Lim, Abhinav Shrivastava
In ICCV MELEX Workshop
arXiv CVF Cite It!

2020

Quantization Guided JPEG Artifact Correction
Max Ehrlich, Larry Davis, Ser-Nam Lim, Abhinav Shrivastava
In ECCV
arXiv ECVA Cite It!

2019

Unsupervised Super-Resolution of Satellite Imagery for High Fidelity Material Label Transfer
Arthita Ghosh, Max Ehrlich, Larry Davis, Rama Chellappa
In IGARSS
arXiv IEEE Cite It!
Deep Residual Learning in the JPEG Transform Domain
Max Ehrlich and Larry S. Davis
In ICCV
arXiv CVF Cite It!
Deep Multi-Task Representation Learning.
Mohamed R. Amer, Timothy J. Shields, Amir Tamrakar, Max Ehrlich, Timur Almaev
U.S. Patent Application 16/085,859
Google Cite It!

2018

Stacked U-Nets for Ground Material Segmentation in Remote Sensing Imagery.
Arthita Ghosh, Max Ehrlich, Sohil Shah, Larry Davis, Rama Chellappa
In CVPR Workshops
CVF Cite It!

2017

Action-Affect-Gender Classification using Multi-Task Representation Learning
Timothy J. Shields, Mohamed R. Amer, Max Ehrlich, Amir Tamrakar
In CVPR Workshops
CVF Cite It!

2016

Discriminative Hand Localization in Depth Images.
Max Ehrlich and Philippos Mordohai
In 3DUI
IEEE Direct Cite It!
Facial Attributes Classification using Multi-Task Representation Learning.
Max Ehrlich, Timothy J. Shields, Timur Almaev, Mohamed R. Amer
In CVPR Workshops
CVF Cite It!

2015

Discriminative Hand Tracking from Depth Images.
Max Ehrlich
Master's Thesis, Stevens Institute of Technology
Direct