Greg Pauloski

Computer Scientist // Software Engineer

email | github | linkedin | scholar

Hello there! I am a fifth-year Ph.D. student in Computer Science at the University of Chicago interested in high-performance computing, distributed systems, and deep learning frameworks. I am a member of Globus Labs, where I am co-advised by Ian Foster and Kyle Chard. I completed my Bachelor's in Computer Science at the University of Texas at Austin and previously worked at Apple, Google, and the Texas Advanced Computing Center.

🎉 I am on the job market! Seeking full-time opportunities post-graduation (Spring/Summer 2025).

RESEARCH

Distributed Systems: Modern computational science experiments are increasingly written as coupled sets of many distinct software components coordinated by a central workflow system. We are designing new programming models that decouple communication from application design, enabling different data movement methods depending on where data are moved, what data are moved, and when they are moved.
Scalable Deep Learning: We are exploring new techniques for improving deep learning training time and scalability by (1) exploiting scalable algorithms for second-order information approximation; (2) developing methods for adapting to different computer hardware by tuning computation and communication to maximize training speed; (3) exploring compression techniques to reduce communication overheads; and (4) enabling complex, hierarchical federated learning across diverse ecosystems of hardware.
AI for Science: We are (1) training large (billion+ parameter) transformer-based language models on broad scientific literature to automate knowledge extraction; (2) developing frameworks for coupling AI and simulations on exascale supercomputers; and (3) building innovative and large-scale solutions to scientific challenges in genome evolution, next-generation battery design, and carbon capture.

DISSERTATIONS

Accelerating Communications in High-Performance Scientific Workflows [Apr 2025]
Abstract | Poster | Doctoral Dissertation (Work-in-Progress)

Scalable Deep Neural Network Training with Distributed K-FAC [Mar 2022]
Abstract | PDF | Slides | Master's Thesis

PROJECTS

Check out all of my projects on GitHub.

ProxyStore: Pass-by-reference semantics for distributed Python applications [Code]
K-FAC: Distributed PyTorch K-FAC gradient preconditioner [Code]
TaPS: Benchmarking suite for distributed/parallel task executors [Code]
LLM Training: Tools and scripts for large language model training [Code]
Colmena: Steering large campaigns of simulations on HPC with AI [Code]
3pseatBot: A hobby Discord bot [Code]

SELECTED PUBLICATIONS

Ordered by most recent.

TaPS: A Performance Evaluation Suite for Task-based Execution Frameworks [Sep 2024]
J. Gregory Pauloski, Valerie Hayot-Sasson, Maxime Gonthier, Nathaniel Hudson, Haochen Pan, Sicheng Zhou, Ian Foster, Kyle Chard
eScience 2024 (Best Paper)
TLDR | PDF | Website | Code | Slides | Publication | BibTex

Object Proxy Patterns for Accelerating Distributed Applications [Jul 2024]
J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Alexander Brace, André Bauer, Kyle Chard, Ian Foster
arXiv Preprint
TLDR | PDF | Website | Code | Preprint | BibTex

Accelerating Communications in Federated Applications with Transparent Object Proxies [Nov 2023]
J. Gregory Pauloski, Valerie Hayot-Sasson, Logan Ward, Nathaniel Hudson, Charlie Sabino, Matt Baughman, Kyle Chard, Ian Foster
SC 2023
TLDR | PDF | Website | Code | Poster | Slides | Publication | BibTex

Deep Neural Network Training With Distributed K-FAC [Mar 2022]
J. Gregory Pauloski, Lei Huang, Weijia Xu, Kyle Chard, Ian Foster, Zhao Zhang
TPDS 2022
TLDR | PDF | Code | Publication | BibTex

KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks [Nov 2021]
J. Gregory Pauloski, Qi Huang, Lei Huang, Shivaram Venkataraman, Kyle Chard, Ian Foster, Zhao Zhang
SC 2021
TLDR | PDF | Code | Slides | Publication | BibTex

Convolutional Neural Network Training with Distributed K-FAC [Nov 2020]
J. Gregory Pauloski, Zhao Zhang, Lei Huang, Weijia Xu, Ian Foster
SC 2020
TLDR | PDF | Code | Slides | Publication | BibTex

ALL PUBLICATIONS

Ordered by most recent and grouped by topic. BibTeX file available for download here.

DISTRIBUTED SYSTEMS
Sep 2024 TaPS: A Performance Evaluation Suite for Task-based Execution Frameworks
TLDR | PDF | Authors | Website | Code | Slides | Publication | BibTex | eScience 2024 (Best Paper)
Jul 2024 Object Proxy Patterns for Accelerating Distributed Applications
TLDR | PDF | Authors | Website | Code | Preprint | BibTex | arXiv Preprint
Nov 2023 Accelerating Communications in Federated Applications with Transparent Object Proxies
TLDR | PDF | Authors | Website | Code | Poster | Slides | Publication | BibTex | SC 2023

SCALABLE DEEP LEARNING
Sep 2024 Flight: A FaaS-Based Framework for Complex and Hierarchical Federated Learning
TLDR | PDF | Authors | Code | Preprint | BibTex | arXiv Preprint
Dec 2023 Trillion Parameter AI Serving Infrastructure for Scientific Discovery: A Survey and Vision
TLDR | PDF | Authors | Publication | BibTex | BDCAT 2023
Mar 2022 Deep Neural Network Training With Distributed K-FAC
TLDR | PDF | Authors | Code | Publication | BibTex | TPDS 2022
Nov 2021 KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks
TLDR | PDF | Authors | Code | Slides | Publication | BibTex | SC 2021
Nov 2020 Convolutional Neural Network Training with Distributed K-FAC
TLDR | PDF | Authors | Code | Slides | Publication | BibTex | SC 2020
May 2020 Efficient I/O for Neural Network Training with Compressed Data
TLDR | PDF | Authors | Code | Publication | BibTex | IPDPS 2020
Dec 2019 Aggregating Local Storage for Scalable Deep Learning I/O
TLDR | PDF | Authors | Code | Publication | BibTex | DLS 2019

AI FOR SCIENCE
Oct 2024 Employing Artificial Intelligence to Steer Exascale Workflows with Colmena
TLDR | PDF | Authors | Website | Code | Publication | BibTex | IJHPCA 2024
Nov 2023 DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
TLDR | PDF | Authors | Website | Preprint | BibTex | arXiv Preprint
May 2023 The Diminishing Returns of Masked Language Models to Science
TLDR | PDF | Authors | Website | Publication | BibTex | Findings of the Association for Computational Linguistics: ACL 2023
Mar 2023 Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources
TLDR | PDF | Authors | Code | Publication | BibTex | HCW @ IPDPS 2023
Oct 2022 GenSLMs: Genome-scale Language Models Reveal SARS-CoV-2 Evolutionary Dynamics
TLDR | PDF | Authors | Code | Publication | BibTex | IJHPCA (ACM Gordon Bell Special Prize for COVID-19 Research)
Nov 2021 Colmena: Scalable Machine-Learning-Based Steering of Ensemble Simulations for High Performance Computing
TLDR | PDF | Authors | Website | Code | Publication | BibTex | MLHPC @ SC 2021
Aug 2021 Models and Processes to Extract Drug-like Molecules From Natural Language Text
TLDR | PDF | Authors | Publication | BibTex | Frontiers in Molecular Biosciences
Nov 2018 Glioma Segmentation and a Simple Accurate Model for Overall Survival Prediction
TLDR | PDF | Authors | Publication | BibTex | BrainLes 2018

PRESENTATIONS

Ordered by most recent.

Sep 2024 TaPS: A Performance Evaluation Suite for Task-based Execution Frameworks
Slides | IEEE International Conference on eScience (eScience)
Nov 2023 Accelerating Communications in Federated Applications with Transparent Object Proxies
Slides | Supercomputing
Oct 2023 ProxyStore: Decoupling Control and Data Flow in Workflows
Slides | Video | ParslFest
Apr 2023 Accelerating Communications in Federated Applications with Transparent Object Proxies
Poster | Greater Chicago Area Systems Research Workshop (GCASR)
Sep 2022 ProxyStore: a Data Fabric for Parsl and FuncX
Slides | Video | ParslFest
Nov 2021 KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks
Slides | Supercomputing
Nov 2020 Convolutional Neural Network Training with Distributed K-FAC
Slides | Supercomputing
Sep 2018 Optimizing Deep Learning Methods for Image Segmentation with Distributed Training
Poster | TACC Symposium for Texas Researchers (TACCSTER)