cv
Basics
Name | Lei Huang |
Email | leih5@illinois.edu |
Education
-
2025.08 - 2027.05 -
2021.09 - 2025.07
Work
-
2025.08 - Present Research Assistant
University of Illinois Urbana-Champaign
Scaling αβ-CROWN toward parallel, distributed, high-performance formal verification by applying HPC techniques in Python/PyTorch.
- Parallelization and performance engineering of αβ-CROWN
- Leveraged HPC strategies for PyTorch-based verifier workloads
-
2025.07 - 2025.08 ML System Engineer
Research Startup
Built end-to-end LLM pretraining stack: data production, distributed training, and evaluation.
- Designed a data production cluster capable of ~10M tokens/sec
- Set up NVIDIA Megatron-based distributed training
- Produced ~1T tokens and trained an 890M-parameter model on 16×H100 in ~14 days
-
2024.12 - 2025.07 Quant Systems Engineer
Sixie Capital
Supported quant research with GPU/CUDA/PyTorch; built distributed ML and HPC storage infra.
- Implemented multinode multi-GPU training with one-click orchestration
- Automated BeeGFS over RDMA InfiniBand deployment for HPC storage
- Shipped a meta-rule-based market data validation system that prevented multiple incidents
-
2023.12 - 2024.12 Research Assistant
National University of Singapore (NUS)
Developed CUDA implementations of algorithms with large speedups.
- Delivered CUDA kernels/implementations achieving ~1000× speedups over CPU
Awards
- 2023.05
ISC'23 Student Cluster Competition — Third Place
International Supercomputing Conference (ISC)
Compiled, optimized, and analyzed fluid simulation workloads on FAU & Bridges-2 supercomputers.
- 2023.11
SC'23 Student Cluster Competition — Seventh Place & Outstanding Reproducibility Report
Supercomputing Conference (SC)
48-hour continuous run analyzing large-scale matrix decompositions; reproduced key results.
- 2021.11
ICPC Asia Regional — 3× Silver Medals
ICPC
Team captain and core member; solved 7 problems within 5-hour contests.
Publications
-
2025.02.01 Verification of Bit-Flip Attacks against Quantized Neural Networks
OOPSLA 2025 (CCF-A)
Verification of bit-flip attacks on quantized neural networks (QNNs).
Projects
-
BFAVerifier — CUDA-Accelerated Formal Verifier for Bit-Flip Attacks
- C++, CUDA, Gurobi
- Implemented SymPoly verification algorithm on GPU
-
LBM — Fluid Simulation Optimized for Microarchitectural Features
- C, OpenMP, SSE2, AVX2
- Microarchitecture-aware and low-level performance engineering
Languages
Chinese | Native |
English | Fluent |
Skills
Programming | |
Modern C/C++ | |
Modern Python/PyTorch | |
CUDA *Not limited to any language |
Systems | |
High Performance Computing | |
(Linux) System Programming | |
Distributed/Parallel Programming |
Speaking | |
Chinese (Native) | |
English (Fluent) |