Zhairui Shen
Email: me@szr.hk | Phone: +1-267-626-5377 | Website: https://szr.hk
Education
Pennsylvania State University — University Park, PA, USA
2025.08 – present
Jiangsu University & Arcadia University — China & USA
2021.08 – 2025.05
Background
Languages / Frameworks: Python, Java, C++, Vue.js, Dart, JavaScript/TypeScript, PHP, SQL
LLM Engineering:
- Training: SFT, MoE, context optimization
- System Architecture: RAG, LangChain, agent workflow design, API integration
Software / Web Systems Development:
- Front-end: Vue.js (UI/UX), Flutter (Dart)
- Back-end: Node.js, PHP, MySQL; Redis sessions; JWT/SSO authentication; 2FA/MFA
- Network & System: Linux ops, HTTP routing & reverse proxy, CDN, Redis caching, WebSocket, resumable transfer, AES-256, Postfix/Dovecot (DKIM/SPF), SSL/TLS automation
Internship Experience
Alibaba Network Technology Co., Ltd — Nanjing, China
LLM Optimization & Training Systems | 2024.07 – 2024.08
- Collected and curated a large-scale creative-text dataset; performed data cleaning, tokenization, and normalization for SFT.
- Fine-tuned Qwen2-7B for long-text generation; experimented with MoE to improve parameter efficiency.
- Redesigned pipeline with sharding, caching, FP16 mixed-precision, dynamic batching → +12% validation accuracy, –9% epoch time vs. baseline.
- Deployed the fine-tuned model via API-based inference with real-time generation and feedback logging.
Hikvision Digital Technology Co., Ltd — Hangzhou, China
Computer Vision Internship | 2025.06 – 2025.08
- Developed and tested real-time object detection using YOLOv10 + OpenCV; integrated camera streaming and on-device inference.
- Optimized speed and accuracy via GPU parallelization & data augmentation for challenging lighting/occlusion.
Research Projects
Multimodal LLM Research — Retrieval-Augmented Multimodal Systems & Evaluation
2025.09 – present
- Building a RAG-based multimodal pipeline on Qwen2.5-VL-7B with a Northwestern research team for long-context, cross-modal use.
- Integrated verL (retrieval), LLaMA-Factory (SFT), and lmms-eval (benchmark) into a reproducible train→eval→analysis flow.
- Designed baselines/ablations (RAG vs. non-RAG) and tested context optimization; supporting paper preparation.
LLM & RAG for Security & Privacy
2024.12 – 2025.02
- Designed a RAG pipeline for mental-health app privacy policies with dense-vector retrieval + contextual re-ranking.
- Fine-tuned & evaluated multiple LLMs on curated datasets to improve compliance Q&A accuracy.
- Applied prompt engineering & context optimization to balance accuracy, latency, and cost; improved multilingual robustness.
- Results published at ASEE 2025 Annual Conference & Exposition.
Optimization for Text-to-Image & Text-to-Video Generation — AI Theoretical Methods
2024.08 – 2024.09
- Reproduced/extended Diffusion Transformer methods (Stable Diffusion baseline).
- Implemented adaptive multi-head attention and improved positional encoding + LoRA FP16 fine-tuning.
- Achieved +6.3% CLIP score and –4.7 FID with mixed-precision & CUDA-level optimization.
- Abstract published in JCSC; poster presented at CCSC-E 2024.
Applied Projects
Cloud Platform / Ecosystem — https://szr.hk
Systems / Network & Full-Stack Development
- Designed and deployed an end-to-end platform (productivity, AI, authentication, communications).
- Stack: Linux, Nginx (reverse proxy), TLS/SSL automation, CDN caching; Node.js/PHP/MySQL backend; Vue.js frontend.
- Redis for caching & sessions; JWT/SSO unified identity; RBAC; WebSocket; resumable transfer; AES-256.
- Modules:
  - Office Suite & Cloud Drive (PHP/MySQL/Vue) across devices.
  - AI Services: fine-tuned LLMs + PDF-OCR for image-embedded text extraction & analysis.
  - Email Service: Postfix/Dovecot with SMTP/IMAP, DKIM/SPF; web & mobile clients.
  - Unified Account: JWT/SSO + Redis, single sign-on across sub-platforms.
- Open-sourced distribution: GitHub Repository
Delivery App Development — GitHub
Software Engineering
- Flutter (Dart) cross-platform app with secure backend (Firebase Auth, Cloudflare DB, REST, Google Maps).
- Implemented RBAC, authentication, and robust data handling for drivers/admins/recipients; offline caching for unstable networks.
- Developed for the non-profit “Caring for Friends” to coordinate food delivery in Philadelphia.
C++ Game Development
Software Engineering / Game Programming
- C++/SDL2 implementation: two-player controls, animations, collision-based hit detection, dynamic HP bars.
- Event-driven game loop with state/timing control; OOP + efficient memory management for responsiveness.
Publications
- H. Sun, W. Tang, L. Deng, X. You, Z. Shen, X. Sun, J. Zou, F. Lin, Z. Qian, H. Liu, “Development and Validation of Interpretable Machine Learning Models Incorporating Paraspinal Muscle Quality to Predict Cage Subsidence Risk Following Posterior Lumbar Interbody Fusion”, Spine (Phila Pa 1976), vol. 50, no. 20, pp. 1375–1385, Oct 2025. doi: 10.1097/BRS.0000000000005388
- X. You, W. Wang, Z. Shen, Y. Jia, “From Data Trends to Privacy Insights in Mental Health Apps: an LLM-Powered Approach”, Proc. ASEE Annual Conference & Exposition, June 2025. doi: 10.18260/1-2-56606
- Z. Shen, T. Wang, V. Ford, “Exploring the Architecture and Application of Transformer Models in NLP and Media Generation”, JCSC, vol. 40, no. 3, pp. 47–48, Oct 2024; poster, CCSC-E 2024.
Honors and Awards
- Top 1% Team Recognition, National Cyber League (NCL) — Oct 2023
- Honorable Mention, Mathematical Contest in Modeling (MCM) — Feb 2024
- Top 5% Team Recognition, National Cyber League (NCL) — Oct 2024
- Top 10% Team Recognition, Computing Sciences in Colleges Programming Contest — Oct 2024
- Top 5% Team Recognition, International Collegiate Programming Contest (ICPC) — Nov 2024