Joonghyuk Shin

You can also call me Alex / 신중혁

joonghyuk AT snu.ac.kr

Hi! I am a Ph.D. student at SNU, advised by Professor Jaesik Park. I am interested in building fast and interactive generative models that can precisely simulate the dynamic world or serve as an interactive medium for creative uses. Currently, I am working closely with Xun Huang on video world models (i.e., causal video diffusion).

News

2026/01
Two papers (MotionStream and DRPOSE) were accepted to ICLR 2026!
2025/12
I will be working at a stealth startup from Feb. 2026 and at the NVIDIA Spatial Intelligence Lab from Summer 2026, both on improving video world models.
2025/11
MotionStream is on arXiv. Gave talks at Pika AI, Daydream AI, and a stealth startup.
2025/06
A paper about text-based image editing was accepted to ICCV 2025.
2024/10
I will be joining Adobe Research in San Francisco as an intern on Eli Shechtman's team in Summer 2025, followed by a research visit to CMU's Robotics Institute in Fall 2025, hosted by Jun-Yan Zhu.

Education

Seoul National University (SNU)
Sep. 2023 - Present, CSE, Integrated M.S. and Ph.D. (Adviser: Jaesik Park)

Pohang University of Science and Technology (POSTECH)
Feb. 2019 - Feb. 2023, CSE, B.S. (Summa Cum Laude)

Experience

NVIDIA (Spatial Intelligence Lab)
June 2026 - Dec. 2026, Research Scientist Intern / Santa Clara, CA (Planned)

Stealth Startup
Feb. 2026 - June 2026, Research Scientist Intern / Remote from Seoul
Working with Xun Huang on video world models

Carnegie Mellon University (Robotics Institute)
Oct. 2025 - Jan. 2026, Visiting Researcher (Jun-Yan's Lab) / Pittsburgh, PA
Worked with Ruihan Gao, Ava Pun, Wenzhen Yuan, and Jun-Yan Zhu on generative AI for accessibility

Adobe Research
June 2025 - Sep. 2025, Research Scientist Intern / San Francisco, CA
Worked with Xun Huang, Zhengqi Li, Richard Zhang, Jun-Yan Zhu, and Eli Shechtman on building fast and interactive video generative models (MotionStream)

Publications

* Equal contribution, † Equal advising

Direct Reward Fine-Tuning on Poses for Single Image to 3D Human in the Wild
Seunguk Do, Minwoo Huh, Joonghyuk Shin, Jaesik Park
ICLR 2026 - [Coming soon]
We introduce DRPOSE, a direct reward fine-tuning method that improves pose accuracy in single-view 3D human reconstruction, especially for dynamic and acrobatic poses, without requiring expensive 3D human assets.

MotionStream: Real-Time Video Generation with Interactive Motion Controls
Joonghyuk Shin, Zhengqi Li, Richard Zhang, Jun-Yan Zhu, Jaesik Park, Eli Shechtman, Xun Huang
ICLR 2026 - [Paper | Project Page | Code]
MotionStream is a streaming (causal, real-time, and long-duration) video generation system with motion controls, operating at ~30 FPS on a single H100 GPU, unlocking new possibilities for interactive content generation.

JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching
Mingi Kwon*, Joonghyuk Shin*, Jaeseok Jung, Jaesik Park†, Youngjung Uh
arXiv 2025 - [Paper | Project Page]
We present a unified framework that jointly generates synchronized facial motion and speech using flow matching and MM-DiT, enabling diverse audio-visual synthesis tasks within a single model.

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
Joonghyuk Shin, Alchan Hwang, Yujin Kim, Daneul Kim, Jaesik Park
ICCV 2025 - [Paper | Project Page | Code]
We perform a systematic analysis of MM-DiT's bidirectional attention mechanism and introduce a robust prompt-based editing method that works across diverse MM-DiT models (SD3 series and Flux).

InstantDrag: Improving Interactivity in Drag-based Image Editing
Joonghyuk Shin, Daehyeon Choi, Jaesik Park
SIGGRAPH Asia 2024 - [Paper | Project Page | Code (230+)]
We present InstantDrag, an optimization-free pipeline for fast, interactive drag-based image editing that requires only an image and a drag instruction as input, learning from real-world video datasets.

Fill-Up: Balancing Long-Tailed Data with Generative Models
Joonghyuk Shin, Minguk Kang, Jaesik Park
arXiv 2023 - [Paper | Project Page]
We propose a two-stage method for long-tailed (LT) recognition using textual-inverted tokens to synthesize images, achieving SOTA results on standard benchmarks when trained from scratch.

StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis
Minguk Kang, Joonghyuk Shin, Jaesik Park
TPAMI 2023 - [Paper | Code (3500+)]
We present StudioGAN, a comprehensive library for GANs that reproduces over 30 popular models, providing extensive benchmarks and a fair evaluation protocol for image synthesis tasks.

Personal

I am a big fan of baseball. I played for the POSTECH baseball team (Tachyons) for five years, as captain and catcher.
I love animals. I live with a dog named Poby. I also like Pokemon, travelling, and FIFA video games.
Personal photos

Last updated in Jan. 2026 · with Face Looker