Tianyu Cao

MS student in MIIS@CMU

Carnegie Mellon University


My name is Tianyu Cao (曹田雨). I'm currently a graduate student at Carnegie Mellon University, pursuing the Master of Science in Intelligent Information Systems (MIIS) degree at the Language Technologies Institute (LTI) in the School of Computer Science (SCS). My research interests lie in natural language processing, especially the evaluation of large language models (LLMs).

Previously, I obtained my Bachelor's degree in Computer Science and Technology from Chu Kochen Honors College, Zhejiang University, where I was supervised by Prof. Yang Yang. I also had a wonderful research experience in the CHAI lab with Prof. Chenhao Tan at UChicago.

I'm looking for a machine learning internship for summer 2025!

  • Natural Language Processing
  • Large Language Model Evaluation
  • Human-centered AI

  • MS in MIIS@SCS, 2024 - Present

    Carnegie Mellon University

  • Undergraduate in Computer Science and Technology, 2020 - 2024

    Zhejiang University


  • 2024.07: 🎉🎉 My paper on multimodal long-form summarization was accepted to COLM 2024. I am the first author!
  • 2024.06: 🎓 I graduated as an Outstanding Graduate from CKC Honors College, Zhejiang University.
  • 2024.02: 🎉🎉 I was admitted to MIIS@CMU for Fall 2024.


Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports
In the 1st Conference on Language Modeling (COLM 2024)

Research and Experience

Research Intern
Advisor: Chenhao Tan, University of Chicago
Jul 2023 – Apr 2024 Chicago

Characterizing Multimodal Long-form Summarization by LLMs.

  • Developed an evaluation framework for multimodal long-form summarization on both numerical values and sentences.
  • Examined the extractiveness of the summary and assessed the distribution of source content within the summary.
  • Evaluated numerical hallucinations with a novel human evaluation protocol that accounts for the multimodal origin of the source content.

Research Assistant
Advisor: Yang Yang, Zhejiang University
Sep 2022 – Sep 2023 Hangzhou

Context-aware Consistency Learning Framework for Segmented Time Series Classification.

  • Studied raw time series data with Multiple classes with Varying Duration (MVD) for segmented time series classification (TSC).
  • Introduced contextual prior knowledge of data locality and label coherence to guide the model toward the most helpful contextual information.
  • Proposed Con4m, a label consistency learning framework that progressively harmonizes inconsistent labels during training.

Research Training Project
Advisor: Daoxin Dai, Zhejiang University
Mar 2022 – Dec 2022 Hangzhou

Research on neural network-based inverse design of silicon-based passive optical devices.

  • Used a Forward Neural Network (FNN) to replace traditional EMF simulation.
  • Built a tandem network (TandemNet) that combines a forward model and an inverse model in a tandem structure to address the non-uniqueness problem of inverse design and the black-box problem.


Outstanding Graduate of Zhejiang University
First-prize Scholarship of Zhejiang University
Leading Scholarship of Chu Kochen Honors College
Five-star Outstanding Volunteer



Python, C/C++, Java, HTML




Volleyball, Swimming, Movies
