Bora Kargi

Research Engineer @ ELLIS Institute Tübingen (OpenEuroLLM)

Tübingen, Germany

kargibora gmail.com

bora.kargi tue.ellis.eu

I am a Research Engineer at the ELLIS Institute Tübingen, where I work on the OpenEuroLLM project, focusing on the evaluation of large language models.

I hold an MSc in Machine Learning from the University of Tübingen and a BSc in Computer Engineering from Middle East Technical University (METU). During my master’s, I worked as a research assistant (HiWi) in Prof. Andreas Geiger’s Autonomous Vision Group, contributing to Scholar Inbox, a paper-recommendation platform. For my thesis, supervised by Prof. Seong Joon Oh in the Scalable Trustworthy AI group, I studied a fragility of CLIP-based vision–language models: they can be misled by plausible but incorrect details (“half-truths”).

My research interests are broad — more than any single topic, I enjoy picking up new concepts and reading widely across different fields. Recently, I have been especially drawn to interpretability and language diffusion models.

My selected publications appear below — see the publications page for the full list, or take a look at my CV.

news

Apr 16, 2026 Graduated from the University of Tübingen with distinction. 🎓
Feb 01, 2026 Started as a Research Engineer at the ELLIS Institute Tübingen.
Jan 30, 2026 Submitted my Master’s thesis.
Jul 25, 2025 My first main-author paper was accepted to BMVC 2025.
May 15, 2025 Our Scholar Inbox paper was accepted to ACL 2025 (System Demonstrations Track).
Apr 01, 2024 Started as a Student Assistant on Scholar Inbox.

selected publications

  1. Preprint
    From Uncertain Judgments to Calibrated Rankings: Conformal Elo Estimation for LLM Evaluation
    Bora Kargi and David Salinas
    arXiv preprint arXiv:2606.13221, 2026
    Turns biased LLM-as-a-judge votes into calibrated model rankings with honest, distribution-free uncertainty bounds — at a fraction of the cost of human evaluation.
  2. Preprint
    Half-Truths Break Similarity-Based Retrieval
    Bora Kargi, Arnas Uselis, and Seong Joon Oh
    arXiv preprint arXiv:2602.23906, 2026
    CLIP-style models often score a caption as more similar after a plausible but wrong detail is added; we fix this by supervising the individual entities and relations within each caption.