Ole Jorgensen

About:

Hello! My name is Ole, and I work in technical AI safety and governance. I'm an incoming PhD student at the University of Oxford and a part-time Research Scientist at the AI Security Institute (AISI) with the Science of Evals team. Previously, I was a research engineer with the Chem-Bio team at AISI. Before that, I completed an MSc in Artificial Intelligence at Imperial College London and received an MMath in Mathematics from the University of St Andrews.

I am always keen to talk to folks about AI safety. If you want to chat, just email me at o*surname*1417@gmail.com.

Recent Work:

"Early Insights from Developing Question-Answer Evaluations for Frontier AI". I wrote this blog post alongside Friederike Grosse-Holz! We share lots of in-the-weeds details we have learned from developing and conducting QA evaluations.

"Improving Activation Steering in Language Models with Mean-Centring", winner of the best paper award at Human-Centric Representation Learning at AAAI 2024. Ole Jorgensen, Dylan Cope, Nandi Schoots, Murray Shanahan. In this follow-up to work from my dissertation, we develop a new method of activation steering, called mean-centring, which is more general than previous methods. We evaluate it in a variety of settings, including applying it to recent work on function vectors by Eric Todd et al.