About me

Founding member (#15) at xAI, working on reasoning RL, midtraining, pretraining, and scaling. I'm excited about pushing the frontier of capable AI systems and recursive self-improvement.

xAI

Grok 4.20 — Lead of the team responsible for the reasoning RL training algorithm and scaling.

Grok 4 Fast — Lead of Grok 4 Fast and the Reasoning Efficiency team. Delivered models 10x more efficient than Grok 4 at matched capability. Responsible for midtraining, RL scaling, post-training, long-context serving, and release coordination — resulting in the most used xAI API model by far.

Grok 4 — Foundational contributor to Grok 3 Reasoning and Grok 4. Contributed RL training stability improvements that helped enable 100x reasoning RL scaling.

Grok 3 Mini Reasoning — Sole contributor to the Grok 3 mini-high reasoning model, which reached performance competitive with o3-mini-high.

Grok 3 Pretraining — Foundational contributor to Grok 3 pretraining, the most capable base model at the end of 2024.

Grok 1.5 and 2 — Led midtraining, long context, and post-training capability research. Personally responsible for all deliverables in the Grok 1.5 release.

Before xAI

PhD student at the Institute of Mathematics, Polish Academy of Sciences, advised by Piotr Miłoś, focusing on LLM efficiency, long context, math reasoning, and code generation.

Previously a Student Researcher at Google, mentored by Yuhuai Wu and Christian Szegedy. Grateful to Henryk Michalewski and Łukasz Kaiser for supervising my bachelor's thesis.

In 2023, I authored Focused Transformer (LongLLaMA), one of the earliest algorithmic works to provide a solution for context extension and extrapolation (up to 256k tokens) in large language models.

My early work Hierarchical Transformers Are More Efficient Language Models pioneered the direction of tokenizer-free language models and hierarchical transformers. I also published papers on using language models for automated theorem proving and was part of the N2Formal team at Google.

Publications

Focused Transformer: Contrastive Training for Context Scaling
Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś

NeurIPS 2023

Hierarchical Transformers Are More Efficient Language Models
Piotr Nawrot*, Szymon Tworkowski*, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski

NAACL 2022, Findings

Magnushammer: A Transformer-based Approach to Premise Selection
Maciej Mikuła*, Szymon Tworkowski*, Szymon Antoniak*, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

ICLR 2024 (Top 5%)

Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers
Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

NeurIPS 2022

Formal Premise Selection With Language Models
Szymon Tworkowski*, Maciej Mikuła*, Tomasz Odrzygóźdź*, Konrad Czechowski*, Szymon Antoniak*, Albert Q. Jiang, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

AITP 2022

Analysing the Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini

ACL 2024

Structured Packing in LLM Training Improves Long Context Utilization
Konrad Staniszewski*, Szymon Tworkowski*, Sebastian Jaszczur, Yu Zhao, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś

AAAI 2025

Explaining Competitive-Level Programming Solutions using LLMs
Jierui Li, Szymon Tworkowski, Yingying Wu, Raymond Mooney

ACL 2023, NLRSE Workshop

Education

PhD, Institute of Mathematics, Polish Academy of Sciences, 2023 — dropped out
MS, Machine Learning, University of Warsaw, 2021 — 2023
BSc, Computer Science, University of Warsaw, 2018 — 2021

Invited talks