About me
PhD student at the Institute of Mathematics, Polish Academy of Sciences, advised by Piotr Miłoś. I'm interested in many aspects of LLMs, including long context, math reasoning, code generation, and LLM efficiency. Previously, I was a Student Researcher at Google, where I was very fortunate to be mentored by Yuhuai Wu and Christian Szegedy. I am also grateful to Henryk Michalewski and Łukasz Kaiser for supervising my bachelor's thesis.
My recent work, Focused Transformer (LongLLaMA), develops an efficient method for extending the context length of existing LLMs such as LLaMA. I have also published papers on using language models for automated theorem proving (formal mathematics).
My dream goal is to build increasingly autonomous large language models, capable of assisting humans in solving difficult research-level problems and, ultimately, even generating new scientific knowledge automatically.
I am very excited about learning foreign languages, most recently Mandarin - 叫我世梦就行 ("just call me 世梦").
Publications
Focused Transformer: Contrastive Training for Context Scaling. Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś.
Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers. Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik.
Hierarchical Transformers Are More Efficient Language Models. Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski.
Formal Premise Selection With Language Models. Szymon Tworkowski, Maciej Mikuła, Tomasz Odrzygóźdź, Konrad Czechowski, Szymon Antoniak*, Albert Q. Jiang, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu.
Magnushammer: A Transformer-based Approach to Premise Selection. Maciej Mikuła, Szymon Antoniak, Szymon Tworkowski*, Albert Q. Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu.
Explaining Competitive-Level Programming Solutions using LLMs. Jierui Li, Szymon Tworkowski, Yingying Wu, Raymond Mooney.
Invited talks
Google DeepMind, Zürich - How to Make LLMs Utilize Long Context Efficiently? Oct 9, 2023
University of Edinburgh, Edinburgh NLP Meeting - How to Make LLMs Utilize Long Context Efficiently? Oct 2, 2023
AITP 2022 - Formal Premise Selection With Language Models (recording available on YouTube)