[Update] Automated update on 2024-09-18 #13

Furyton · 2024-09-18T11:12:09Z

Retrieved 50 papers from Scholar Inbox

In-Context Learning

"Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers",2024-09-09,"https://arxiv.org/pdf/2409.10559",Siyu Chen; Heejune Sheen; Tianhao Wang; Zhuoran Yang

Other Phenomena / Discoveries

"Norm of Mean Contextualized Embeddings Determines their Variance",2024-09-17,"https://arxiv.org/pdf/2409.11253",Hiroaki Yamagiwa; Hidetoshi Shimodaira

Knowledge / Memory Mechanisms

"Self-Attention Limits Working Memory Capacity of Transformer-Based Models",2024-09-16,"https://arxiv.org/pdf/2409.10715",Dongyu Gong; Hantao Zhang

What Can Transformer Do? / Properties of Transformer

"Adaptive Large Language Models By Layerwise Attention Shortcuts",2024-09-17,"https://arxiv.org/pdf/2409.10870",Prateek Verma; Mert Pilanci

What Can Transformer Not Do? / Limitation of Transformer

"Self-Attention Limits Working Memory Capacity of Transformer-Based Models",2024-09-16,"https://arxiv.org/pdf/2409.10715",Dongyu Gong; Hantao Zhang

All Digest Papers From Scholar Inbox

"Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers",2024-09-09,"https://arxiv.org/pdf/2409.10559",Siyu Chen; Heejune Sheen; Tianhao Wang; Zhuoran Yang
"Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models",2024-09-17,"https://arxiv.org/pdf/2409.11136",Orion Weller; Benjamin Van Durme; Dawn Lawrie; Ashwin Paranjape; Yuhao Zhang; Jack Hessel
"Semformer: Transformer Language Models with Semantic Planning",2024-09-17,"https://arxiv.org/pdf/2409.11143",Yongjing Yin; Junran Ding; Kai Song; Yue Zhang
"Linear Recency Bias During Training Improves Transformers' Fit to Reading Times",2024-09-17,"https://arxiv.org/pdf/2409.11250",Christian Clark; Byung-Doh Oh; William Schuler
"Norm of Mean Contextualized Embeddings Determines their Variance",2024-09-17,"https://arxiv.org/pdf/2409.11253",Hiroaki Yamagiwa; Hidetoshi Shimodaira
"Kolmogorov-Arnold Transformer",2024-09-16,"https://arxiv.org/pdf/2409.10594",Xingyi Yang; Xinchao Wang
"Adaptive Large Language Models By Layerwise Attention Shortcuts",2024-09-17,"https://arxiv.org/pdf/2409.10870",Prateek Verma; Mert Pilanci
"Propulsion: Steering LLM with Tiny Fine-Tuning",2024-09-17,"https://arxiv.org/pdf/2409.10927",Md Kowsher; Nusrat Jahan Prottasha; Prakash Bhat
"Improving the Efficiency of Visually Augmented Language Models",2024-09-17,"https://arxiv.org/pdf/2409.11148",Paula Ontalvilla; Aitor Ormazabal; Gorka Azkune
"Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style",2024-09-17,"https://arxiv.org/pdf/2409.10955",Yuepei Li; Kang Zhou; Qiao Qiao; Bach Nguyen; Qing Wang; Qi Li
"Self-Attention Limits Working Memory Capacity of Transformer-Based Models",2024-09-16,"https://arxiv.org/pdf/2409.10715",Dongyu Gong; Hantao Zhang
"SOAP: Improving and Stabilizing Shampoo using Adam",2024-09-17,"https://arxiv.org/pdf/2409.11321",Nikhil Vyas; Depen Morwani; Rosie Zhao; Itai Shapira; David Brandfonbrener; Lucas Janson; Sham Kakade
"CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios",2024-09-16,"https://arxiv.org/pdf/2409.10593",Luning Wang; Shiyao Li; Xuefei Ning; Zhihang Yuan; Shengen Yan; Guohao Dai; Yu Wang
"Convolutional Networks as Extremely Small Foundation Models: Visual Prompting and Theoretical Perspective",2024-09-03,"https://arxiv.org/pdf/2409.10555",Jianqiao Wangni
"A close pair of orbiters embedded in a gaseous disk: the repulsive effect",2024-09-16,"https://arxiv.org/pdf/2409.10751",F. J. Sanchez-Salcedo; F. S. Masset; S. Cornejo
"KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models",2024-09-17,"https://arxiv.org/pdf/2409.11057",Bo Lv; Quan Zhou; Xuanang Ding; Yan Wang; Zeming Ma
"Protecting Copyright of Medical Pre-trained Language Models: Training-Free Backdoor Watermarking",2024-09-14,"https://arxiv.org/pdf/2409.10570",Cong Kong; Rui Xu; Weixi Chen; Jiawei Chen; Zhaoxia Yin
"ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood",2024-09-14,"https://arxiv.org/pdf/2409.10571",Ruoyu Wang; Jiachen Sun; Shaowei Hua; Quan Fang
"Improving Multi-candidate Speculative Decoding",2024-09-16,"https://arxiv.org/pdf/2409.10644",Xiaofan Lu; Yixiao Zeng; Feiyang Ma; Zixu Yu; Marco Levorato
"Selective algorithm processing of subset sum distributions",2024-09-17,"https://arxiv.org/pdf/2409.11076",Nick Dawes
"Reasoning Graph Enhanced Exemplars Retrieval for In-Context Learning",2024-09-17,"https://arxiv.org/pdf/2409.11147",Yukang Lin; Bingchen Zhong; Shuoran Jiang; Joanna Siebert; Qingcai Chen
"Fairness in Survival Analysis with Distributionally Robust Optimization",2024-08-31,"https://arxiv.org/pdf/2409.10538",Shu Hu; George H. Chen
"Implicit Reasoning in Deep Time Series Forecasting",2024-09-17,"https://arxiv.org/pdf/2409.10840",Willa Potosnak; Cristian Challu; Mononito Goswami; Michał Wiliński; Nina Żukowska
"Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering",2024-09-17,"https://arxiv.org/pdf/2409.10790",Qingru Zhang; Xiaodong Yu; Chandan Singh; Xiaodong Liu; Liyuan Liu; Jianfeng Gao; Tuo Zhao; Dan Roth; Hao Cheng
"Query Learning of Advice and Nominal Automata",2024-09-17,"https://arxiv.org/pdf/2409.10822",Kevin Zhou
"From Latent to Engine Manifolds: Analyzing ImageBind's Multimodal Embedding Space",2024-08-30,"https://arxiv.org/pdf/2409.10528",Andrew Hamara; Pablo Rivas
"THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models",2024-09-17,"https://arxiv.org/pdf/2409.11353",Mengfei Liang; Archish Arun; Zekun Wu; Cristian Munoz; Jonathan Lutch; Emre Kazim; Adriano Koshiyama; Philip Treleaven
"Says Who? Effective Zero-Shot Annotation of Focalization",2024-09-17,"https://arxiv.org/pdf/2409.11390",Rebecca M. M. Hicke; Yuri Bizzoni; Pascale Feldkamp; Ross Deans Kristensen-McLachlan
"Evaluating the Impact of Compression Techniques on Task-Specific Performance of Large Language Models",2024-09-17,"https://arxiv.org/pdf/2409.11233",Bishwash Khanal; Jeffery M. Capone
"Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5",2024-09-17,"https://arxiv.org/pdf/2409.11282",Marcel Lamott; Muhammad Armaghan Shakir
"Clustering with Non-adaptive Subset Queries",2024-09-17,"https://arxiv.org/pdf/2409.10908",Hadley Black; Euiwoong Lee; Arya Mazumdar; Barna Saha
"A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B",2024-09-17,"https://arxiv.org/pdf/2409.11055",Jemin Lee; Sihyeong Park; Jinse Kwon; Jihun Oh; Yongin Kwon
"Communication Lower Bounds and Optimal Algorithms for Symmetric Matrix Computations",2024-09-17,"https://arxiv.org/pdf/2409.11304",Hussam Al Daas; Grey Ballard; Laura Grigori; Suraj Kumar; Kathryn Rouse; Mathieu Verite
"A Best-of-Both Approach to Improve Match Predictions and Reciprocal Recommendations for Job Search",2024-09-17,"https://arxiv.org/pdf/2409.10992",Shuhei Goda; Yudai Hayashi; Yuta Saito
"Generalized Measures of Anticipation and Responsivity in Online Language Processing",2024-09-16,"https://arxiv.org/pdf/2409.10728",Mario Giulianelli; Andreas Opedal; Ryan Cotterell
"Relative Representations: Topological and Geometric Perspectives",2024-09-17,"https://arxiv.org/pdf/2409.10967",Alejandro García-Castellanos; Giovanni Luca Marchetti; Danica Kragic; Martina Scolamiero
"RoMath: A Mathematical Reasoning Benchmark in Romanian",2024-09-17,"https://arxiv.org/pdf/2409.11074",Adrian Cosma; Ana-Maria Bucur; Emilian Radoi
"The Complexity of Maximizing the MST-ratio",2024-09-17,"https://arxiv.org/pdf/2409.11079",Afrouz Jabal Ameli; Faezeh Motiei; Morteza Saghafian
"Boolean Functions with Small Approximate Spectral Norm",2024-09-16,"https://arxiv.org/pdf/2409.10634",Tsun-Ming Cheung; Hamed Hatami; Rosie Zhao; Itai Zilberstein
"Physics-Informed Neural Networks with Trust-Region Sequential Quadratic Programming",2024-09-17,"https://arxiv.org/pdf/2409.10777",Xiaoran Cheng; Sen Na
"DeFi Arbitrage in Hedged Liquidity Tokens",2024-09-17,"https://arxiv.org/pdf/2409.11339",Maxim Bichuch; Zachary Feinstein
"Tight Lower Bounds under Asymmetric High-Order Hölder Smoothness and Uniform Convexity",2024-09-17,"https://arxiv.org/pdf/2409.10773",Site Bai; Brian Bullins
"GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval",2024-09-17,"https://arxiv.org/pdf/2409.10909",Wonduk Seo; Haojie Zhang; Yueyang Zhang; Changhao Zhang; Songyao Duan; Lixin Su; Daiting Shi; Jiashu Zhao; Dawei Yin
"CLIP Adaptation by Intra-modal Overlap Reduction",2024-09-17,"https://arxiv.org/pdf/2409.11338",Alexey Kravets; Vinay Namboodiri
"Trajectory-Oriented Control Using Gradient Descent: An Unconventional Approach",2024-09-16,"https://arxiv.org/pdf/2409.10662",Ramin Esmzad; Hamidreza Modares
"Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities",2024-09-15,"https://arxiv.org/pdf/2409.10574",Md Tauseef Alam; Raju Halder; Abyayananda Maiti
"Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models",2024-09-17,"https://arxiv.org/pdf/2409.10999",Potsawee Manakul; Guangzhi Sun; Warit Sirichotedumrong; Kasima Tharnpipitchai; Kunat Pipatanakul
"On the number of prime factors with a given multiplicity over h-free and h-full numbers",2024-09-17,"https://arxiv.org/pdf/2409.11275",Sourabhashis Das; Wentang Kuo; Yu-Ru Liu
"Elementary symmetric partitions",2024-09-17,"https://arxiv.org/pdf/2409.11268",Cristina Ballantine; George Beck; Mircea Merca; Bruce Sagan
"Online Combinatorial Allocations and Auctions with Few Samples",2024-09-17,"https://arxiv.org/pdf/2409.11091",Paul Dütting; Thomas Kesselheim; Brendan Lucier; Rebecca Reiffenhäuser; Sahil Singla

Automated update on 2024-09-18

4275cf4

Furyton closed this Oct 11, 2024

Furyton deleted the automated-update-1726657927 branch October 11, 2024 11:46
