LLM acceleration
Summary Methods There are two main methods to accelerate LLMs: low-rank (reduce the dimension of matrices) and block (compute matrices block by block), plus other tricks. Papers already read: 9. Refere...
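A minimal sketch of the low-rank idea mentioned above (my own illustration, not from the post): approximate a large weight matrix W with two thin factors A and B, so a matrix-vector product costs roughly 2*d*r operations instead of d*d. The dimensions and rank below are assumed values chosen for the example.

```python
import numpy as np

d, r = 1024, 16                      # full dimension and an assumed low rank
rng = np.random.default_rng(0)

A = rng.standard_normal((d, r))      # thin factor, d x r
B = rng.standard_normal((r, d))      # thin factor, r x d
x = rng.standard_normal(d)

# Full path: materialize W = A @ B and multiply (O(d*d) work).
W = A @ B
y_full = W @ x

# Low-rank path: apply B then A (O(d*r) work), same result up to float error.
y_lowrank = A @ (B @ x)

print(np.allclose(y_full, y_lowrank))   # True
```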
Collect, summarize, and adjust material from the multiple sources in the References to produce the following tutorial. How to learn: quickly identify what the foundational knowledge is, then build a personal curriculum to...
Product LLM acceleration: almost finished it. Xiaohongshu: start the small business. Reading lazy-leadership, llm-server. ShortMax: fills the space between TikTok and film. HARO shut down: ty...
Product LocalPictureCompress: spent a whole day building LocalPictureCompress; really enjoyed the moment when I published it. Trying AI code assistants: continue, an open-source product that supports OpenA...
Product New ideas: computer use by local models; polish anything by local models. YouTube: uploaded five videos this week. Reading: Google build-AI challenge, OpenAI ask me anything, anthro...
Product LLM acceleration: read one paper, Flash-Attention, which computes attention by blocks (a small sketch follows below). YouTube: uploaded five videos this week and started trying Codeforces problems. Codeforces problems always c...
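A hedged sketch of the "compute attention by blocks" idea from the Flash-Attention paper: stream over key/value blocks with an online softmax instead of materializing the full N x N score matrix. This is my own illustrative NumPy version, not the paper's fused CUDA kernel; shapes and the block size are assumptions.

```python
import numpy as np

def blockwise_attention(Q, K, V, block=64):
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q)
    m = np.full(n, -np.inf)   # running max score per query row
    l = np.zeros(n)           # running softmax denominator per query row
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = (Q @ Kb.T) * scale              # scores against this block only
        m_new = np.maximum(m, S.max(axis=1))
        P = np.exp(S - m_new[:, None])      # block-local softmax numerators
        correction = np.exp(m - m_new)      # rescale previous partial results
        l = l * correction + P.sum(axis=1)
        out = out * correction[:, None] + P @ Vb
        m = m_new
    return out / l[:, None]

# Check against naive attention that builds the full score matrix.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((128, 32)) for _ in range(3))
S = Q @ K.T / np.sqrt(32)
P = np.exp(S - S.max(axis=1, keepdims=True))
ref = (P / P.sum(axis=1, keepdims=True)) @ V
print(np.allclose(blockwise_attention(Q, K, V), ref))  # True
```

The point of the block-wise form is that only a small score tile ever needs to live in fast memory, which is what makes the fused kernel avoid memory-bound behavior.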
This is an unexpected post written in Chinese, for the reasons below. Timeline: 2023.12 applied to schools; 2024.4 received the offer; 2024.5 submitted the visa application; 2024.8 applied to defer enrollment; 2024.10 opr (visa granted). Visa granted: from submitting the application in May, I was hoping for approval every day, especially in August as the semester approached, when I checked every day how much flight prices had risen, thinking it might come through the very next day. That lasted until I finally submitted the deferral application. At the end of August, I finally passed the IELTS, ...
Background There are two common kinds of bound that limit the speed of training in deep learning. Memory-bound: time spent on memory access is the bottleneck. Computation-bound: time spent o...
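A rough back-of-the-envelope sketch of how to tell the two apart (the hardware numbers below are my own assumed figures, not from the post): an operation is memory-bound when its arithmetic intensity (FLOPs per byte moved) falls below the hardware's FLOPs-to-bandwidth ratio, and compute-bound above it.

```python
def arithmetic_intensity_matmul(m, n, k, bytes_per_el=2):
    """FLOPs per byte for an (m x k) @ (k x n) matmul in fp16."""
    flops = 2 * m * n * k                              # multiply-adds
    bytes_moved = (m * k + k * n + m * n) * bytes_per_el
    return flops / bytes_moved

# Hypothetical accelerator: 300e12 FLOP/s, 1.5e12 B/s => ridge point ~200.
ridge = 300e12 / 1.5e12

for shape in [(4096, 4096, 4096),   # large GEMM (training): compute-bound
              (1, 4096, 4096)]:     # matrix-vector (decoding): memory-bound
    ai = arithmetic_intensity_matmul(*shape)
    kind = "compute-bound" if ai > ridge else "memory-bound"
    print(shape, round(ai, 1), kind)
```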
Product LLM acceleration: Matrix Multiplication: read more. LoRA: started reading the paper. YouTube: uploaded four videos this week and gained 5 subscribers. Blog: updated the current blog to jekyll-theme-chi...
Introduction Inspiration: the change in weights during model adaptation has a low "intrinsic rank". Description: train small matrices A and B during fine-tuning, adding A * B to the weight W, which sign...
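A minimal sketch of the description above: keep W frozen and learn only the small factors A and B, so the adapted layer computes W x + (A @ B) x. The shapes, rank, and zero initialization of B below are my own illustrative choices, not taken from the post.

```python
import numpy as np

d_out, d_in, r = 768, 768, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))       # frozen pretrained weight
A = rng.standard_normal((d_out, r)) * 0.01   # trainable thin factor, d_out x r
B = np.zeros((r, d_in))                      # trainable thin factor, starts at
                                             # zero so the update A @ B is 0 at init

def lora_forward(x):
    # Original path plus the low-rank correction; only A and B would receive
    # gradients during fine-tuning.
    return W @ x + A @ (B @ x)

x = rng.standard_normal(d_in)
# At initialization the adapter is a no-op; merging for inference is W + A @ B.
print(np.allclose(lora_forward(x), W @ x))   # True
```

Because only A and B (2 * d * r parameters) are trained instead of the full d * d weight, the number of trainable parameters drops sharply, which is the significance the sentence above points to.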