PopTranslate
Introduction: PopTranslate is a Chrome translation extension. Main features: pop up the translated content immediately after text is selected, with no extra click; show an English dictionary when sel...
Copilot-type tools: GitHub Copilot, Cursor, GPT-codex, Windsurf, and Trae are representative Copilot-type products; their function is to help users complete code automatically when necessary. Of cou...
Personal experience: ensure that the loss/reward curve and the performance on the test dataset follow the same trend; otherwise reward hacking or overfitting has appeared. Adjust the learning rate and regularization p...
Cross Entropy: idea of fast cross entropy. Building on earlier knowledge of fast cross-entropy, implementing it in Triton does not take much effort. There is only a 1e-7 difference between PyTorch and my T...
Summary. Methods: there are two main methods to accelerate LLMs, plus a further set of tricks. Low-rank: reduce the dimension of the matrices. Block: compute the matrices block by block. Trick: update the model structure o...
Today is my last day at NetEase. I don’t have as much feeling as I did at Baidu: on the one hand, I spent much more time at Baidu, almost five days; on the other hand, that was my first job, ...
Gemini-cli: Gemini-cli is a command-line tool backed by the Gemini-2.5-pro model. It is a product similar to Anthropic’s Claude-code, but free; what’s more, Google has open-sourced all of the code of Gemini-cli ...
How much do LLMs memorize? Key definitions: unintended memorization: memorizing a specific dataset; generalization (intended memorization): information about the true data-generation pr...
Everything is certain in traditional computer technology; some programmers say that traditional computer technology has a beauty of certainty compared with current LLMs. For the influence or ...
Main idea: the key point is to understand the pictures below. Iteration steps: for each input, the generator G produces outputs; for each output, calculate logits_prob for each token in the current, old, referenc...