Yannic Kilcher Videos (Audio Only)

17 Oct 2023 • EN

Efficient Streaming Language Models with Attention Sinks (Paper Explained)

#llm #ai #chatgpt How does one run inference for a generative autoregressive language model that has been trained with a fixed context size? Streaming LLMs keep the efficiency of windowed attention, but avoid its drop in performance by using attention sinks - an interesting phenomenon where the token at position 0

17 Oct 2023 • EN

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution (Paper Explained)

#ai #promptengineering #evolution Promptbreeder is a self-improving self-referential system for automated prompt engineering. Give it a task description and a dataset, and it will automatically come up with appropriate prompts for the task. This is achieved by an evolutionary algorithm where not only the prompts, but a

05 Oct 2023 • EN

Retentive Network: A Successor to Transformer for Large Language Models (Paper Explained)

#ai #retnet #transformers Retention is an alternative to Attention in Transformers that can be written in both a parallel and a recurrent fashion. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising. OUTLINE: 0:00 - Intro 2:40 -

05 Oct 2023 • EN

Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)

#ai #rlhf #llm ReST uses a bootstrap-like method to produce its own extended dataset and trains on ever higher-quality subsets of it to improve its own reward. The method allows for re-using the same generated data multiple times and thus has an efficiency advantage over online RL techniques like PPO. Paper:

28 Aug 2023 • EN

[ML News] LLaMA2 Released | LLMs for Robots | Multimodality on the Rise

#mlnews #llama2 #openai Your regular irregular update on the world of Machine Learning. References: https://twitter.com/ylecun/status/1681336284453781505 https://ai.meta.com/llama/ https://about.fb.com/news/2023/07/llama-2-statement-of-support/ https://247wallst.com/special-report/2023/08/12/this-is-the-biggest-social-

28 Aug 2023 • EN

How Cyber Criminals Are Using ChatGPT (w/ Sergey Shykevich)

#cybercrime #chatgpt #security An interview with Sergey Shykevich, Threat Intelligence Group Manager at Check Point, about how models like ChatGPT have impacted the realm of cyber crime. https://threatmap.checkpoint.com/ Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtub
