DeepSeek-R1: pure reinforcement learning (RL), no supervised fine-tuning (SFT), no chain-of-thought…
#1minPapers (Jan 21)

Takeaways from JPM Healthcare Conf 2025 #JPM2025
This week was my eighth JPM Healthcare Conference. I’ve been to five pre-pandemic, courtesy of being on the buy-side and having been a JPM… (Jan 17)

#1minPapers MSFT’s rStar-Math small language model self-improves and generates its own training data
This is the second time in recent months that a small model performed as well as (or better than) the billion-parameter large models… (Jan 12)

#1minPapers Francois Chollet: use LLMs for tree-search instead of next-token prediction
Not a paper, but 90 min of Chollet is always worth watching! The ARC Challenge is fascinating because it’s a rapid adaptation, evolution… (Jan 10)

#1minPapers “Fourier Analysis Networks” — Yihong Dong et al.
Multi-layer Perceptrons (MLPs) are the backbones of LLMs, but they aren’t efficient at modeling periodicity (e.g. rhythmic bass in music)… (Jan 8)

#1minPapers Ability to leverage the tools increases with model params — “Toolformer: Language Models…
Yesterday’s #1minPapers noted that model role-play/deception is a problem if models have access to tools. So of course today we’ll dig into… (Jan 6)

#1minPapers “Role-Play with Large Language Models” — Shanahan et al.
The scare a few weeks ago that o1 was able to duplicate its weights via deception got me very interested in how LLMs can role-play. Dug up… (Jan 5)

#1minPapers “Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement…
Got something spicy today: a team at Fudan University in China attempted to reproduce o1. This paper was published two weeks ago, and is full of… (Jan 3)

#1minPapers “Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking” —…
I’m fascinated by reasoning: it allows a model to decompose a challenging computation into smaller steps. This Quiet-STaR model… (Jan 2)

#1minPapers “Critique-out-loud Reward Models” — by Zachary Ankner, Mansheej Paul, Brandon Cui…
I’m down the rabbit hole of optimized reward models. This paper is still in preprint. (Dec 31, 2024)