Manifund
Garrett Baker

@GarretteBaker

I'm an independent alignment researcher, self-taught in machine learning, convex optimization, and probability theory.

https://github.com/GarretteBaker/
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

For approximately the past year, I've been doing alignment research full-time, working on a variety of approaches and trying to understand the problem in enough depth to invent new ones. If funded, I plan to continue doing approximately the same work as before, which has historically spanned scalable mechanistic interpretability, formal and prosaic corrigibility, reflective stability, and a range of value-theory work, along with substantial upskilling in convex optimization, machine learning, neuroscience, and economics.

My current project is an attempt to connect the tools & theory of singular learning theory with our knowledge of the inductive biases and loss landscapes of large language models.
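For context, the central tool from singular learning theory here is Watanabe's free-energy expansion; the following is a minimal statement of that standard result, with the notation ($F_n$, $L_n$, $w^*$, $\lambda$) introduced for illustration rather than taken from this page:

$$F_n \approx n L_n(w^*) + \lambda \log n$$

Here $n$ is the number of training samples, $L_n(w^*)$ is the loss at a local optimum $w^*$, and $\lambda$ is the (local) learning coefficient, which measures the effective dimensionality of the loss landscape around $w^*$. Regions with lower $\lambda$ receive asymptotically more posterior mass as $n$ grows, which is one precise sense in which singular learning theory quantifies an inductive bias toward simpler solutions, the kind of connection to LLM loss landscapes this project aims at.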

Projects

Garrett Baker salary to study the development of values of RL agents over time

Outgoing donations

Project | Amount | Date
AI Safety Reading Group at metauni [Retrospective] | $10 | 12 months ago
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | $96 | about 1 year ago
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | $50 | about 1 year ago
Lightcone Infrastructure | $95 | about 1 year ago
Next Steps in Developmental Interpretability | $200 | about 1 year ago
Lightcone Infrastructure | $50 | about 1 year ago

Comments

Act I: Exploring emergent behavior from multi-AI, multi-human interaction

Garrett Baker

about 1 year ago

I have seen some of amp's work, and it is pretty interesting and novel in the grand scheme of things.

Lightcone Infrastructure

Garrett Baker

about 1 year ago

Lightcone consistently does quality things.

Garrett Baker salary to study the development of values of RL agents over time

Garrett Baker

about 1 year ago

@Austin Here is the LW post: https://www.lesswrong.com/posts/Bczmi8vjiugDRec7C/what-and-why-developmental-interpretability-of-reinforcement

Transactions

For | Date | Type | Amount (USD)
AI Safety Reading Group at metauni [Retrospective] | 12 months ago | project donation | 10
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | about 1 year ago | project donation | 96
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | about 1 year ago | project donation | 50
Lightcone Infrastructure | about 1 year ago | project donation | 95
<176bd26d-9db4-4c7a-98c0-ba65570fb44c> | about 1 year ago | tip | +1
Next Steps in Developmental Interpretability | about 1 year ago | project donation | 200
Lightcone Infrastructure | about 1 year ago | project donation | 50
Manifund Bank | about 1 year ago | deposit | +500