Benjamin Warner
R&D at answer.ai
- One of the questions we debated while training ModernBERT was whether a modern trained encoder would unlock zero-shot reasoning using only its generative head. Spoilers: the answer is yes.
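A minimal sketch of the idea: query ModernBERT's masked-language-modeling head directly through the Hugging Face fill-mask pipeline (requires transformers >= 4.48). The prompt template and example text below are illustrative, not the ones used in the original experiments.

```python
# Hedged sketch: zero-shot prediction via ModernBERT's MLM (generative) head.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Frame a task as a cloze prompt and let the MLM head fill in the answer.
prompt = "The food was cold and the service was slow. Overall the review was [MASK]."
for prediction in fill_mask(prompt, top_k=5):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```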
- In addition to being the best retrieval model under 300M params on MTEB (without extra work), and top 10 under 1B, here's a fun tidbit from Alibaba's GTE ModernBERT model card: gte-modernbert-base beats gte-qwen1.5-7b on LoCo long-context retrieval with roughly 7B fewer parameters.
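For context, a minimal retrieval sketch with that checkpoint via sentence-transformers; the model ID is from the Hugging Face Hub, and the query and documents are placeholders.

```python
# Hedged sketch: dense retrieval with Alibaba's gte-modernbert-base.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")

docs = [
    "ModernBERT supports context lengths up to 8192 tokens.",
    "Consumer GPUs in this generation drop NVLink in favor of PCIe 5.",
]
query = "What is ModernBERT's maximum context length?"

doc_emb = model.encode(docs)
query_emb = model.encode(query)
scores = model.similarity(query_emb, doc_emb)  # cosine similarity by default
print(scores)
```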
- ModernBERT is officially released in Transformers v4.48.0, so you no longer need to install from git to use it. If you are plugging ModernBERT into an existing encoder fine-tuning pipeline, try increasing the learning rate. We've found that ModernBERT tends to prefer a higher LR than older models.
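A sketch of what that swap might look like in a standard sequence-classification setup. The learning rate shown (8e-5) is only an illustrative "higher than typical BERT-style" value, not a recommendation from the post; tune it for your task.

```python
# Hedged sketch: fine-tuning ModernBERT with a higher-than-usual learning rate.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

args = TrainingArguments(
    output_dir="modernbert-finetune",
    learning_rate=8e-5,  # older encoders are often tuned around 2e-5 to 3e-5
    num_train_epochs=3,
    per_device_train_batch_size=32,
    weight_decay=0.01,
)
# Pass `model`, `args`, and your tokenized dataset to transformers.Trainer as usual.
```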
- The good: 32GB. The bad: $2,000. The ugly*: PCIe 5 without NVLink.
- This week we released ModernBERT, the first encoder to reach SOTA on most common benchmarks across language understanding, retrieval, and code, while running twice as fast as DeBERTaV3 on short context and three times faster than NomicBERT & GTE on long context.