Saturday Links: The R1 arrives, LLM Shrinkage and AI+String Theory

Steven Willmott

27 Apr 2024 • 2 min read

Hello to everyone. Here are the links for the week:

A morning with the Rabbit R1: a fun, funky, unfinished AI gadget. The Rabbit R1 device has arrived. It's a device half the size of a phone that operates apps on your behalf, and I've written about it before. I'm selfishly interested since I also pre-ordered one. The reviews are pretty much as I expected: interesting, funky, but still flawed. I'm still excited to get the device: no doubt it will improve, and it opens up a lot of potential for service models that don't really exist today. Bypassing the phone is top of the list.
Snowflake releases a flagship generative AI model of its own. The enterprise AI race continues with Snowflake releasing its own model (Databricks did the same a couple of weeks ago with DBRX - see a good deep-dive analysis on TheSequence here). The trend, it seems to me, is that it will be possible to build custom LLM-type models for large corporations that are "capable" but not amazing, and that are tied to the corporation's own data. Snowflake and Databricks will fight for this business as much as Google, Amazon, and Microsoft.
AI Starts to Sift Through String Theory’s Near-Endless Possibilities. Using AI in science is still one of my favorite use cases. In domains like string theory, there is so much scope to build in silico simulations and then use AI/ML to explore crazy combinations that the simulator can validate. It won't tell you if a theory is correct, but it can find anomalies that help deepen understanding.
Model shrinkage continues - Microsoft releases Phi-3-mini. The new 3.8B-parameter model has solid benchmark performance and, according to the claims in the technical paper, is small enough to run well on an iPhone 14. Its benchmark performance is roughly equivalent to the Mistral 7B and Llama 3 8B models, but not as good as the top Llama models or GPT-4. This progress is still extremely important, though: 6 months ago, it was not at all obvious we would get a capable model running on a phone this quickly.
Pushing beyond transformers - RecurrentGemma. A new R&D model approach that reduces the memory resources LLMs need and can be trained with fewer tokens. Transformers have gotten us a long way, but optimizations will keep coming. Attention is all you need - but we can get better at managing attention.

A final piece of news, not directly about AI: Meta has announced that it will open up its Horizon VR operating system for use by third-party device manufacturers. It's flown under the radar a little due to other news, but this is Meta playing for the VR headset equivalent of the "Android" slot on smartphones. If VR grows fast enough, it may provide a ton of leverage down the line. Will Google or Microsoft be able to marshal the attention to take them on with all this AI stuff happening? There is an AI link because AI will massively drive down the cost of content creation for VR over time, which benefits the aggregator that brings the content to you: Facebook.