Sunday Links: Training sets, $2B AI traders and the Chevron doctrine
Happy 4th of July week if you're in the US (and possibly in the UK!). This week's post is very late due to travel; my apologies. Still, here are the week's links (just three this week - back to normal service next weekend!):
- IBM Reveals its Entire 6.48 TB LLM Training Dataset. IBM published detailed information about the training set used for its Granite 13B Open Source LLM. It's really helpful to have this kind of information both to assess the integrity of a model and for others to figure out how to build their own. Props to IBM for getting this out there.
- Bridgewater starts $2 billion fund that uses machine learning for decision-making and will include models from OpenAI, Anthropic and Perplexity. [Paywalled]. Having tested the technology setup in a small sleeve of its Pure Alpha fund (about $100 million), Bridgewater plans to double down and create a separate AI-allocated fund. The core motivation: “You’re going to have intelligence that can read every newspaper in the world,” Jensen said. “Machines are better at finding patterns across times and across countries.” From the article, it's clear humans will still guide the process. However, given that Bridgewater is one of the world's largest hedge funds with $100B of assets under management, if more and more assets become machine traded, what will happen to market stability?
- Exploring Model Quantization (Autoround). This article reviews Autoround and touches on a number of other methods for LLM quantization. I'm not pointing to it because Autoround is particularly better than previous methods, but because it's a good basic lay-of-the-land post on quantization. Quantization compacts a model and speeds up inference, often with minimal impact on accuracy; we'll likely see a lot of it in model-building pipelines.
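For anyone who hasn't seen quantization up close, here is a minimal sketch of the basic idea, assuming plain symmetric round-to-nearest int8 quantization. This is not Autoround (which tunes the rounding rather than just taking the nearest value), and the function names and toy weights are made up for illustration.

```python
def quantize_int8(weights):
    """Symmetric round-to-nearest quantization of floats to the int8 range."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor: the largest weight maps to +/-127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floats."""
    return [v * scale for v in quantized]


# Hypothetical toy weights, not taken from any real model.
weights = [0.42, -1.27, 0.003, 0.9, -0.55]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("max round-trip error:", max_error)
```

Each weight now fits in one byte instead of four (for float32), and the round-trip error is at most half the scale. Real methods mostly differ in how they choose scales and rounding to keep that error from hurting accuracy.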
As a side note this week, the US Supreme Court ruled to end the Chevron Doctrine, which, in a nutshell, had courts defer to Federal Agencies on the interpretation of ambiguous statutes. There has been much back and forth about whether this is a disaster (corporate excess being unbridled) or a boon (remaking poor regulations). However, people seem to be missing the big picture. What will actually happen is a huge amount of legal fees being paid out. Now we know what all the lawyers being replaced by AI are going to be busy doing.
Wishing you a great week.