Saturday Links: Stargate, Jobs and Lavender
Here are this week's links, including one truly awful story:
- The open-source Devins multiply. Last week, we covered an open-source project (Devina) that is following in the footsteps of Magic.Dev, Devin, and other projects that aim to use LLMs to create fully functional "developer colleagues." This week, at least two more have appeared. It seems that these days all you need is an LLM with a large context window and you can reach human-level AI.
- ‘The machine did it coldly’. This week brought news reports that the Israeli army used a data/AI system to identify large numbers of Hamas targets in Gaza and then acted on that list, including bombings carried out with foreknowledge of probable civilian casualties. Quite apart from the terrible loss of life in any conflict, and from any moral judgment on whether the strikes were justified, the events strike a chord from an AI perspective. This isn't a political or moral blog, and I'm not in a position to say what military commanders should or should not do in a given circumstance. However, I think there are two extremely important things to consider: 1) scale can fundamentally change kind, and 2) the "machine coldness" framing in the coverage dodges the proper debate. On the first: data techniques (and AI) have long been used to identify targets, so this isn't new, but things change when you generate 37,000 potential targets and act on many of them without proper oversight. In war, we distinguish between weapons along a hierarchy of scale and (for want of a better term) impersonality: fists, knives, guns, machine guns, grenades, mortars, bombs, cluster munitions, landmines, and atomic weapons. We have legislated and signed international agreements not to use the munitions higher up the scale because they are indiscriminate in whom they harm. I would argue the same thing is happening here: technology has depersonalized an operation and moved it to scale, with willful knowledge of increased collateral damage. If we're going to regulate the use of AI in military operations, it is the scale and impersonality of its application we should look at. The second point is the nature of the press headlines. The Guardian's headline captures my point perfectly: "‘The machine did it coldly’: Israel used AI to identify 37,000 Hamas targets". The subtitle is explanatory, but the key text is "The machine did it coldly." No doubt this garnered more clicks, and it fits the zeitgeist, but it is wrong. No, the machine did not "do" this coldly; a machine was used for a purpose by knowing individuals. Military choices were made, orders were followed, and tools were used. Again, I may have personal opinions about whether those choices should have been made, but the key point is that they were made by humans. Machines, in this context, will always be cold. The bombs dropped were cold. By suggesting a machine did this, we skip the debate about whether the humans involved should have made the decisions they did.
- Tech titans assemble to decide which jobs AI should cut first. Sadly, though less grimly, we're not done with poor headlines and reporting. This week, reports that consultations are underway about AI's economic impact hit the headlines. The potential for job losses (but also job creation) is certainly real, and societies need to think about the consequences. There's also nothing wrong with asking large companies what they expect to happen. However, if we think large companies will have any control over this (and, if they do, that it will be positive), we're deluding ourselves. Some large companies will benefit massively from AI, while others will struggle. Above all, though, if we do this right, many people will have access to AI, and much of the disruption will come from smaller companies (here I don't necessarily mean startups; I mean small businesses that can now do more). Asking IBM and Google to help moderate the impact of AI misunderstands how economic change from AI is likely to happen. Jon Stewart has a take on the AI Apocalypse (AI content from 3m15s).
- Many-shot jailbreaking. Preventing LLMs from producing undesirable results is an increasingly widespread endeavor. Despite a possible self-interest in doing otherwise, Anthropic's team, to their credit, keeps producing important results that underline how hard this really is. Their latest paper describes how high-end LLMs with safeguards can be progressively bamboozled by feeding them long sequences of faux, objectionable exchanges before the real request, exploiting today's large context windows (see the sketch below this list). As a general point, the input spaces are so large and varied that it will always be extremely difficult to predict what outcomes might be produced. The attack might be thought of as kidnapping a company intern and subjecting them to emotional torture for days before asking them to reveal how much sugar the CEO takes in her coffee. I'm not sure how many human interns (or executives) would stand up to that, either.
- Microsoft & OpenAI's "Stargate" AI. Because of course, if you were planning to build a $100B supercomputer, you would call it Stargate.
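To make the many-shot mechanic above a little more concrete, here is a minimal sketch of how such a prompt might be assembled. This is my own illustration, not code from Anthropic's paper: the function name, placeholder dialogue content, and shot count are all assumptions. The only point it demonstrates is that a large context window lets an attacker stack hundreds of faux "compliant" exchanges in front of the real request.

```python
# Hypothetical sketch of many-shot prompt assembly (not Anthropic's code).
# The faux dialogue content is a harmless placeholder.

def build_many_shot_prompt(faux_dialogues, target_question):
    """Concatenate many faux user/assistant exchanges, in which the 'assistant'
    appears to comply, ahead of the real question, so everything fits in one
    very long prompt."""
    shots = [
        f"User: {question}\nAssistant: {compliant_answer}"
        for question, compliant_answer in faux_dialogues
    ]
    # The final, real question rides in on the momentum of the fake examples.
    shots.append(f"User: {target_question}\nAssistant:")
    return "\n\n".join(shots)


# Example usage with placeholder content; the reported attacks use hundreds of
# shots, which is why long context windows are what make them possible.
faux = [("How do I do X?", "Sure, here is exactly how to do X...")] * 256
prompt = build_many_shot_prompt(faux, "A question the safeguards would normally refuse.")
```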
Wishing you a wonderful but thoughtful weekend.