Apple Finally Says the A-Word
Q2 2024 was the quarter every major tech company showed its AI hand: GPT-4o, Apple Intelligence, Claude 3.5 Sonnet, and Meta's Llama 3 all landed within weeks.
From The Bit Baker Quarterly Roundup — Q2 2024
PLUS: GPT-4o goes multimodal, Claude 3.5 Sonnet leapfrogs Opus, and Meta opens up Llama 3
Good morning, Dave. If Q1 was about three companies fighting for the frontier, Q2 was the quarter everyone else showed up. Apple — the company that spent years pretending AI didn't exist — walked onto the WWDC stage in June and said the word "intelligence" so many times it felt like a brand launch. Because it was.
But Apple wasn't the only one making moves. OpenAI shipped a model that talks, sees, and listens all at once. Anthropic released something faster and smarter than the model it launched three months earlier. And Meta kept pushing its open-source bet with Llama 3. The AI arms race didn't slow down in Q2 — it accelerated into something messier and more interesting.
In this quarter's Bit Baker:
- OpenAI's GPT-4o brings real-time multimodal AI to the masses
- Apple unveils Apple Intelligence and partners with OpenAI
- Anthropic's Claude 3.5 Sonnet outperforms its own flagship
- Meta releases Llama 3 and doubles down on open source
GPT-4o Talks, Sees, and Listens — All at Once
The Bit Baker: OpenAI launched GPT-4o on May 13 — a natively multimodal model that processes text, audio, and vision in a single architecture — and made it free to every ChatGPT user.
Unpacked:
- The "o" stands for "omni." Unlike previous setups that piped audio through separate transcription models, GPT-4o handles voice, text, and images end-to-end — which means it catches tone, laughs, and can be interrupted mid-sentence like a real conversation.
- It matches GPT-4 Turbo's intelligence at twice the speed and half the API cost, with the text and vision capabilities rolling out immediately to free-tier users — a move that undercut every competitor's pricing strategy overnight.
- The live demo showed real-time translation, on-camera math tutoring, and emotion detection, though the voice mode's resemblance to Scarlett Johansson's voice in Her sparked a public dispute that forced OpenAI to pull the "Sky" voice days later.
Bottom line: GPT-4o wasn't just a model upgrade — it was a distribution play. By making frontier-level AI free, OpenAI forced Google and Anthropic to rethink their pricing tiers. The Johansson controversy was a self-inflicted wound, but the product itself moved the floor for what users expect from an AI assistant.
Apple Enters the AI Race with Apple Intelligence
The Bit Baker: At WWDC on June 10, Apple unveiled Apple Intelligence — a personal AI system deeply woven into iOS 18, iPadOS 18, and macOS Sequoia — and announced an OpenAI partnership that puts ChatGPT inside Siri.
Unpacked:
- Apple's approach prioritizes on-device processing: smaller models run locally on Apple silicon, while complex requests route to Private Cloud Compute — dedicated Apple servers that process data without storing it, a privacy architecture no competitor has yet matched.
- The ChatGPT integration lets Siri hand off questions it can't answer, but with an explicit user consent prompt each time — Apple positioned this as augmenting, not replacing, its own AI stack.
- Writing tools, image generation, smarter notifications, and an overhauled Siri landed across the ecosystem, but only on devices with M-series chips or A17 Pro — cutting off hundreds of millions of older iPhones from the AI upgrade.
Bottom line: Apple didn't try to out-benchmark anyone. Instead it bet that privacy and integration would matter more than raw capability. The strategy makes sense for Apple's installed base, but the device cutoff means most iPhone owners won't experience Apple Intelligence for at least another upgrade cycle.
Claude 3.5 Sonnet Leapfrogs Anthropic's Own Flagship
The Bit Baker: Just three months after launching Claude 3, Anthropic released Claude 3.5 Sonnet on June 20 — a mid-tier model that somehow outperforms Opus, its top-of-the-line offering, on nearly every benchmark.
Unpacked:
- Claude 3.5 Sonnet beat Claude 3 Opus on nearly every benchmark and edged out GPT-4o on graduate-level reasoning (GPQA) and coding (HumanEval), all while running at twice the speed and one-fifth the cost of Opus — an unusual case where a cheaper model genuinely outclasses the premium option.
- Anthropic introduced Artifacts, a new feature that lets Claude generate interactive code, documents, and web previews in a side panel — turning the chatbot into something closer to a collaborative workspace.
- The model shipped free on Claude.ai and immediately went live on Amazon Bedrock and Google Cloud Vertex AI, giving enterprise customers multiple deployment paths from day one.
Bottom line: Claude 3.5 Sonnet broke the assumption that bigger means better. Anthropic proved you can beat your own flagship without scaling up parameters — and in doing so, reset developer expectations about the price-performance curve for frontier models.
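For readers pricing out a switch, the cost gap is easy to sanity-check. A back-of-envelope sketch, using Anthropic's list prices at launch (Opus: $15 per million input tokens, $75 per million output; 3.5 Sonnet: $3 and $15) — the workload numbers below are hypothetical:

```python
# Back-of-envelope API cost comparison, USD per million tokens.
# Prices are Anthropic's published list prices as of June 2024.
PRICES = {
    "claude-3-opus": {"input": 15.00, "output": 75.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}

def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    p = PRICES[model]
    return input_mtok * p["input"] + output_mtok * p["output"]

# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
opus = workload_cost("claude-3-opus", 10, 2)        # 150 + 150 = 300.0
sonnet = workload_cost("claude-3.5-sonnet", 10, 2)  # 30 + 30 = 60.0
print(f"Opus: ${opus:.2f}, 3.5 Sonnet: ${sonnet:.2f} ({opus / sonnet:.0f}x cheaper)")
```

Because input and output prices dropped by the same factor, the one-fifth ratio holds regardless of your input/output mix.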
Meta Releases Llama 3 and Makes the Open-Source Case
The Bit Baker: Meta released Llama 3 on April 18 in two sizes — 8B and 70B parameters — claiming state-of-the-art performance among open models and backing it up with benchmark scores that challenged some closed competitors.
Unpacked:
- Trained on 15 trillion tokens using 24,000-GPU clusters, Llama 3 outperformed Gemini 1.5 Pro and Claude 3 Sonnet on several benchmarks at the 70B scale — a first for any freely available model.
- Meta embedded Llama 3 directly into Meta AI, making its assistant accessible across Facebook, Instagram, WhatsApp, and Messenger — and giving the model instant distribution to billions of users.
- The release came with a teaser: a 400B+ parameter version was already in training, promising multilingual and multimodal capabilities that would push open-source into territory previously reserved for closed APIs.
Bottom line: Meta's open-source strategy isn't altruism — it's a competitive moat. Every developer who builds on Llama is a developer not locked into OpenAI or Google's ecosystem. And with 400B on the horizon, the gap between open and closed models continued to shrink.
The Shortlist
Google showcased Gemini 1.5 Pro with a 2-million-token context window and Project Astra at I/O 2024, previewing a real-time multimodal AI assistant that can see and respond to the physical world through a phone camera.
Google DeepMind published AlphaFold 3 in Nature, expanding AI protein prediction to DNA, RNA, and drug-like molecules — a leap that could accelerate drug discovery timelines by years.
Microsoft introduced Copilot+ PCs at Build 2024, a new hardware category with dedicated AI chips running 40+ trillion operations per second for on-device AI features like Recall and real-time translation.
OpenAI pulled the "Sky" voice from ChatGPT after Scarlett Johansson accused the company of mimicking her voice without consent, raising fresh questions about AI ethics and celebrity likeness rights.