There is a graphic that has been used to describe how much bigger GPT-5 will be than GPT-4: GPT-3 is portrayed as a Great White Shark, GPT-4 as an Orca, and GPT-5 as a Humpback Whale, indicating the massive increase in the data used to train them.
It is an interesting analogy in that it conveys the idea of scale, but it is doubly interesting when you think about what those creatures represent. The Great White (GPT-3) is the apex predator of the fish world, a creature that hasn't needed to evolve for millions of years because it is spectacular at its job; indeed it has only one real enemy: the members of the dolphin family, and particularly Orcas. The Humpback, meanwhile, eats krill and has on occasion fallen victim to Orcas.
In technology we've seen this battle play out before. It started with the likes of VMS, evolved into Windows NT, and has generally seen one architecture win out: Unix. Unix has a simple philosophy: create individual tools that each solve a specific problem, and provide a mechanism by which they can be chained together.
One of the ways that I get to look clever with new graduates is when I open up the Mac and run a command-line script that does in a few seconds something they were planning to write a bunch of code for, or even to do manually. I've had people say "I could do that in Python", and it's true, but the sheer power of find, grep and awk/sed is akin to magic for most people. That power is why Linux won the battle on the server side, it's the foundational power of macOS, and even Windows these days aspires to it.
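To make that concrete, here is a hedged sketch of the kind of task I mean: counting which log files under a directory contain errors, sorted by how noisy they are. The logs/ directory and the ERROR pattern are hypothetical; the point is the shell one-liner in the comment versus the Python you would write by hand.

```python
# The shell version, chaining "do one thing well" tools (hypothetical paths and pattern):
#   find logs/ -name '*.log' | xargs grep -c 'ERROR' | sort -t: -k2 -rn
#
# The roughly equivalent Python, written out by hand:
from pathlib import Path

counts = {}
for path in Path("logs").rglob("*.log"):                     # find logs/ -name '*.log'
    with open(path, errors="ignore") as f:
        counts[path] = sum("ERROR" in line for line in f)    # grep -c 'ERROR'

for path, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):  # sort -rn
    print(f"{path}: {n}")
```

Both get the job done; one of them took a dozen lines and a file of its own.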
This is not a one-off battle either. The rise of distributed systems, SOA, REST and the like was a reaction to the unmaintainable monolithic systems of the Mainframe era; these new approaches split systems into individual parts that could be developed and scaled independently. Technologies like Docker and Kubernetes built on virtualization to bring the same idea to deployment. In data we went from the mythical great big database in the sky, with a single schema with which everyone would comply, to distributed data meshes where each area of the business gets its own view of its own data, bringing in what it needs and collaborating by publishing outcomes.
Over in the processor space we've seen great big single-core processors running at maximum GHz replaced by multi-core, multi-purpose designs with different cores for different purposes. Even package vendors, one of the bastions of centralization, these days aim to deconstruct their solutions and enable more federated deployments and configurations; not always successfully, but the intention is there.
So the battle between monolith and federation is one that has been fought over and over within technology, and almost without exception the winner has been federation and specialization: the idea of doing one thing well and then collaborating to drive outcomes. Even where centralization and monoliths have more or less won, they have won only within very specific areas.
Making lots of small models or intent applications?
One of the strengths of this approach is that Unix has always been great at fitting the form factor of edge and lower-powered devices. There is a good reason the Raspberry Pi has multiple download options, the smallest of which is 500MB, and even that is massive compared with a tuned build of your own, which with a bunch of effort you can drop below 10MB.
Will we start to see this with foundation models? Not only getting the "Small (but big)", "Medium (but huge)" and "Absolutely massive (agreed)" options we have with the likes of Llama 3.1, but getting, or being able to create through fine-tuning or another technique, even smaller, more purpose-specific models? When I'm building solutions right now I'm tending to do exactly this, but without the benefit of smaller models: I've got one larger model that I'm using in five different ways, each of which exploits only a fraction of its capabilities. That means a significant overhead in power and water consumption, and of course in cost.
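As a hedged illustration of the difference (assuming the Hugging Face transformers pipeline API, with the model names as hypothetical placeholders rather than real checkpoints), the current pattern versus the federated one looks something like this:

```python
from transformers import pipeline

# Today: one large general model, prompted several different ways
# ("my-org/large-general-model" is a placeholder, not a real checkpoint).
big = pipeline("text-generation", model="my-org/large-general-model")

def summarise(text):
    return big(f"Summarise the following text:\n{text}")[0]["generated_text"]

def classify_sentiment(text):
    return big(f"Classify the sentiment (positive/negative) of:\n{text}")[0]["generated_text"]

# The federated alternative: smaller, purpose-specific models, one per task
# (again, the model names are hypothetical stand-ins for fine-tuned variants).
summariser = pipeline("summarization", model="my-org/small-summariser")
sentiment  = pipeline("sentiment-analysis", model="my-org/small-sentiment")

# summariser(text) and sentiment(text) each do one thing well,
# and each can be sized and deployed to fit its job.
```

The point is not the specific API but that each smaller model can be sized, tuned and deployed for one job, rather than one large model carrying all five at full cost.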
As models get deployed to edge devices such as mobile phones this challenge will only increase: running on-device protects privacy and removes network latency, but those devices are certainly not capable of running a 405 billion or 2 trillion parameter model.
Is "Intents" the new "thing"?
Apple's new approach of "Intents" in Apple Intelligence gives us an interestingly different idea of how apps can focus on tasks while still leveraging the capabilities of models, and in turn how different applications can leverage each other through that interface, with the model clearly focused on enabling collaboration across apps rather than within a single app, even if it is all backed by a single platform-optimized model. So you might have a video processing application, a social media application, a generative image application and the camera, and be able to combine them using an orchestration command or an additional application; even if you swap out the generative image application the solution still works, because the intent remains the same. It is early days for this, but it represents a new way of thinking about building collaborative AI applications.
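Apple's actual framework is App Intents in Swift, but the shape of the idea can be sketched independently of it. The following Python is purely illustrative (none of these names are Apple's API): apps register a handler against a named intent, and the orchestration targets the intent rather than a specific app.

```python
# Purely illustrative (not Apple's API): apps register handlers for a named
# intent, and the orchestrator calls the intent, not a specific app.
from typing import Callable, Dict

intent_registry: Dict[str, Callable[..., object]] = {}

def register_intent(name: str):
    def wrap(handler: Callable[..., object]):
        intent_registry[name] = handler
        return handler
    return wrap

@register_intent("generate_image")
def app_a_generate(prompt: str):
    return f"[App A image for '{prompt}']"

def perform(intent: str, **kwargs):
    return intent_registry[intent](**kwargs)

# If App A is replaced by App B, re-registering "generate_image" is enough:
# the orchestration calling perform("generate_image", prompt=...) is unchanged.
print(perform("generate_image", prompt="a humpback whale"))
```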
Will it be one thing well or Linux on the Desktop?
One of the longest running jokes in IT has been "this is the year of Linux on the Desktop", because while Linux has been a massive success on servers, it's been a flop on desktops. So it is possible that a single-model approach could be the future, but I really doubt it. I think we will need "appropriately sized" AIs that we as people can manage, that clearly define their purpose, and that can be engaged in collaboration.
It is much easier to do one thing well than try and do everything well in one way.
