Horizon Fund Update: America's Next Top Model
Humans are extremely good at adapting. The idea of ChatGPT would have seemed fanciful as recently as 2021, as would the idea that, with no programming knowledge, you could conjure up a working app or video game consisting of hundreds of lines of code from a single one-sentence prompt in natural language.
There were two years between OpenAI's GPT-3 and GPT-4, and now, two years after that, we finally have GPT-5. The level of improvement between generations has accelerated, but to many it feels like the opposite, because we have become accustomed to interim model updates nearly every month, with frontier labs constantly leap-frogging each other to claim the top spot. Plotted on a graph, progress is still on an exponential curve in the key metrics that matter to those bought into the AI bull case.
OpenAI has had a busy week, not only re-establishing America's lead over China on the open-source AI model leaderboards (or at least drawing even), but also displacing Grok 4 at the frontier after less than a month at the top.
OpenAI's 120B parameter open-source model edges out Alibaba's Qwen 3 30B on intelligence, but sits below the larger 235B version. It scores higher than DeepSeek's R1, despite having 5x fewer parameters
Source: Artificial Analysis
GPT-5 only just takes the top spot over the other frontier models released in recent months, but is a huge leap over the original GPT-4 released in March 2023
Source: Artificial Analysis; Green Ash Partners. Artificial Analysis Intelligence Index v2.2 incorporates 8 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, IFBench, AA-LCR
Plotting model releases from the four frontier labs shows AI's advance as a continuum, rather than single leaps with each new model generation. xAI has had the steepest rate of improvement
Source: Artificial Analysis; Green Ash Partners. Artificial Analysis Intelligence Index v2.2 incorporates 8 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, IFBench, AA-LCR
It's clear from the chart above that there isn't much daylight between the four models at the frontier in terms of intelligence. This is partly due to benchmark saturation - the widely used public evals have been largely conquered, and today's leading models are essentially PhD-level experts in every domain. But they are also all of a similar generation in terms of compute: GPT-5 is rumoured to have been trained on 180-200k H100 GPUs - about the same as the 200k cluster used to train Grok 4.
There is one crucial point of difference, though: OpenAI seems to have made significant progress on reducing hallucinations. Not only does this immediately increase model utility in higher-value knowledge work (especially healthcare, law and financial services), but it also brings forward the arrival of AI agents, the highest-value unlock of all. Error rates compound exponentially over multiple steps, so reducing them is critical to realising the potential of asynchronous AI agents performing tasks over longer time horizons.
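To see why error rates matter so much for agents, here is a minimal sketch of the compounding arithmetic. The per-step reliability figures and the 50-step task are hypothetical illustrations, not numbers from OpenAI:

```python
# Illustrative only: how per-step error rates compound over a multi-step agent task.
# The reliability figures and step count below are hypothetical, not published numbers.
def task_success(per_step_success: float, steps: int) -> float:
    """Probability an agent completes every step without an error,
    assuming independent, identically reliable steps."""
    return per_step_success ** steps

for reliability in (0.98, 0.99, 0.999):
    print(f"{reliability:.1%} per step -> {task_success(reliability, 50):.1%} over 50 steps")
```

Under these assumptions, a 2% per-step error rate leaves only about a one-in-three chance of completing a 50-step task, whereas a 0.1% error rate keeps success above 95% - which is why seemingly small hallucination reductions translate into large gains in agent usefulness.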
GPT-5 has significantly fewer hallucinations than OpenAI's previous state of the art model
Source: Scala
GPT-5 is comparable to or better than human experts in roughly half of cases on OpenAI's internal benchmark measuring performance on complex, economically valuable knowledge work (spanning over 40 occupations including law, logistics, sales and engineering)
Source: OpenAI
GPT-5 sits on the exponential trendline identified by METR, in which the length of tasks AI models can complete doubles roughly every 7 months
Source: METR
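The METR trendline above is simple to state in code. This is a sketch of the doubling arithmetic only; the starting task horizon of 1 hour is an assumed baseline for illustration, not a METR figure:

```python
# Sketch of METR's reported trend: the length of tasks AI models can complete
# doubles roughly every 7 months. The 1-hour starting horizon is an assumption.
def task_horizon_hours(months_elapsed: float, start_hours: float = 1.0,
                       doubling_months: float = 7.0) -> float:
    """Task horizon after a given number of months on the doubling trendline."""
    return start_hours * 2 ** (months_elapsed / doubling_months)

print(task_horizon_hours(42))  # 42 months = six doublings -> 64x the starting horizon
```

On this curve, three and a half years of progress multiplies the feasible task length by 64x, which is what makes the trend so consequential for long-horizon agents.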
There is now a cohort of knowledge workers who have integrated AI into their work and pay close attention to new model releases - constantly probing the jagged frontier of intelligence and testing each release to see what new capabilities might have emerged to further augment their productivity.
But this is a relatively small subset of the world's c.1 billion knowledge workers, or of ChatGPT's 700 million users, the vast majority of whom have only played around with the ChatGPT 4o base model. For this majority, GPT-5 will be a major revelation, necessitating an update to their priors on what AI can do today. Even amongst the early adopters, few will have explored the very latest models like o3 Pro or Gemini 2.5 Deep Think, which use parallel compute, or even Grok 4, as these are all locked behind ~$200 per month subscription tiers.
This is perhaps the largest update of all - not GPT-5's performance on this or that benchmark, or its progress along the time-horizon curve towards AI agents, but the sudden availability of frontier AI to hundreds of millions of people, for free. This feat was only possible due to massive AI datacentre investments over the last year - one of OpenAI's infrastructure engineers tweeted that OpenAI has built 60+ clusters in the last 60 days, adding 200k GPUs. OpenAI's total compute has increased 15x this year versus 2024.
And for investors, the takeaway is that everyone is still compute-constrained. From hyperscalers like Microsoft Azure, Google Cloud and AWS, to frontier AI research labs, to neoclouds like Nebius - all could go faster and do more were there not bottlenecks in chips, energy and the time it takes to build the physical structures that house AI servers. It is too early to say whether GPT-5 represents the cost/performance trade-off needed to begin truly reshaping the economy, but it is certainly a major step along that path. And with each step comes higher conviction in the need for massive AI datacentre capacity, and the energy to power it.