|
Green Ash Horizon Fund Monthly Factsheet - February 2025
|
|
The Horizon Fund’s USD IA shareclass returned -8.81% in February (GBP IA -8.72%, AUD IA -8.83%), versus -0.72% for the MSCI World (M1WO).
- The velocity and volatility of policy messaging from the White House are difficult for market participants to navigate. One positive to come out of this is the political will for a more united Europe, with an increasing commitment to self-sufficiency in key areas like defence and AI.
- With the bulk of 4Q24 earnings now behind us, the US remains the main beacon of corporate earnings growth in developed markets. The S&P 500 has shown sales/EPS growth of +5%/+13% YoY respectively, versus +4%/-3% YoY for the Stoxx 600.
Please click below for monthly factsheet and commentary:
|
|
|
|
Source: Bloomberg; Green Ash Partners. The Green Ash Horizon Strategy track record runs from 30/11/17 to 08/07/21. Fund performance is reported from 09/07/21 launch onwards (USD IA: LU2344660977; performance of other share classes on page 3). Strategy track record based on managed account held at Interactive Brokers Group Inc. Performance calculated using Broadridge Paladyne Risk Management software. Performance has not been independently audited and is for illustrative purposes only. Past performance is no guarantee of current or future returns and you may consequently get back less than you invested. Benchmark used is the M1WO Index.
|
|
|
Here are some tidbits on the themes.
|
|
- xAI released Grok 3, which is extremely good, not least because of its tight integration with X and web search. It is the largest model out there, having been trained on xAI's 100k H100 cluster in Memphis, and has had some of the reinforcement learning treatment in post-training to add reasoning capabilities. It was released only a month or so after pre-training was completed, whereas other labs spend several months on post-training refinements, so we would expect lots of incremental improvement over the coming months.
- OpenAI finally released GPT-4.5, with the qualification that it is 'not a frontier model'. This was taken as an admission that scaling pre-training had started to see diminishing returns, but this isn't quite correct. Rather, scaling pre-training has become extremely expensive, while reinforcement learning/scaling test-time compute in reasoning models is now the low-hanging fruit for advancing model capabilities.
- GPT-4.5 was trained with 10x the compute of the original GPT-4 model that was released in March 2023. This puts it at around 2.1e26 FLOPs, with probably 5x the parameters and 2x the training dataset (a rough compute sketch follows below). The jump between GPT-3.5 and GPT-4 was a leap from good to great. Now we are moving from great to really great between model generations, which is making them much harder to evaluate. One key area of improvement is that larger models show higher accuracy and hallucinate less, which is crucial for the age of agents and embodied AI.
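As a back-of-the-envelope check of the '10x compute' claim, the sketch below uses the standard dense-transformer approximation of training FLOPs ≈ 6 × parameters × tokens. The 5x/2x multipliers come from the bullet above; the absolute figures are illustrative assumptions rather than disclosed specifications.

```python
# Rough compute-scaling check (illustrative assumptions, not disclosed specs)
def train_flops(params: float, tokens: float) -> float:
    # Standard approximation for dense transformers: ~6 FLOPs per parameter per token
    return 6 * params * tokens

# Relative scaling: ~5x the parameters and ~2x the training tokens
multiple = train_flops(params=5.0, tokens=2.0) / train_flops(params=1.0, tokens=1.0)
print(f"Compute multiple: {multiple:.0f}x")  # 10x

# Anchoring to the ~2.1e26 FLOPs cited for GPT-4.5 implies ~2.1e25 FLOPs for the original GPT-4
print(f"Implied GPT-4 compute: {2.1e26 / multiple:.1e} FLOPs")
```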
|
|
|
GPT-4.5 and Grok 3 fit on the logarithmic compute trendline, where pre-training FLOPs increase by ~10x every two years
|
|
|
|
Source: Green Ash Partners
|
|
|
GPT-4.5 has higher accuracy rates (blue) and fewer hallucinations (green) than GPT-4o, as well as the o1/o3 reasoning models that use GPT-4o as a base
|
|
|
|
Source: Green Ash Partners
|
|
|
GPT-4.5's most important role is to be the base for future reasoning models
|
|
|
|
Source: Epoch AI, Peter Gostev
|
|
- At the other end of the scale, Google reclaimed the open-source model crown from DeepSeek with the Gemma 3 family of models. The largest version is just 27B parameters - far smaller than DeepSeek's V3 base model. It also has image and video understanding, which DeepSeek's models lack (a rough memory sketch follows below)
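To give a sense of why a 27B-parameter model fits on a single GPU, the sketch below estimates the weight memory footprint at a few common precisions. Actual requirements vary with quantization scheme, context length and KV-cache size, so these are order-of-magnitude figures only.

```python
# Rough weight-memory footprint for a 27B-parameter model at common precisions
# (order-of-magnitude only; runtime needs extra memory for activations and KV cache)
PARAMS = 27e9  # Gemma 3 27B

for precision, bytes_per_param in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    weights_gb = PARAMS * bytes_per_param / 1e9
    print(f"{precision:>5}: ~{weights_gb:.0f} GB of weights")

# bf16: ~54 GB (needs an 80 GB-class accelerator such as an H100)
# int8: ~27 GB; int4: ~14 GB (within reach of a single high-end consumer GPU)
```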
|
|
|
Gemma 3 27B outperforms DeepSeek's V3 base model, and can run on a single GPU
|
|
|
|
Source: Google DeepMind
|
|
- There has also been significant progress in embodied AI. Figure exited its partnership with OpenAI and announced Helix - a vision-language-action (VLA) model built on an open-source base vision-language model that can run on-board its humanoid robot and generalise to unseen tasks and objects
- Google DeepMind announced Gemini Robotics, demonstrating that its natively multimodal Gemini 2 frontier model family can also act as VLA models. The models enable robots to perform sophisticated tasks such as folding origami, packing lunches, and playing tic-tac-toe, showing enhanced dexterity, reasoning, and the ability to learn from experience
|
|
|
Figure's Helix: a vision-language-action model running on-board humanoid robots
|
|
|
|
Figure
|
|
|
Gemini Robotics: Bringing AI to the Physical World
|
|
|
|
Google DeepMind
|
|
- January consumer spending fell on an inflation-adjusted basis - the first drop in over two years
- Google disclosed that it now serves more than 5 trillion searches per year (up from 2 trillion when it last disclosed the figure, in 2016). This comes as the company rolls out a full 'AI Mode' to testers, combining Gemini 2 with Google's knowledge graph and ranking algorithms to potentially disintermediate websites altogether - a huge change to the way information on the internet is served to users
|
|
- Ozempic-maker Novo Nordisk has been using Claude to draft clinical study reports. It used to take a team of 50 people 15 weeks to draft these documents, and this has now been reduced to a team of 3 taking a few days (a few minutes for Claude to write the report and a few days to check for errors)
- The Arc Institute, in partnership with NVIDIA, released Evo 2, a genome modelling and design foundation model spanning the entire tree of life. While the training dataset of 9.3 trillion base pairs and the 40B/7B parameter model sizes are small compared to frontier language models, Evo 2 started to show emergent capabilities - for example, the model was able to classify previously unknown BRCA1 gene variants. Mutations in this gene frequently cause breast/ovarian cancer
- At 2.25e24 FLOPs, Evo 2 40B is at the GPT-3 stage in terms of training compute - the cusp of when LLMs went from OK (and not very useful) to great (and quite useful). We may be one model generation away from the ChatGPT moment in biology (a quick compute check follows below)
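As a sanity check on the cited ~2.25e24 FLOPs, the sketch below applies the standard dense-transformer approximation of FLOPs ≈ 6 × parameters × tokens, treating one base pair as roughly one token. Evo 2 is not a vanilla transformer, so this is only an order-of-magnitude check.

```python
# Order-of-magnitude check on Evo 2 40B's training compute
# using the standard approximation: FLOPs ≈ 6 * parameters * training tokens
params = 40e9      # Evo 2 40B parameters
tokens = 9.3e12    # 9.3 trillion base pairs, treated as ~1 token each

flops = 6 * params * tokens
print(f"~{flops:.2e} FLOPs")  # ~2.23e+24, consistent with the ~2.25e24 cited above
```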
|
|
|
The pace of scaling AI models in biology is a bit slower than in language models, at 2-4x per year - they are also about 100x smaller than the LLM frontier in terms of pre-training compute
|
|
|
|
Source: Epoch AI
|
|
- There is a lot of talk about nuclear power, particularly in the US. Just this week Amazon, Google and Meta joined leading companies in energy-intensive sectors to call for a tripling of nuclear power generation by 2050. What is missing is action, especially on the public policy/regulatory front
- The average age of a nuclear power station in the United States is 41 years. Most reactors in the advanced economies operate under 40-year licences - many will either shut down or need to undergo lifetime-extension projects within the next decade to remain operational
|
|
|
More than half of US nuclear capacity is over 40 years old, while nearly all of China's is less than 20
|
|
|
|
Source: IEA
|
|