Google’s Gemini may not help it reclaim its lost ‘AI crown’


There is a rather telling story behind why Google’s ‘ChatGPT-killer’ was christened Gemini. Jeff Dean, Chief Scientist at Google and the former head of Google Brain, explained on X: “The Gemini effort came about because we had different teams working on language modelling, and we knew we wanted to start to work together. The twins are the folks in the legacy Brain team (many from the PaLM/PaLM-2 effort) and the legacy DeepMind team (many from the Chinchilla effort) that started to work together on the ambitious multimodal model project we called Gemini, eventually joined by many people from all across Google.” (https://bit.ly/46XO1sk).

Gemini is Latin for twins, and so it is an apt name. It is also an apt pointer to the struggles Google has had in creating a Large Language Model (LLM) that rivals upstart OpenAI’s GPT4. It sometimes surprises people that it was Google which ‘invented’ Generative AI and LLMs, or more precisely the Transformer architecture which underpins this technology. The concept behind Transformers was revealed in a seminal 2017 paper, ‘Attention Is All You Need’, written largely by researchers at Google Brain. Besides Brain, Google also owns the deep learning powerhouse DeepMind, making it head and shoulders above the rest in Artificial Intelligence. However, Google chose not to take the LLM discovery forward with the aggression and intent it is known for, and it was OpenAI which picked up the ball and carried it over the line. The reason could be the famous Innovator’s Dilemma, with Google fearing cannibalisation of its fabled Search business, or the potential reputational risk from LLMs. It could also be fraternal rivalry between the two mighty AI twins it fathered: Jeff Dean’s Google Brain and Demis Hassabis’ DeepMind. Eventually, Google merged the two, made Hassabis the single leader, and set out to build Gemini.

The project to build it has emerged as the single largest product effort by Google in recent times, with Google aiming to take back the AI crown from OpenAI and assorted other competitors like Meta, Inflection and Anthropic. Not to mention arch-competitor Microsoft, which is delivering enterprise products at lightning speed and scale by leveraging its tight relationship with OpenAI, with CEO Satya Nadella taunting Google by making it “dance.” Google’s earlier release, Bard, powered by PaLM, underwhelmed and left it with egg on its face. Google has, therefore, put everything it has into Gemini: two thousand engineers, including founder Sergey Brin, rolled up their sleeves to code and build the product.

Launch day demos and announcements were impressive. Gemini is multimodal from the word go, working seamlessly with pictures, videos, text, and voice. Its Nano version is built for mobile and can work on-device without depending too much on the cloud; Google announced its integration into the next version of its Pixel phone. Gemini Pro is already integrated into Bard. Gemini also might have reasoning and planning capabilities, which could be built into advanced personal assistants. The Ultra version generated some buzz, with Google claiming that it beats GPT4 hollow in some benchmark tests and is better at generating computer code and summarising news articles.

However, some of this was debunked the very next day. Google admitted that parts of its launch video were ‘staged,’ with some of the demos recorded rather than live! Google had also used different, atypical measures to claim performance superior to GPT4, so that claim, too, may not hold. Finally, Gemini Ultra will be available only next year, by which time GPT4.5 or 5 might be out, moving the goalposts for Google yet again. I wouldn’t write Google off: it has enormous strengths in AI, with a phalanx of top talent, compute and money across its merged entities. It has formidable distribution strengths in Chrome, Search and YouTube. And the timing is great, given the internal turmoil at OpenAI and apprehensions about whether its key talent will stay.

The other noticeable aspect was how the Gemini announcement seemed like just one more Big Tech launch, like an iPhone 15 Plus, Pro and Pro Max. As every tech company battles to launch its own LLM, Matteo Wong of The Atlantic calls it “Generative AI’s iPhone moment,” where “generative AI is more about competition than revolution.” It feels like déjà vu, where, as Wong says, it is an open question “whether the generative-AI race prompts genuine societal transformation or simply provides a new profit model” for Big Tech (https://bit.ly/41sHyED).

While Gemini’s Latin origin is the twins, it has a Babylonian history too. In Babylonian astronomy, the stars Pollux and Castor were known as the Great Twins. Their Babylonian names, Lugal-irra and Meslamta-ea, meant ‘Mighty King’ and ‘The One who has arisen from the Underworld’. With Gemini, GPT4 and other mighty GenAI models, it remains to be seen what they will be: mighty kings of technology benefitting society, or destroyers of worlds.

