Google’s Gemini AI has received a next-generation upgrade, enabling it to process larger prompts more effectively.

At the time of writing, Google’s Gemini AI has existed for only two months, yet the company is already introducing its next-generation model, Gemini 1.5.

The announcement goes deep into technical detail, but the key takeaway is that Gemini 1.5 promises a significant performance improvement. Google attributes it to a “Mixture-of-Experts architecture” (MoE), in which the model is divided into smaller expert networks and only the most relevant ones are activated for a given input. According to Google, this structural change makes Gemini both easier to train and faster at learning complex tasks.
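To make the Mixture-of-Experts idea more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is not Google’s implementation, and the details of Gemini 1.5’s architecture are not public. The names `make_expert` and `moe_forward`, the toy "experts" that simply scale their input, and the hand-picked gate weights are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: multiplies every input value by a fixed scale."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs."""
    # The gate scores each expert with a dot product against the input.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Only the top_k experts run; the rest are skipped entirely.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    mix = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, idx in zip(mix, chosen):
        expert_out = experts[idx](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


# Illustrative setup: four toy experts and hand-picked gate weights.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("mixed output:", output)
```

The point of the sketch is that only two of the four experts do any work for this input, which is where the efficiency of the approach comes from: the total model can be large while the compute per input stays small.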


While there are plans to deploy the upgrade across all three major versions of the AI, the version available today for early testing is Gemini 1.5 Pro.

What sets it apart is “a context window of up to 1 million tokens.” In generative AI, tokens are the smallest units of data that LLMs (large language models) use “to process and generate text.” A larger context window lets the AI handle more information at once. A million tokens is far more than GPT-4 Turbo offers: OpenAI’s model has a context window limit of 128,000 tokens, roughly one-eighth the size.
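The gap between the two context windows can be put into numbers with a short sketch. The two window sizes come from the figures quoted above; the 4-characters-per-token ratio is only a rough rule of thumb for English text and is an assumption here, as are the helper names `estimate_tokens` and `fits_in_window`. Real token counts depend on each model’s tokenizer.

```python
import math

GEMINI_15_PRO_WINDOW = 1_000_000  # tokens, as quoted in the article
GPT4_TURBO_WINDOW = 128_000       # tokens, as quoted in the article
CHARS_PER_TOKEN = 4               # assumption: rough average for English text


def estimate_tokens(text):
    """Very rough token estimate based on character count (an assumption)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(text, window):
    """Check whether the estimated token count fits in a context window."""
    return estimate_tokens(text) <= window


ratio = GEMINI_15_PRO_WINDOW / GPT4_TURBO_WINDOW
print(f"Gemini 1.5 Pro's window is {ratio:.1f}x the size of GPT-4 Turbo's")

# A stand-in document of 2 million characters (about 500,000 estimated tokens).
document = "a" * 2_000_000
print("fits in Gemini 1.5 Pro:", fits_in_window(document, GEMINI_15_PRO_WINDOW))
print("fits in GPT-4 Turbo:", fits_in_window(document, GPT4_TURBO_WINDOW))
```

Under these assumptions, a document of that size would fit in a single Gemini 1.5 Pro prompt but would have to be split into several pieces for a 128,000-token window.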

Gemini Pro in Action

Figures aside, what does Gemini 1.5 Pro look like in action? Google has released several videos demonstrating the AI’s capabilities, showing how the upgraded model can analyze and summarize large amounts of text based on a given prompt.

In one instance, they provided Gemini 1.5 Pro with the transcript of the Apollo 11 moon mission, spanning over 400 pages. This demonstration illustrated the AI’s ability to “understand, reason about, and identify” specific details within the document. Prompted to locate “comedic moments” during the mission, Gemini 1.5 Pro swiftly identified jokes cracked by the astronauts while in space, attributing them to the respective individuals and offering explanations for any references made, all within a mere 30 seconds.

These analytical capabilities are applicable across various mediums. In another demonstration, the development team presented the AI with a 44-minute Buster Keaton movie. They provided a rough sketch of a gushing water tower and then queried the AI for the timestamp of a scene featuring a water tower. Remarkably, the AI accurately pinpointed the exact moment ten minutes into the film. It’s noteworthy that this was achieved without any additional explanation about the drawing or any accompanying text beyond the question. Gemini 1.5 Pro was able to discern that it depicted a water tower without requiring additional assistance.

Experimental Technology

The model is not yet available to the general public. It is currently offered as an early preview to “developers and enterprise customers” through Google’s AI Studio and Vertex AI platforms at no cost. However, the company warns testers that they may experience longer latency while the model is still experimental, and says it plans to improve speeds in the future.

We have contacted Google to inquire about the expected launch timeline for Gemini 1.5 and Gemini 1.5 Ultra, as well as the broader release of these next-generation AI models. This article will be updated with further information in due course. Meanwhile, you can explore TechRadar’s compilation of the best AI content generators for 2024.
