Google AMIE: AI doctor learns to ‘see’ medical images (2 May 2025)

Google is giving its diagnostic AI the ability to understand visual medical information with its latest research on AMIE (Articulate Medical Intelligence Explorer).

Imagine chatting with an AI about a health concern, and instead of just processing your words, it could actually look at the photo of that worrying rash or make sense of your ECG printout. That’s what Google is aiming for.

We already knew AMIE showed promise in text-based medical chats, thanks to earlier work published in Nature. But let’s face it, real medicine isn’t just about words.

Doctors rely heavily on what they can see – skin conditions, readings from machines, lab reports. As the Google team rightly points out, even simple instant messaging platforms “allow static multimodal information (e.g., images and documents) to enrich discussions.”

Text-only AI was missing a huge piece of the puzzle. The big question, as the researchers put it, was “whether LLMs can conduct diagnostic clinical conversations that incorporate this more complex type of information.”

Google teaches AMIE to look and reason

Google’s engineers have beefed up AMIE using their Gemini 2.0 Flash model as the brains of the operation. They’ve combined this with what they call a “state-aware reasoning framework.” In plain English, this means the AI doesn’t just follow a script; it adapts its conversation based on what it’s learned so far and what it still needs to figure out.

It’s close to how a human clinician works: gathering clues, forming ideas about what might be wrong, and then asking for more specific information – including visual evidence – to narrow things down.

“This enables AMIE to request relevant multimodal artifacts when needed, interpret their findings accurately, integrate this information seamlessly into the ongoing dialogue, and use it to refine diagnoses,” Google explains.

Think of the conversation flowing through stages: first gathering the patient’s history, then moving towards diagnosis and management suggestions, and finally follow-up. The AI constantly assesses its own understanding, asking for that skin photo or lab result if it senses a gap in its knowledge.
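Google hasn’t published the implementation, but as a purely conceptual sketch of that state-aware loop (every function and object name below is hypothetical, not Google’s code), the flow might be structured roughly like this:

```python
# Conceptual sketch only: a state-aware diagnostic dialogue loop as described above.
# `llm` and `patient` stand in for an LLM wrapper and a chat participant;
# every method name here is hypothetical.

def run_consultation(llm, patient):
    state = {"findings": [], "differential": []}
    for phase in ["history_taking", "diagnosis_and_management", "follow_up"]:
        while not llm.phase_complete(phase, state):
            gaps = llm.assess_gaps(phase, state)
            if gaps.wants_artifact:
                # e.g. ask for the photo of a rash or an ECG printout
                artifact = patient.upload(gaps.artifact_type)
                state["findings"].append(llm.interpret(artifact))
            else:
                answer = patient.answer(llm.next_question(phase, state))
                state["findings"].append(answer)
            # refine the ranked list of possible conditions as evidence accrues
            state["differential"] = llm.update_differential(state)
    return state["differential"]
```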

To get this right without endless trial-and-error on real people, Google built a detailed simulation lab.

Google created lifelike patient cases, pulling realistic medical images and data from sources like the PTB-XL ECG database and the SCIN dermatology image set, adding plausible backstories using Gemini. Then, they let AMIE ‘chat’ with simulated patients within this setup and automatically check how well it performed on things like diagnostic accuracy and avoiding errors (or ‘hallucinations’).

The virtual OSCE: Google puts AMIE through its paces

The real test came in a setup designed to mirror how medical students are assessed: the Objective Structured Clinical Examination (OSCE).

Google ran a remote study involving 105 different medical scenarios. Real actors, trained to portray patients consistently, interacted either with the new multimodal AMIE or with actual human primary care physicians (PCPs). These chats happened through an interface where the ‘patient’ could upload images, just like you might in a modern messaging app.

Afterwards, specialist doctors (in dermatology, cardiology, and internal medicine) and the patient actors themselves reviewed the conversations.

The human doctors scored everything from how well history was taken, the accuracy of the diagnosis, the quality of the suggested management plan, right down to communication skills and empathy—and, of course, how well the AI interpreted the visual information.

Surprising results from the simulated clinic

Here’s where it gets really interesting. In this head-to-head comparison within the controlled study environment, Google found AMIE didn’t just hold its own—it often came out ahead.

The AI was rated as being better than the human PCPs at interpreting the multimodal data shared during the chats. It also scored higher on diagnostic accuracy, producing differential diagnosis lists (the ranked list of possible conditions) that specialists deemed more accurate and complete based on the case details.

Specialist doctors reviewing the transcripts tended to rate AMIE’s performance higher across most areas. They particularly noted “the quality of image interpretation and reasoning,” the thoroughness of its diagnostic workup, the soundness of its management plans, and its ability to flag when a situation needed urgent attention.

Perhaps one of the most surprising findings came from the patient actors: they often found the AI to be more empathetic and trustworthy than the human doctors in these text-based interactions.

And, on a critical safety note, the study found no statistically significant difference between how often AMIE made errors based on the images (hallucinated findings) compared to the human physicians.

Technology never stands still, so Google also ran some early tests swapping out the Gemini 2.0 Flash model for the newer Gemini 2.5 Flash.

Using their simulation framework, the results hinted at further gains, particularly in getting the diagnosis right (Top-3 Accuracy) and suggesting appropriate management plans.

While promising, the team is quick to add a dose of realism: these are just automated results, and “rigorous assessment through expert physician review is essential to confirm these performance benefits.”

Important reality checks

Google is commendably upfront about the limitations here. “This study explores a research-only system in an OSCE-style evaluation using patient actors, which substantially under-represents the complexity… of real-world care,” they state clearly. 

Simulated scenarios, however well-designed, aren’t the same as dealing with the unique complexities of real patients in a busy clinic. They also stress that the chat interface doesn’t capture the richness of a real video or in-person consultation.

So, what’s the next step? Moving carefully towards the real world. Google is already partnering with Beth Israel Deaconess Medical Center for a research study to see how AMIE performs in actual clinical settings with patient consent.

The researchers also acknowledge the need to eventually move beyond text and static images towards handling real-time video and audio—the kind of interaction common in telehealth today.

Giving AI the ability to ‘see’ and interpret the kind of visual evidence doctors use every day offers a glimpse of how AI might one day assist clinicians and patients. However, the path from these promising findings to a safe and reliable tool for everyday healthcare is still a long one that requires careful navigation.

(Photo by Alexander Sinn)

See also: Are AI chatbots really changing the world of work?

Google introduces AI reasoning control in Gemini 2.5 Flash (23 April 2025)

Google has introduced an AI reasoning control mechanism for its Gemini 2.5 Flash model that allows developers to limit how much processing power the system expends on problem-solving.

Released on April 17, this “thinking budget” feature responds to a growing industry challenge: advanced AI models frequently overanalyse straightforward queries, consuming unnecessary computational resources and driving up operational and environmental costs.

While not revolutionary, the development represents a practical step toward addressing efficiency concerns that have emerged as reasoning capabilities become standard in commercial AI software.

The new mechanism enables precise calibration of processing resources before generating responses, potentially changing how organisations manage financial and environmental impacts of AI deployment.

“The model overthinks,” acknowledges Tulsee Doshi, Director of Product Management at Gemini. “For simple prompts, the model does think more than it needs to.”

The admission reveals the challenge facing advanced reasoning models – the equivalent of using industrial machinery to crack a walnut.

The shift toward reasoning capabilities has created unintended consequences. Where traditional large language models primarily matched patterns from training data, newer iterations attempt to work through problems logically, step by step. While this approach yields better results for complex tasks, it introduces significant inefficiency when handling simpler queries.

Balancing cost and performance

The financial implications of unchecked AI reasoning are substantial. According to Google’s technical documentation, when full reasoning is activated, generating outputs becomes approximately six times more expensive than standard processing. The cost multiplier creates a powerful incentive for fine-tuned control.

Nathan Habib, an engineer at Hugging Face who studies reasoning models, describes the problem as endemic across the industry. “In the rush to show off smarter AI, companies are reaching for reasoning models like hammers even where there’s no nail in sight,” he explained to MIT Technology Review.

The waste isn’t merely theoretical. Habib demonstrated how a leading reasoning model, when attempting to solve an organic chemistry problem, became trapped in a recursive loop, repeating “Wait, but…” hundreds of times – essentially experiencing a computational breakdown and consuming processing resources.

Kate Olszewska, who evaluates Gemini models at DeepMind, confirmed Google’s systems sometimes experience similar issues, getting stuck in loops that drain computing power without improving response quality.

Granular control mechanism

Google’s AI reasoning control provides developers with a degree of precision. The system offers a flexible spectrum ranging from zero (minimal reasoning) to 24,576 tokens of “thinking budget” – the computational units representing the model’s internal processing. The granular approach allows for customised deployment based on specific use cases.
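For developers, setting that budget is a single request parameter. The sketch below assumes the google-genai Python SDK and its ThinkingConfig option; the exact model identifier may differ from the one shown:

```python
# Sketch: capping Gemini 2.5 Flash's "thinking budget" for a simple query.
# Assumes the google-genai Python SDK; check Google's current docs for the
# exact model name and field names before relying on this.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",          # assumed identifier; may vary by release
    contents="What is 2 + 2?",         # trivial prompt: deep reasoning adds no value
    config=types.GenerateContentConfig(
        # 0 disables extended thinking; values up to 24,576 buy progressively more reasoning
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(response.text)
```

For a complex analysis, the same call could simply raise the budget toward the 24,576-token ceiling, trading cost for depth.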

Jack Rae, principal research scientist at DeepMind, says that defining optimal reasoning levels remains challenging: “It’s really hard to draw a boundary on, like, what’s the perfect task right now for thinking.”

Shifting development philosophy

The introduction of AI reasoning control potentially signals a change in how artificial intelligence evolves. Since 2019, companies have pursued improvements by building larger models with more parameters and training data. Google’s approach suggests an alternative path focusing on efficiency rather than scale.

“Scaling laws are being replaced,” says Habib, indicating that future advances may emerge from optimising reasoning processes rather than continuously expanding model size.

The environmental implications are equally significant. As reasoning models proliferate, their energy consumption grows proportionally. Research indicates that inferencing – generating AI responses – now contributes more to the technology’s carbon footprint than the initial training process. Google’s reasoning control mechanism offers a potential mitigating factor for this concerning trend.

Competitive dynamics

Google isn’t operating in isolation. The “open weight” DeepSeek R1 model, which emerged earlier this year, demonstrated powerful reasoning capabilities at potentially lower costs, triggering market volatility that reportedly caused nearly a trillion-dollar stock market fluctuation.

Unlike Google’s proprietary approach, DeepSeek makes its internal settings publicly available for developers to implement locally.

Despite the competition, Google DeepMind’s chief technical officer Koray Kavukcuoglu maintains that proprietary models will maintain advantages in specialised domains requiring exceptional precision: “Coding, math, and finance are cases where there’s high expectation from the model to be very accurate, to be very precise, and to be able to understand really complex situations.”

Industry maturation signs

The development of AI reasoning control reflects an industry now confronting practical limitations beyond technical benchmarks. While companies continue to push reasoning capabilities forward, Google’s approach acknowledges an important reality: efficiency matters as much as raw performance in commercial applications.

The feature also highlights tensions between technological advancement and sustainability concerns. Leaderboards tracking reasoning model performance show that single tasks can cost upwards of $200 to complete – raising questions about scaling such capabilities in production environments.

By allowing developers to dial reasoning up or down based on actual need, Google addresses both financial and environmental aspects of AI deployment.

“Reasoning is the key capability that builds up intelligence,” states Kavukcuoglu. “The moment the model starts thinking, the agency of the model has started.” The statement reveals both the promise and the challenge of reasoning models – their autonomy creates both opportunities and resource management challenges.

For organisations deploying AI solutions, the ability to fine-tune reasoning budgets could democratise access to advanced capabilities while maintaining operational discipline.

Google claims Gemini 2.5 Flash delivers “comparable metrics to other leading models for a fraction of the cost and size” – a value proposition strengthened by the ability to optimise reasoning resources for specific applications.

Practical implications

The AI reasoning control feature has immediate practical applications. Developers building commercial applications can now make informed trade-offs between processing depth and operational costs.

For simple applications like basic customer queries, minimal reasoning settings preserve resources while still using the model’s capabilities. For complex analysis requiring deep understanding, the full reasoning capacity remains available.

Google’s reasoning ‘dial’ provides a mechanism for establishing cost certainty while maintaining performance standards.

See also: Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date

DolphinGemma: Google AI model understands dolphin chatter (14 April 2025)

Google has developed an AI model called DolphinGemma to decipher how dolphins communicate and one day facilitate interspecies communication.

The intricate clicks, whistles, and pulses echoing through the underwater world of dolphins have long fascinated scientists. The dream has been to understand and decipher the patterns within their complex vocalisations.

Google, collaborating with engineers at the Georgia Institute of Technology and leveraging the field research of the Wild Dolphin Project (WDP), has unveiled DolphinGemma to help realise that goal.

Announced around National Dolphin Day, the foundational AI model represents a new tool in the effort to comprehend cetacean communication. Trained specifically to learn the structure of dolphin sounds, DolphinGemma can even generate novel, dolphin-like audio sequences.

Over decades, the Wild Dolphin Project – operational since 1985 – has run the world’s longest continuous underwater study of dolphins to develop a deep understanding of context-specific sounds, such as:

  • Signature “whistles”: Serving as unique identifiers, akin to names, crucial for interactions like mothers reuniting with calves.
  • Burst-pulse “squawks”: Commonly associated with conflict or aggressive encounters.
  • Click “buzzes”: Often detected during courtship activities or when dolphins chase sharks.

WDP’s ultimate goal is to uncover the inherent structure and potential meaning within these natural sound sequences, searching for the grammatical rules and patterns that might signify a form of language.

This long-term, painstaking analysis has provided the essential grounding and labelled data crucial for training sophisticated AI models like DolphinGemma.

DolphinGemma: The AI ear for cetacean sounds

Analysing the sheer volume and complexity of dolphin communication is a formidable task ideally suited for AI.

DolphinGemma, developed by Google, employs specialised audio technologies to tackle this. It uses the SoundStream tokeniser to efficiently represent dolphin sounds, feeding this data into a model architecture adept at processing complex sequences.

Based on insights from Google’s Gemma family of lightweight, open models (which share technology with the powerful Gemini models), DolphinGemma functions as an audio-in, audio-out system.

Fed with sequences of natural dolphin sounds from WDP’s extensive database, DolphinGemma learns to identify recurring patterns and structures. Crucially, it can predict the likely subsequent sounds in a sequence—much like human language models predict the next word.

With around 400 million parameters, DolphinGemma is optimised to run efficiently, even on the Google Pixel smartphones WDP uses for data collection in the field.

As WDP begins deploying the model this season, it promises to accelerate research significantly. By automatically flagging patterns and reliable sequences previously requiring immense human effort to find, it can help researchers uncover hidden structures and potential meanings within the dolphins’ natural communication.

The CHAT system and two-way interaction

While DolphinGemma focuses on understanding natural communication, a parallel project explores a different avenue: active, two-way interaction.

The CHAT (Cetacean Hearing Augmentation Telemetry) system – developed by WDP in partnership with Georgia Tech – aims to establish a simpler, shared vocabulary rather than directly translating complex dolphin language.

The concept relies on associating specific, novel synthetic whistles (created by CHAT, distinct from natural sounds) with objects the dolphins enjoy interacting with, like scarves or seaweed. Researchers demonstrate the whistle-object link, hoping the dolphins’ natural curiosity leads them to mimic the sounds to request the items.

As more natural dolphin sounds are understood through work with models like DolphinGemma, these could potentially be incorporated into the CHAT interaction framework.

Google Pixel enables ocean research

Underpinning both the analysis of natural sounds and the interactive CHAT system is crucial mobile technology. Google Pixel phones serve as the brains for processing the high-fidelity audio data in real-time, directly in the challenging ocean environment.

The CHAT system, for instance, relies on Google Pixel phones to:

  • Detect a potential mimic amidst background noise.
  • Identify the specific whistle used.
  • Alert the researcher (via underwater bone-conducting headphones) about the dolphin’s ‘request’.

This allows the researcher to respond quickly with the correct object, reinforcing the learned association. While a Pixel 6 initially handled this, the next generation CHAT system (planned for summer 2025) will utilise a Pixel 9, integrating speaker/microphone functions and running both deep learning models and template matching algorithms simultaneously for enhanced performance.
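Neither Google nor WDP has published the on-device code; as a loose, hypothetical sketch of the three steps listed above (detect, identify, alert), the real-time loop might look something like this:

```python
# Hypothetical sketch of the CHAT mimic-detection loop described above.
# Object names, methods, and the confidence threshold are illustrative only.
WHISTLE_TO_OBJECT = {"whistle_scarf": "scarf", "whistle_seaweed": "seaweed"}

def chat_loop(hydrophone, matcher, headphones):
    for chunk in hydrophone.stream():                 # continuous underwater audio
        candidate = matcher.detect_whistle(chunk)     # step 1: spot a possible mimic amid noise
        if candidate is None:
            continue
        label, confidence = matcher.identify(candidate)   # step 2: which synthetic whistle?
        if label in WHISTLE_TO_OBJECT and confidence > 0.8:
            # step 3: alert the researcher via bone-conducting headphones
            headphones.announce(f"Dolphin requested: {WHISTLE_TO_OBJECT[label]}")
```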

Google Pixel 9 phone that will be used for the next generation DolphinGemma CHAT system.

Using smartphones like the Pixel dramatically reduces the need for bulky, expensive custom hardware. It improves system maintainability, lowers power requirements, and shrinks the physical size. Furthermore, DolphinGemma’s predictive power integrated into CHAT could help identify mimics faster, making interactions more fluid and effective.

Recognising that breakthroughs often stem from collaboration, Google intends to release DolphinGemma as an open model later this summer. While trained on Atlantic spotted dolphins, its architecture holds promise for researchers studying other cetaceans, potentially requiring fine-tuning for different species’ vocal repertoires.

The aim is to equip researchers globally with powerful tools to analyse their own acoustic datasets, accelerating the collective effort to understand these intelligent marine mammals. We are shifting from passive listening towards actively deciphering patterns, bringing the prospect of bridging the communication gap between our species perhaps just a little closer.

See also: IEA: The opportunities and challenges of AI for global energy

Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date (26 March 2025)

Gemini 2.5 is being hailed by Google DeepMind as its “most intelligent AI model” to date.

The first model from this latest generation is an experimental version of Gemini 2.5 Pro, which DeepMind says has achieved state-of-the-art results across a wide range of benchmarks.

According to Koray Kavukcuoglu, CTO of Google DeepMind, the Gemini 2.5 models are “thinking models”.  This signifies their capability to reason through their thoughts before generating a response, leading to enhanced performance and improved accuracy.    

The capacity for “reasoning” extends beyond mere classification and prediction, Kavukcuoglu explains. It encompasses the system’s ability to analyse information, deduce logical conclusions, incorporate context and nuance, and ultimately, make informed decisions.

DeepMind has been exploring methods to enhance AI’s intelligence and reasoning capabilities for some time, employing techniques such as reinforcement learning and chain-of-thought prompting. This groundwork led to the recent introduction of their first thinking model, Gemini 2.0 Flash Thinking.    

“Now, with Gemini 2.5,” says Kavukcuoglu, “we’ve achieved a new level of performance by combining a significantly enhanced base model with improved post-training.”

Google plans to integrate these thinking capabilities directly into all of its future models—enabling them to tackle more complex problems and support more capable, context-aware agents.    

Gemini 2.5 Pro secures the LMArena leaderboard top spot

Gemini 2.5 Pro Experimental is positioned as DeepMind’s most advanced model for handling intricate tasks. As of writing, it has secured the top spot on the LMArena leaderboard – a key metric for assessing human preferences – by a significant margin, demonstrating a highly capable model with a high-quality style:

Screenshot of LMArena leaderboard where the new Gemini 2.5 Pro Experimental AI model from Google DeepMind has just taken the top spot.

Gemini 2.5 is a ‘pro’ at maths, science, coding, and reasoning

Gemini 2.5 Pro has demonstrated state-of-the-art performance across various benchmarks that demand advanced reasoning.

Notably, it leads in maths and science benchmarks – such as GPQA and AIME 2025 – without relying on test-time techniques that increase costs, like majority voting. It also achieved a state-of-the-art score of 18.8% on Humanity’s Last Exam, a dataset designed by subject matter experts to evaluate the human frontier of knowledge and reasoning.

DeepMind has placed significant emphasis on coding performance, and Gemini 2.5 represents a substantial leap forward compared to its predecessor, 2.0, with further improvements in the pipeline. 2.5 Pro excels in creating visually compelling web applications and agentic code applications, as well as code transformation and editing.

On SWE-Bench Verified, the industry standard for agentic code evaluations, Gemini 2.5 Pro achieved a score of 63.8% using a custom agent setup. The model’s reasoning capabilities also enable it to create a video game by generating executable code from a single-line prompt.

Building on its predecessors’ strengths

Gemini 2.5 builds upon the core strengths of earlier Gemini models, including native multimodality and a long context window. 2.5 Pro launches with a one million token context window, with plans to expand this to two million tokens soon. This enables the model to comprehend vast datasets and handle complex problems from diverse information sources, spanning text, audio, images, video, and even entire code repositories.    

Developers and enterprises can now begin experimenting with Gemini 2.5 Pro in Google AI Studio. Gemini Advanced users can also access it via the model dropdown on desktop and mobile platforms. The model will be rolled out on Vertex AI in the coming weeks.    

Google DeepMind encourages users to provide feedback, which will be used to further enhance Gemini’s capabilities.

(Photo by Anshita Nair)

See also: DeepSeek V3-0324 tops non-reasoning AI models in open-source first

Is America falling behind in the AI race? (24 March 2025)

Several major US artificial intelligence companies have expressed concern about an erosion of America’s edge in AI development.

In recent submissions to the US government, the companies warned that Chinese models, such as DeepSeek R1, are becoming more sophisticated and competitive. The submissions, filed in March 2025 in response to a request for input on an AI Action Plan, highlight the growing challenge from China in technological capability and price.

China’s growing AI presence

Chinese state-supported AI model DeepSeek R1 has piqued the interest of US developers. According to OpenAI, DeepSeek demonstrates that the technological gap between the US and China is narrowing. The company described DeepSeek as “state-subsidised, state-controlled, and freely available,” and raised concerns about the model’s ability to influence global AI development.

OpenAI compared DeepSeek to Chinese telecommunications company Huawei, warning that Chinese regulations could allow the government to compel DeepSeek to compromise sensitive US systems or infrastructure. Concerns about data privacy were also raised, with OpenAI pointing out that Chinese rules could force DeepSeek to disclose user data to the government, and enhance China’s ability to develop more advanced AI systems.

The competition from China also includes Ernie X1 and Ernie 4.5, released by Baidu, which are designed to compete with Western systems.

According to Baidu, Ernie X1 “delivers performance on par with DeepSeek R1 at only half the price.” Meanwhile, Ernie 4.5 is priced at just 1% of OpenAI’s GPT-4.5 while outperforming it in multiple benchmarks.

DeepSeek’s aggressive pricing strategy is also raising concerns with the US companies. According to Bernstein Research, DeepSeek’s V3 and R1 models are priced “anywhere from 20-40x cheaper” than equivalent models from OpenAI. The pricing pressure could force US developers to adjust their business models to remain competitive.

Baidu’s strategy of open-sourcing its models is also gaining traction. “One thing we learned from DeepSeek is that open-sourcing the best models can greatly help adoption,” Baidu CEO Robin Li said in February. Baidu plans to open-source the Ernie 4.5 series starting June 30, which could accelerate adoption and further increase competitive pressure on US firms.

Cost aside, early user feedback on Baidu’s models has been positive. “[I’ve] been playing around with it for hours, impressive performance,” Alvin Foo, a venture partner at Zero2Launch, said in a post on social media, suggesting China’s AI models are becoming more affordable and effective.

US AI security and economic risks

The submissions also highlight what the US companies perceive as risks to security and the economy.

OpenAI warned that Chinese regulations could allow the government to compel DeepSeek to manipulate its models to compromise infrastructure or sensitive applications, creating vulnerabilities in important systems.

Anthropic’s concerns centred on biosecurity. It disclosed that its own Claude 3.7 Sonnet model demonstrated capabilities in biological weapon development, highlighting the dual-use nature of AI systems.

Anthropic also raised issues with US export controls on AI chips. While Nvidia’s H20 chips meet US export restrictions, they nonetheless perform well in text generation – an important feature for reinforcement learning. Anthropic called on the government to tighten controls to prevent China from gaining a technological edge using the chips.

Google took a more cautious approach, acknowledging security risks while warning against over-regulation. The company argues that strict AI export rules could harm US competitiveness by limiting business opportunities for domestic cloud providers. Google recommended targeted export controls that protect national security without disrupting its business operations.

Maintaining US AI competitiveness

All three US companies emphasised the need for better government oversight and infrastructure investment to maintain US AI leadership.

Anthropic warned that by 2027, training a single advanced AI model could require up to five gigawatts of power – enough to power a small city. The company proposed a national target to build 50 additional gigawatts of AI-dedicated power capacity by 2027 and to streamline regulations around power transmission infrastructure.

OpenAI positioned the competition between US and Chinese AI as a contest between democratic and authoritarian AI models. The company argued that promoting a free-market approach would drive better outcomes and maintain America’s technological edge.

Google focused on urging practical measures, including increased federal funding for AI research, improved access to government contracts, and streamlined export controls. The company also recommended more flexible procurement rules to accelerate AI adoption by federal agencies.

Regulatory strategies for US AI

The US companies called for a unified federal approach to AI regulation.

OpenAI proposed a regulatory framework managed by the Department of Commerce, warning that fragmented state-level regulations could drive AI development overseas. The company supported a tiered export control framework, allowing broader access to US-developed AI in democratic countries while restricting it in authoritarian states.

Anthropic called for stricter export controls on AI hardware and training data, warning that even minor improvements in model performance could give China a strategic advantage.

Google focused on copyright and intellectual property rights, stressing that its interpretation of ‘fair use’ is important for AI development. The company warned that overly restrictive copyright rules could disadvantage US AI firms compared to their Chinese competitors.

All three companies stressed the need for faster government adoption of AI. OpenAI recommended removing some existing testing and procurement barriers, while Anthropic supported streamlined procurement processes. Google emphasised the need for improved interoperability in government cloud infrastructure.

See also: The best AI prompt generator: Create perfect AI prompts

OpenAI and Google call for US government action to secure AI lead (14 March 2025)

OpenAI and Google are each urging the US government to take decisive action to secure the nation’s AI leadership.

“As America’s world-leading AI sector approaches AGI, with a Chinese Communist Party (CCP) determined to overtake us by 2030, the Trump Administration’s new AI Action Plan can ensure that American-led AI built on democratic principles continues to prevail over CCP-built autocratic, authoritarian AI,” wrote OpenAI, in a letter to the Office of Science and Technology Policy.

In a separate letter, Google echoed this sentiment by stating, “While America currently leads the world in AI – and is home to the most capable and widely adopted AI models and tools – our lead is not assured.”    

A plan for the AI Action Plan

OpenAI highlighted AI’s potential to “scale human ingenuity,” driving productivity, prosperity, and freedom.  The company likened the current advancements in AI to historical leaps in innovation, such as the domestication of the horse, the invention of the printing press, and the advent of the computer.

We are at “the doorstep of the next leap in prosperity,” according to OpenAI CEO Sam Altman. The company stresses the importance of “freedom of intelligence,” advocating for open access to AGI while safeguarding against autocratic control and bureaucratic barriers.

OpenAI also outlined three scaling principles:

  1. The intelligence of an AI model roughly equals the log of the resources used to train and run it.
  2. The cost to use a given level of AI capability falls by about 10x every 12 months (illustrated below).
  3. The amount of calendar time it takes to improve an AI model keeps decreasing.
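Taken at face value, the second principle compounds quickly. Here is a rough back-of-the-envelope illustration; the starting price is an arbitrary example figure, and only the 10x-per-year rate comes from OpenAI’s letter:

```python
# Illustration of OpenAI's second scaling principle: the cost of a fixed level
# of AI capability falls roughly 10x every 12 months. Starting figure is made up.
start_cost = 10.00  # hypothetical $ per million tokens today

for months in (0, 12, 24, 36):
    cost = start_cost * (0.1 ** (months / 12))
    print(f"after {months:2d} months: ${cost:.4f} per 1M tokens")

# after  0 months: $10.0000 per 1M tokens
# after 12 months: $1.0000 per 1M tokens
# after 24 months: $0.1000 per 1M tokens
# after 36 months: $0.0100 per 1M tokens
```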

Google also has a three-point plan for the US to focus on:

  1. Invest in AI: Google called for coordinated action to address the surging energy needs of AI infrastructure, balanced export controls, continued funding for R&D, and pro-innovation federal policy frameworks.
  2. Accelerate and modernise government AI adoption: Google urged the federal government to lead by example through AI adoption and deployment, including implementing multi-vendor, interoperable AI solutions and streamlining procurement processes.
  3. Promote pro-innovation approaches internationally: Google advocated for an active international economic policy to support AI innovation, championing market-driven technical standards, working with aligned countries to address national security risks, and combating restrictive foreign AI barriers.

AI policy recommendations for the US government

Both companies provided detailed policy recommendations to the US government.

OpenAI’s proposals include:

  • A regulatory strategy that ensures the freedom to innovate through voluntary partnership between the federal government and the private sector.    
  • An export control strategy that promotes the global adoption of American AI systems while protecting America’s AI lead.    
  • A copyright strategy that protects the rights of content creators while preserving American AI models’ ability to learn from copyrighted material.    
  • An infrastructure opportunity strategy to drive growth, including policies to support a thriving AI-ready workforce and ecosystems of labs, start-ups, and larger companies.    
  • An ambitious government adoption strategy to ensure the US government itself sets an example of using AI to benefit its citizens.    

Google’s recommendations include:

  • Advancing energy policies to power domestic data centres, including transmission and permitting reform.    
  • Adopting balanced export control policies that support market access while targeting pertinent risks.    
  • Accelerating AI R&D, streamlining access to computational resources, and incentivising public-private partnerships.    
  • Crafting a pro-innovation federal framework for AI, including federal legislation that prevents a patchwork of state laws, ensuring industry has access to data that enables fair learning, emphasising sector-specific and risk-based AI governance, and supporting workforce initiatives to develop AI skills.    

Both OpenAI and Google emphasise the need for swift and decisive action. OpenAI warned that America’s lead in AI is narrowing, while Google stressed that policy decisions will determine the outcome of the global AI competition.

“We are in a global AI competition, and policy decisions will determine the outcome,” Google explained. “A pro-innovation approach that protects national security and ensures that everyone benefits from AI is essential to realising AI’s transformative potential and ensuring that America’s lead endures.”

(Photo by Nils Huenerfuerst)

See also: Gemma 3: Google launches its latest open AI models

Gemma 3: Google launches its latest open AI models (12 March 2025)

Google has launched Gemma 3, the latest version of its family of open AI models that aim to set a new benchmark for AI accessibility.

Built upon the foundations of the company’s Gemini 2.0 models, Gemma 3 is engineered to be lightweight, portable, and adaptable—enabling developers to create AI applications across a wide range of devices.  

This release comes hot on the heels of Gemma’s first birthday, an anniversary underscored by impressive adoption metrics. Gemma models have achieved more than 100 million downloads and spawned the creation of over 60,000 community-built variants. Dubbed the “Gemmaverse,” this ecosystem signals a thriving community aiming to democratise AI.  

“The Gemma family of open models is foundational to our commitment to making useful AI technology accessible,” explained Google.

Gemma 3: Features and capabilities

Gemma 3 models are available in various sizes – 1B, 4B, 12B, and 27B parameters – allowing developers to select a model tailored to their specific hardware and performance requirements. These models promise faster execution, even on modest computational setups, without compromising functionality or accuracy.

Here are some of the standout features of Gemma 3:  

  • Single-accelerator performance: Gemma 3 sets a new benchmark for single-accelerator models. In preliminary human preference evaluations on the LMArena leaderboard, Gemma 3 outperformed rivals including Llama-405B, DeepSeek-V3, and o3-mini.
  • Multilingual support across 140 languages: Catering to diverse audiences, Gemma 3 comes with pretrained capabilities for over 140 languages. Developers can create applications that connect with users in their native tongues, expanding the global reach of their projects.  
  • Sophisticated text and visual analysis: With advanced text, image, and short video reasoning capabilities, developers can implement Gemma 3 to craft interactive and intelligent applications—addressing an array of use cases from content analysis to creative workflows.  
  • Expanded context window: Offering a 128k-token context window, Gemma 3 can analyse and synthesise large datasets, making it ideal for applications requiring extended content comprehension.
  • Function calling for workflow automation: With function calling support, developers can utilise structured outputs to automate processes and build agentic AI systems effortlessly.
  • Quantised models for lightweight efficiency: Gemma 3 introduces official quantised versions, significantly reducing model size while preserving output accuracy—a bonus for developers optimising for mobile or resource-constrained environments.

The model’s performance advantages are clearly illustrated in the Chatbot Arena Elo Score leaderboard. Despite requiring just a single NVIDIA H100 GPU, the flagship 27B version of Gemma 3 ranks among the top chatbots, achieving an Elo score of 1338. Many competitors demand up to 32 GPUs to deliver comparable performance.

Google’s Gemma 3 performance benchmarked against both open-source and proprietary AI models on the Chatbot Arena Elo score leaderboard.

One of Gemma 3’s strengths lies in its adaptability within developers’ existing workflows.  

  • Diverse tooling compatibility: Gemma 3 supports popular AI libraries and tools, including Hugging Face Transformers, JAX, PyTorch, and Google AI Edge. For optimised deployment, platforms such as Vertex AI or Google Colab are ready to help developers get started with minimal hassle.  
  • NVIDIA optimisations: Whether using entry-level GPUs like Jetson Nano or cutting-edge hardware like Blackwell chips, Gemma 3 ensures maximum performance, further simplified through the NVIDIA API Catalog.  
  • Broadened hardware support: Beyond NVIDIA, Gemma 3 integrates with AMD GPUs via the ROCm stack and supports CPU execution with Gemma.cpp for added versatility.

For immediate experiments, users can access Gemma 3 models via platforms such as Hugging Face and Kaggle, or take advantage of the Google AI Studio for in-browser deployment.
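As a small hedged example of that workflow fit, the text-only 1B instruction-tuned variant can be tried through the standard Transformers pipeline; the checkpoint name below is the one Google published on Hugging Face (licence acceptance required), and a recent transformers release is assumed:

```python
# Minimal sketch: running the smallest instruction-tuned Gemma 3 model locally
# via Hugging Face Transformers. Assumes the "google/gemma-3-1b-it" checkpoint
# and a transformers version that supports Gemma 3 and chat-style inputs.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",
    device_map="auto",   # uses a GPU if available, otherwise falls back to CPU
)

messages = [{"role": "user", "content": "Explain what an open-weight model is in one sentence."}]
result = generator(messages, max_new_tokens=64)
print(result[0]["generated_text"][-1]["content"])   # the assistant's reply
```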

Advancing responsible AI  

“We believe open models require careful risk assessment, and our approach balances innovation with safety,” explains Google.  

Gemma 3’s team adopted stringent governance policies, applying fine-tuning and robust benchmarking to align the model with ethical guidelines. Given the model’s enhanced capabilities in STEM fields, it underwent specific evaluations to mitigate risks of misuse, such as generating harmful substances.

Google is pushing for collective efforts within the industry to create proportionate safety frameworks for increasingly powerful models.

To play its part, Google is launching ShieldGemma 2. The 4B image safety checker leverages Gemma 3’s architecture and outputs safety labels across categories such as dangerous content, explicit material, and violence. While it offers out-of-the-box protection, developers can customise the tool to meet tailored safety requirements.

The “Gemmaverse” isn’t just a technical ecosystem; it’s a community-driven movement. Projects such as AI Singapore’s SEA-LION v3, INSAIT’s BgGPT, and Nexa AI’s OmniAudio are testament to the power of collaboration within this ecosystem.  

To bolster academic research, Google has also introduced the Gemma 3 Academic Program. Researchers can apply for $10,000 worth of Google Cloud credits to accelerate their AI-centric projects. Applications open today and remain available for four weeks.  

With its accessibility, capabilities, and widespread compatibility, Gemma 3 makes a strong case for becoming a cornerstone in the AI development community.

(Image credit: Google)

See also: Alibaba Qwen QwQ-32B: Scaled reinforcement learning showcase

Big tech’s $320B AI spend defies efficiency race (12 February 2025)

Tech giants are beginning an unprecedented $320 billion AI infrastructure spending spree in 2025, brushing aside concerns about more efficient AI models from challengers like DeepSeek. The massive investment push from Amazon, Microsoft, Google, and Meta signals the big players’ unwavering conviction that AI’s future demands bold infrastructure bets, despite (or perhaps because of) emerging efficiency breakthroughs.

The stakes are high, with collective capital expenditure jumping 30% up from 2024’s $246 billion investment. While investors may question the necessity of such aggressive spending, tech leaders are doubling down on their belief that AI represents a transformative opportunity worth every dollar.

Amazon stands at the forefront of this AI spending race, according to a report by Business Insider. The company is flexing its financial muscle with a planned $100 billion capital expenditure for 2025 – a dramatic leap from the $77 billion it spent last year. AWS chief Andy Jassy isn’t mincing words, calling AI a “once-in-a-lifetime business opportunity” that demands aggressive investment.

Microsoft’s Satya Nadella also has a bullish stance with his own hard numbers. Having earmarked $80 billion for AI infrastructure in 2025, Microsoft’s existing AI ventures are already delivering; Nadella has spoken of $13 billion annual revenue from AI and 175% year-over-year growth.

His perspective draws from economic wisdom: citing the Jevons paradox, he argues that making AI more efficient and accessible will spark an unprecedented surge in demand.

Not to be outdone, Google parent Alphabet is pushing all its chips to the centre of the table, with a $75 billion infrastructure investment in 2025, dwarfing analysts’ expectations of $58 billion. Despite market jitters about cloud growth and AI strategy, CEO Sundar Pichai maintains Google’s product innovation engine is firing on all cylinders.

Meta’s approach is to pour $60-65 billion into capital spending in 2025 – up from $39 billion in 2024. The company is carving its own path by championing an “American standard” for open-source AI models, a strategy that has caught investor attention, particularly given Meta’s proven track record in monetising AI through sophisticated ad targeting.

The emergence of DeepSeek’s efficient AI models has sparked some debate in investment circles. Investing.com’s Jesse Cohen voices growing demands for concrete returns on existing AI investments. Yet Wedbush’s Dan Ives dismisses such concerns, likening DeepSeek to “the Temu of AI” and insisting the revolution is just beginning.

The market’s response to these bold plans tells a mixed story. Meta’s strategy has won investor applause, while Amazon and Google face more sceptical reactions, with stock drops of 5% and 8% respectively following spending announcements in earnings calls. Yet tech leaders remain undeterred, viewing robust AI infrastructure as non-negotiable for future success.

The intensity of infrastructure investment suggests a reality: technological breakthroughs in AI efficiency aren’t slowing the race – they’re accelerating it. As big tech pours unprecedented resources into AI development, it’s betting that increased efficiency will expand rather than contract the market for AI services.

The high-stakes gamble on AI’s future reveals a shift in how big tech views investment. Rather than waiting to see how efficiency improvements might reduce costs, the companies are scaling up aggressively, convinced that tomorrow’s AI landscape will demand more infrastructure, not less. In this view, DeepSeek’s breakthroughs aren’t a threat to their strategy – they’re validation of AI’s expanding potential.

The message from Silicon Valley is that the AI revolution demands massive infrastructure investment, and the giants of tech are all in. The question isn’t whether to invest in AI infrastructure, but whether $320 billion will be enough to meet the coming surge in demand.

See also: DeepSeek ban? China data transfer boosts security concerns

Gemini 2.0: Google ushers in the agentic AI era (11 December 2024)

Google CEO Sundar Pichai has announced the launch of Gemini 2.0, a model that represents the next step in Google’s ambition to revolutionise AI.

A year after introducing the Gemini 1.0 model, this major upgrade incorporates enhanced multimodal capabilities, agentic functionality, and innovative user tools designed to push boundaries in AI-driven technology.

Leap towards transformational AI  

Reflecting on Google’s 26-year mission to organise and make the world’s information accessible, Pichai remarked, “If Gemini 1.0 was about organising and understanding information, Gemini 2.0 is about making it much more useful.”

Gemini 1.0, released in December 2023, was notable for being Google’s first natively multimodal AI model. The first iteration excelled at understanding and processing text, video, images, audio, and code. Its enhanced 1.5 version became widely embraced by developers for its long-context understanding, enabling applications such as the productivity-focused NotebookLM.

Now, with Gemini 2.0, Google aims to accelerate the role of AI as a universal assistant capable of native image and audio generation, better reasoning and planning, and real-world decision-making capabilities. In Pichai’s words, the development represents the dawn of an “agentic era.”

“We have been investing in developing more agentic models, meaning they can understand more about the world around you, think multiple steps ahead, and take action on your behalf, with your supervision,” Pichai explained.

Gemini 2.0: Core features and availability

At the heart of today’s announcement is the experimental release of Gemini 2.0 Flash, the flagship model of Gemini’s second generation. It builds upon the foundations laid by its predecessors while delivering faster response times and advanced performance.

Gemini 2.0 Flash supports multimodal inputs and outputs, including the ability to generate native images in conjunction with text and produce steerable text-to-speech multilingual audio. Additionally, users can benefit from native tool integration such as Google Search and even third-party user-defined functions.

Developers and businesses will gain access to Gemini 2.0 Flash via the Gemini API in Google AI Studio and Vertex AI, while larger model sizes are scheduled for broader release in January 2025.
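
To illustrate what that access might look like in practice, here is a minimal sketch using the google-generativeai Python SDK; the experimental model ID “gemini-2.0-flash-exp”, the placeholder API key, and the example prompt are assumptions for illustration rather than details confirmed in the announcement.

    # Hypothetical sketch: querying Gemini 2.0 Flash through the Gemini API.
    # Assumes `pip install google-generativeai` and an API key created in Google AI Studio;
    # the model ID below is an assumption and may differ from the production name.
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # key generated in Google AI Studio

    model = genai.GenerativeModel("gemini-2.0-flash-exp")
    response = model.generate_content(
        "Summarise the idea of 'agentic' AI models in two sentences."
    )
    print(response.text)  # plain-text answer returned by the model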

For global accessibility, the Gemini app now features a chat-optimised version of the 2.0 Flash experimental model. Early adopters can experience this updated assistant on desktop and mobile, with a mobile app rollout imminent.

Products such as Google Search are also being enhanced with Gemini 2.0, unlocking the ability to handle complex queries like advanced math problems, coding enquiries, and multimodal questions.

Comprehensive suite of AI innovations  

The launch of Gemini 2.0 comes with compelling new tools that showcase its capabilities.

One such feature, Deep Research, functions as an AI research assistant, simplifying the process of investigating complex topics by compiling information into comprehensive reports. Another upgrade enhances Search with Gemini-enabled AI Overviews that tackle intricate, multi-step user queries.

The model was trained using Google’s sixth-generation Tensor Processing Units (TPUs), known as Trillium, which Pichai notes “powered 100% of Gemini 2.0 training and inference.”

Trillium is now available for external developers, allowing them to benefit from the same infrastructure that supports Google’s own advancements.

Pioneering agentic experiences  

Accompanying Gemini 2.0 are experimental “agentic” prototypes built to explore the future of human-AI collaboration, including:

  • Project Astra: A universal AI assistant

First introduced at I/O earlier this year, Project Astra taps into Gemini 2.0’s multimodal understanding to improve real-world AI interactions. Trusted testers have trialled the assistant on Android, offering feedback that has helped refine its multilingual dialogue, memory retention, and integration with Google tools like Search, Lens, and Maps. Astra has also demonstrated near-human conversational latency, with further research underway for its application in wearable technology, such as prototype AI glasses.

  • Project Mariner: Redefining web automation 

Project Mariner is an experimental web-browsing assistant that uses Gemini 2.0’s ability to reason across text, images, and interactive elements like forms within a browser. In initial tests, it achieved an 83.5% success rate on the WebVoyager benchmark for completing end-to-end web tasks. Early testers using a Chrome extension are helping to refine Mariner’s capabilities while Google evaluates safety measures that ensure the technology remains user-friendly and secure.

  • Jules: A coding agent for developers  

Jules, an AI-powered assistant built for developers, integrates directly into GitHub workflows to address coding challenges. It can autonomously propose solutions, generate plans, and execute code-based tasks—all under human supervision. This experimental endeavour is part of Google’s long-term goal to create versatile AI agents across various domains.

  • Gaming applications and beyond  

Extending Gemini 2.0’s reach into virtual environments, Google DeepMind is working with gaming partners like Supercell on intelligent game agents. These experimental AI companions can interpret game actions in real-time, suggest strategies, and even access broader knowledge via Search. Research is also being conducted into how Gemini 2.0’s spatial reasoning could support robotics, opening doors for physical-world applications in the future.

Addressing responsibility in AI development

As AI capabilities expand, Google emphasises the importance of prioritising safety and ethical considerations.

Google claims Gemini 2.0 underwent extensive risk assessments, bolstered by the Responsibility and Safety Committee’s oversight to mitigate potential risks. Additionally, its embedded reasoning abilities allow for advanced “red-teaming,” enabling developers to evaluate security scenarios and optimise safety measures at scale.

Google is also exploring safeguards to address user privacy, prevent misuse, and ensure AI agents remain reliable. For instance, Project Mariner is designed to prioritise user instructions while resisting malicious prompt injections, preventing threats like phishing or fraudulent transactions. Meanwhile, privacy controls in Project Astra make it easy for users to manage session data and deletion preferences.

Pichai reaffirmed the company’s commitment to responsible development, stating, “We firmly believe that the only way to build AI is to be responsible from the start.”

With the Gemini 2.0 Flash release, Google is edging closer to its vision of building a universal assistant capable of transforming interactions across domains.

See also: Machine unlearning: Researchers make AI models ‘forget’ data

The post Gemini 2.0: Google ushers in the agentic AI era  appeared first on AI News.

Google launches Veo and Imagen 3 generative AI models https://www.artificialintelligence-news.com/news/google-launches-veo-and-imagen-3-generative-ai-models/ https://www.artificialintelligence-news.com/news/google-launches-veo-and-imagen-3-generative-ai-models/#respond Tue, 03 Dec 2024 14:30:05 +0000 https://www.artificialintelligence-news.com/?p=16626 Google Cloud has launched two generative AI models on its Vertex AI platform, Veo and Imagen 3, amid reports of surging revenue growth among enterprises leveraging the technology. According to Google Cloud’s data, 86% of enterprise companies currently using generative AI in production environments have witnessed increased revenue, with an estimated average growth of 6%.  […]

Google Cloud has launched two generative AI models on its Vertex AI platform, Veo and Imagen 3, amid reports of surging revenue growth among enterprises leveraging the technology.

According to Google Cloud’s data, 86% of enterprise companies currently using generative AI in production environments have witnessed increased revenue, with an estimated average growth of 6%. 

This metric has driven the tech giant’s latest innovation push, resulting in the introduction of Veo – its most sophisticated video generation model to date – and Imagen 3, an advanced text-to-image generation system.

Breaking ground

Veo, now available in private preview on Vertex AI, represents a milestone as Google becomes the first hyperscaler to offer an image-to-video model. The technology enables businesses to generate high-quality videos from simple text or image prompts, potentially revolutionising video production workflows across industries.

Imagen 3 – scheduled for release to all Vertex AI customers next week – promises unprecedented realism in generated images, with marked improvements in detail, lighting, and artifact reduction. The model includes new features for enterprise customers on an allowlist, including advanced editing capabilities and brand customisation options.

(Image: Example images generated by the Imagen 3 generative AI model by Google, available on its Vertex AI platform.)
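
For a sense of how an enterprise customer might call the model once it reaches their project, the sketch below uses the Vertex AI Python SDK; the project ID, prompt, and the “imagen-3.0-generate-001” model identifier are illustrative assumptions rather than details taken from the announcement.

    # Hypothetical sketch: generating an image with Imagen 3 on Vertex AI.
    # Assumes `pip install google-cloud-aiplatform`; the project ID, prompt, and
    # model identifier are placeholders for illustration.
    import vertexai
    from vertexai.preview.vision_models import ImageGenerationModel

    vertexai.init(project="your-gcp-project", location="us-central1")

    model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")
    images = model.generate_images(
        prompt="Product photo of a chocolate biscuit on a marble counter, soft studio lighting",
        number_of_images=1,
        aspect_ratio="1:1",
    )
    images[0].save(location="campaign_asset.png")  # write the generated image to disk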

Transforming operations

Several major firms have begun implementing these technologies into their operations.

Mondelez International, the company behind brands such as Oreo, Cadbury, and Chips Ahoy!, is using the technology to accelerate campaign content creation across its global portfolio of brands.

Jon Halvorson, SVP of Consumer Experience & Digital Commerce at Mondelez International, explained: “Our collaboration with Google Cloud has been instrumental in harnessing the power of generative AI, notably through Imagen 3, to revolutionise content production.

“This technology has enabled us to produce hundreds of thousands of customised assets, enhancing creative quality while significantly reducing both time to market and costs.”

Knowledge-sharing platform Quora has developed Poe, a service that enables users to interact with a range of generative AI models; Veo and Imagen 3 are now integrated with Poe.

Spencer Chan, Product Lead for Poe at Quora, commented: “We created Poe to democratise access to the world’s best gen AI models. With Veo, we’re now enabling millions of users to bring their ideas to life through stunning, high-quality generative video.”

Safety and security

In response to growing concerns about AI-generated content, Google has implemented robust safety features in both models. These include:

  • Digital watermarking through Google DeepMind’s SynthID.
  • Built-in safety filters to prevent harmful content creation.
  • Strict data governance policies to ensure customer data protection.
  • Industry-first copyright indemnity for generative AI services.

The launch of these new models signals Google’s growing influence in the enterprise AI space and suggests a shift toward more sophisticated, integrated AI solutions for business applications.

(Imagery Credit: Google Cloud)

See also: Alibaba Marco-o1: Advancing LLM reasoning capabilities

The post Google launches Veo and Imagen 3 generative AI models appeared first on AI News.

Big tech’s AI spending hits new heights https://www.artificialintelligence-news.com/news/big-tech-ai-spending-hits-new-heights/ https://www.artificialintelligence-news.com/news/big-tech-ai-spending-hits-new-heights/#respond Fri, 22 Nov 2024 12:02:34 +0000 https://www.artificialintelligence-news.com/?p=16537 In 2024, Big Tech is all-in on artificial intelligence, with companies like Microsoft, Amazon, Alphabet, and Meta leading the way. Their combined spending on AI is projected to exceed a jaw-dropping $240 billion. Why? Because AI isn’t just the future—it’s the present, and the demand for AI-powered tools and infrastructure has never been higher. The […]

In 2024, Big Tech is all-in on artificial intelligence, with companies like Microsoft, Amazon, Alphabet, and Meta leading the way.

Their combined spending on AI is projected to exceed a jaw-dropping $240 billion. Why? Because AI isn’t just the future—it’s the present, and the demand for AI-powered tools and infrastructure has never been higher. The companies aren’t just keeping up; they’re setting the pace for the industry.

The scale of their investment is hard to ignore. In the first half of 2023, tech giants poured $74 billion into capital expenditure. By Q3, that number had jumped to $109 billion. In mid-2024, spending reached $104 billion, a remarkable 47% rise over the same period a year earlier. By Q3, the total hit $171 billion.

If this pattern continues, Q4 might add another $70 billion, bringing the total to a truly staggering $240 billion for the year.

Why so much spending?

AI’s potential is immense, and companies are making sure they’re positioned to reap the rewards.

  • A growing market: AI is projected to create $20 trillion in global economic impact by 2030. In countries like India, AI could contribute $500 billion to GDP by 2025. With stakes this high, big tech isn’t hesitating to invest heavily.
  • Infrastructure demands: Training and running AI models require massive investment in infrastructure, from data centres to high-performance GPUs. Alphabet increased its capital expenditures by 62% last quarter compared to the previous year, even as it cut its workforce by 9,000 employees to manage costs.
  • Revenue potential: AI is already proving its value. Microsoft’s AI products are expected to generate $10 billion annually—the fastest-growing segment in the company’s history. Alphabet, meanwhile, uses AI to write over 25% of its new code, streamlining operations.

Amazon is also ramping up, with plans to spend $75 billion on capital expenditure in 2024. Meta’s forecast is not far behind, with estimates between $38 and $40 billion. Across the board, organisations recognise that maintaining their edge in AI requires sustained and significant investment.

Supporting revenue streams

What keeps the massive investments coming is the strength of big tech’s core businesses. Last quarter, Alphabet’s digital advertising machine, which is powered by Google’s search engine, generated $49.39 billion in ad revenue, a 12% year-over-year increase. This provides a solid foundation that allows Alphabet to pour resources into building out its AI arsenal without destabilising the bottom line.

Microsoft’s diversified revenue streams are another example. While the company spent $20 billion on AI and cloud infrastructure last quarter, its productivity segment, which includes Office, grew by 12% to $28.3 billion, and its personal computing business, boosted by Xbox and the Activision Blizzard acquisition, grew 17% to $13.2 billion. These successes demonstrate how AI investments can support broader growth strategies.

The financial payoff

Big tech is already seeing the benefits of its heavy spending. Microsoft’s Azure platform has seen substantial growth, with its AI income approaching $6 billion. Amazon’s AI business is growing at triple-digit rates, and Alphabet reported a 34% jump in profits last quarter, with cloud revenue playing a major role.

Meta, while primarily focused on advertising, is leveraging AI to make its platforms more engaging. AI-driven tools, such as improved feeds and search features, keep users on its platforms longer, resulting in new revenue growth.

AI spending shows no signs of slowing down. Tech leaders at Microsoft and Alphabet view AI as a long-term investment critical to their future success. And the results speak for themselves: Alphabet’s cloud revenue is up 35%, while Microsoft’s cloud business grew 20% last quarter.

For the time being, the focus is on scaling up infrastructure and meeting demand. However, the real transformation will come when big tech unlocks AI’s full potential, transforming industries and redefining how we work and live.

By investing in high-quality, centralised data strategies, businesses can ensure trustworthy and accurate AI implementations and unlock AI’s full potential to drive innovation, improve decision-making, and gain a competitive edge. AI’s revolutionary promise is within reach – but only for companies prepared to lay the groundwork for sustainable growth and long-term results.

(Photo by Unsplash)

See also: Microsoft tries to convert Google Chrome users

The post Big tech’s AI spending hits new heights appeared first on AI News.

Microsoft tries to convert Google Chrome users https://www.artificialintelligence-news.com/news/microsoft-tries-to-convert-google-chrome-users/ https://www.artificialintelligence-news.com/news/microsoft-tries-to-convert-google-chrome-users/#respond Fri, 15 Nov 2024 09:08:51 +0000 https://www.artificialintelligence-news.com/?p=16490 Microsoft Edge has evolved into more than simply a browser; it is a critical component of Microsoft’s ecosystem, meant to integrate smoothly with Windows and highlight the company’s latest innovations, such as its AI assistant, Copilot. While these interconnections make Edge a viable choice, Microsoft’s methods for persuading consumers to choose it have been far […]

Microsoft Edge has evolved into more than simply a browser; it is a critical component of Microsoft’s ecosystem, meant to integrate smoothly with Windows and highlight the company’s latest innovations, such as its AI assistant, Copilot.

While these interconnections make Edge a viable choice, Microsoft’s methods for persuading consumers to choose it have been far from covert.

From default settings that prioritise Edge to persistent prompts at startup, Microsoft has made it clear they want Edge to be the go-to for Windows users. And lately, it’s upped the ante: now, Edge can launch automatically when your computer boots up, instantly nudging you to bring over your data from other browsers.

The most recent update includes an auto-checked option to import browsing data from Chrome, including history, bookmarks, and open tabs, ostensibly so users can take advantage of the features of its AI assistant, Copilot. Although the AI features may appeal to some, the aggressive approach has left many users feeling annoyed rather than tempted.

The Verge recently noticed that when you start up your PC, Edge might decide to open on its own, promptly displaying a pop-up for its AI assistant, Copilot. Right next to Copilot, there’s a conveniently checked box allowing Edge to import data from other browsers automatically. For some users, this seems like an overreach, raising doubts about how far Microsoft is ready to go to make Edge the browser of choice.

Microsoft has confirmed this setup and stated that customers have the option to opt out. Still, with default settings that favour data imports and an eye-catching import button, it’s easy for users to unintentionally make the switch, especially if they’re not paying attention. For those who prefer sticking with their existing browsers without interruption, the approach can feel unwelcome.

But even if users dodge the pop-ups, Edge isn’t exactly shy. Uninstalling it is a complex process, and it often gets reinstalled by Windows updates, much to the frustration of users who would rather go without. For many, this persistence feels more like a forceful sales pitch rather than a friendly suggestion.

Interestingly, this isn’t the first time Microsoft has tried this type of strategy. A similar message appeared to users earlier this year but was pulled back after strong objections. Now, it’s back, with Microsoft’s Caitlin Roulston stating the notification is meant to “give users the choice to import data from other browsers.”

In fact, Microsoft’s bold tactics go back some years. In 2022, it introduced a feature that could automatically pull data from Chrome into Edge – although users had the option to decline. In 2021, the company made it practically impossible to set any browser other than Edge as the default, resulting in enough outcry for Microsoft to back down.

While Microsoft promotes these pop-ups as a way to give users more control, the approach grates on those who value choice without constant nudges. The relentless push for Edge could prove counterproductive, as the company’s persistence may drive users toward rival browsers rather than win them over. To truly compete, Microsoft might benefit from letting Edge’s strengths speak for themselves rather than relying on aggressive prompts to change hearts and minds.

(Photo by Surface)

See also: EU probes Microsoft-OpenAI and Google-Samsung AI deals

The post Microsoft tries to convert Google Chrome users appeared first on AI News.
