Development - AI News https://www.artificialintelligence-news.com/categories/development/ Artificial Intelligence News Thu, 01 May 2025 17:02:35 +0000 en-GB hourly 1 https://wordpress.org/?v=6.8.1 https://www.artificialintelligence-news.com/wp-content/uploads/2020/09/cropped-ai-icon-32x32.png Development - AI News https://www.artificialintelligence-news.com/categories/development/ 32 32 Claude Integrations: Anthropic adds AI to your favourite work tools https://www.artificialintelligence-news.com/news/claude-integrations-anthropic-adds-ai-favourite-work-tools/ https://www.artificialintelligence-news.com/news/claude-integrations-anthropic-adds-ai-favourite-work-tools/#respond Thu, 01 May 2025 17:02:33 +0000 https://www.artificialintelligence-news.com/?p=106258 Anthropic just launched ‘Integrations’ for Claude that enables the AI to talk directly to your favourite daily work tools. In addition, the company has launched a beefed-up ‘Advanced Research’ feature for digging deeper than ever before. Starting with Integrations, the feature builds on a technical standard Anthropic released last year (the Model Context Protocol, or […]

The post Claude Integrations: Anthropic adds AI to your favourite work tools appeared first on AI News.

]]>
Anthropic just launched ‘Integrations’ for Claude that enables the AI to talk directly to your favourite daily work tools. In addition, the company has launched a beefed-up ‘Advanced Research’ feature for digging deeper than ever before.

Starting with Integrations, the feature builds on a technical standard Anthropic released last year (the Model Context Protocol, or MCP), but makes it much easier to use. Before, setting this up was a bit technical and local. Now, developers can build secure bridges allowing Claude to connect safely with apps over the web or on your desktop.

For end-users of Claude, this means you can now hook it up to a growing list of popular work software. Right out of the gate, they’ve included support for ten big names: Atlassian’s Jira and Confluence (hello, project managers and dev teams!), the automation powerhouse Zapier, Cloudflare, customer comms tool Intercom, plus Asana, Square, Sentry, PayPal, Linear, and Plaid. Stripe and GitLab are joining the party soon.

So, what’s the big deal? The real advantage here is context. When Claude can see your project history in Jira, read your team’s knowledge base in Confluence, or check task updates in Asana, it stops guessing and starts understanding what you’re working on.

“When you connect your tools to Claude, it gains deep context about your work—understanding project histories, task statuses, and organisational knowledge—and can take actions across every surface,” explains Anthropic.

They add, “Claude becomes a more informed collaborator, helping you execute complex projects in one place with expert assistance at every step.”

Let’s look at what this means in practice. Connect Zapier, and you suddenly give Claude the keys to thousands of apps linked by Zapier’s workflows. You could just ask Claude, conversationally, to trigger a complex sequence – maybe grab the latest sales numbers from HubSpot, check your calendar, and whip up some meeting notes, all without you lifting a finger in those apps.

For teams using Atlassian’s Jira and Confluence, Claude could become a serious helper. Think drafting product specs, summarising long Confluence documents so you don’t have to wade through them, or even creating batches of linked Jira tickets at once. It might even spot potential roadblocks by analysing project data.

And if you use Intercom for customer chats, this integration could be a game-changer. Intercom’s own AI assistant, Fin, can now work with Claude to do things like automatically create a bug report in Linear if a customer flags an issue. You could also ask Claude to sift through your Intercom chat history to spot patterns, help debug tricky problems, or summarise what customers are saying – making the whole journey from feedback to fix much smoother.

Anthropic is also making it easier for developers to build even more of these connections. They reckon that using their tools (or platforms like Cloudflare that handle the tricky bits like security and setup), developers can whip up a custom Integration with Claude in about half an hour. This could mean connecting Claude to your company’s unique internal systems or specialised industry software.

Beyond tool integrations, Claude gets a serious research upgrade

Alongside these new connections, Anthropic has given Claude’s Research feature a serious boost. It could already search the web and your Google Workspace files, but the new ‘Advanced Research’ mode is built for when you need to dig really deep.

Flip the switch for this advanced mode, and Claude tackles big questions differently. Instead of just one big search, it intelligently breaks your request down into smaller chunks, investigates each part thoroughly – using the web, your Google Docs, and now tapping into any apps you’ve connected via Integrations – before pulling it all together into a detailed report.

Now, this deeper digging takes a bit more time. While many reports might only take five to fifteen minutes, Anthropic says the really complex investigations could have Claude working away for up to 45 minutes. That might sound like a while, but compare it to the hours you might spend grinding through that research manually, and it starts to look pretty appealing.

Importantly, you can trust the results. When Claude uses information from any source – whether it’s a website, an internal doc, a Jira ticket, or a Confluence page – it gives you clear links straight back to the original. No more wondering where the AI got its information from; you can check it yourself.

These shiny new Integrations and the Advanced Research mode are rolling out now in beta for folks on Anthropic’s paid Max, Team, and Enterprise plans. If you’re on the Pro plan, don’t worry – access is coming your way soon.

Also worth noting: the standard web search feature inside Claude is now available everywhere, for everyone on any paid Claude.ai plan (Pro and up). No more geographical restrictions on that front.

Putting it all together, these updates and integrations show Anthropic is serious about making Claude genuinely useful in a professional context. By letting it plug directly into the tools we already use and giving it more powerful ways to analyse information, they’re pushing Claude towards being less of a novelty and more of an essential part of the modern toolkit.

(Image credit: Anthropic)

See also: Baidu ERNIE X1 and 4.5 Turbo boast high performance at low cost

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Claude Integrations: Anthropic adds AI to your favourite work tools appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/claude-integrations-anthropic-adds-ai-favourite-work-tools/feed/ 0
Meta beefs up AI security with new Llama tools  https://www.artificialintelligence-news.com/news/meta-beefs-up-ai-security-new-llama-tools/ https://www.artificialintelligence-news.com/news/meta-beefs-up-ai-security-new-llama-tools/#respond Wed, 30 Apr 2025 13:35:22 +0000 https://www.artificialintelligence-news.com/?p=106233 If you’re building with AI, or trying to defend against the less savoury side of the technology, Meta just dropped new Llama security tools. The improved security tools for the Llama AI models arrive alongside fresh resources from Meta designed to help cybersecurity teams harness AI for defence. It’s all part of their push to […]

The post Meta beefs up AI security with new Llama tools  appeared first on AI News.

]]>
If you’re building with AI, or trying to defend against the less savoury side of the technology, Meta just dropped new Llama security tools.

The improved security tools for the Llama AI models arrive alongside fresh resources from Meta designed to help cybersecurity teams harness AI for defence. It’s all part of their push to make developing and using AI a bit safer for everyone involved.

Developers working with the Llama family of models now have some upgraded kit to play with. You can grab these latest Llama Protection tools directly from Meta’s own Llama Protections page, or find them where many developers live: Hugging Face and GitHub.

First up is Llama Guard 4. Think of it as an evolution of Meta’s customisable safety filter for AI. The big news here is that it’s now multimodal so it can understand and apply safety rules not just to text, but to images as well. That’s crucial as AI applications get more visual. This new version is also being baked into Meta’s brand-new Llama API, which is currently in a limited preview.

Then there’s LlamaFirewall. This is a new piece of the puzzle from Meta, designed to act like a security control centre for AI systems. It helps manage different safety models working together and hooks into Meta’s other protection tools. Its job? To spot and block the kind of risks that keep AI developers up at night – things like clever ‘prompt injection’ attacks designed to trick the AI, potentially dodgy code generation, or risky behaviour from AI plug-ins.

Meta has also given its Llama Prompt Guard a tune-up. The main Prompt Guard 2 (86M) model is now better at sniffing out those pesky jailbreak attempts and prompt injections. More interestingly, perhaps, is the introduction of Prompt Guard 2 22M.

Prompt Guard 2 22M is a much smaller, nippier version. Meta reckons it can slash latency and compute costs by up to 75% compared to the bigger model, without sacrificing too much detection power. For anyone needing faster responses or working on tighter budgets, that’s a welcome addition.

But Meta isn’t just focusing on the AI builders; they’re also looking at the cyber defenders on the front lines of digital security. They’ve heard the calls for better AI-powered tools to help in the fight against cyberattacks, and they’re sharing some updates aimed at just that.

The CyberSec Eval 4 benchmark suite has been updated. This open-source toolkit helps organisations figure out how good AI systems actually are at security tasks. This latest version includes two new tools:

  • CyberSOC Eval: Built with the help of cybersecurity experts CrowdStrike, this framework specifically measures how well AI performs in a real Security Operation Centre (SOC) environment. It’s designed to give a clearer picture of AI’s effectiveness in threat detection and response. The benchmark itself is coming soon.
  • AutoPatchBench: This benchmark tests how good Llama and other AIs are at automatically finding and fixing security holes in code before the bad guys can exploit them.

To help get these kinds of tools into the hands of those who need them, Meta is kicking off the Llama Defenders Program. This seems to be about giving partner companies and developers special access to a mix of AI solutions – some open-source, some early-access, some perhaps proprietary – all geared towards different security challenges.

As part of this, Meta is sharing an AI security tool they use internally: the Automated Sensitive Doc Classification Tool. It automatically slaps security labels on documents inside an organisation. Why? To stop sensitive info from walking out the door, or to prevent it from being accidentally fed into an AI system (like in RAG setups) where it could be leaked.

They’re also tackling the problem of fake audio generated by AI, which is increasingly used in scams. The Llama Generated Audio Detector and Llama Audio Watermark Detector are being shared with partners to help them spot AI-generated voices in potential phishing calls or fraud attempts. Companies like ZenDesk, Bell Canada, and AT&T are already lined up to integrate these.

Finally, Meta gave a sneak peek at something potentially huge for user privacy: Private Processing. This is new tech they’re working on for WhatsApp. The idea is to let AI do helpful things like summarise your unread messages or help you draft replies, but without Meta or WhatsApp being able to read the content of those messages.

Meta is being quite open about the security side, even publishing their threat model and inviting security researchers to poke holes in the architecture before it ever goes live. It’s a sign they know they need to get the privacy aspect right.

Overall, it’s a broad set of AI security announcements from Meta. They’re clearly trying to put serious muscle behind securing the AI they build, while also giving the wider tech community better tools to build safely and defend effectively.

See also: Alarming rise in AI-powered scams: Microsoft reveals $4B in thwarted fraud

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Meta beefs up AI security with new Llama tools  appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/meta-beefs-up-ai-security-new-llama-tools/feed/ 0
Baidu ERNIE X1 and 4.5 Turbo boast high performance at low cost https://www.artificialintelligence-news.com/news/baidu-ernie-x1-and-4-5-turbo-high-performance-low-cost/ https://www.artificialintelligence-news.com/news/baidu-ernie-x1-and-4-5-turbo-high-performance-low-cost/#respond Fri, 25 Apr 2025 12:28:01 +0000 https://www.artificialintelligence-news.com/?p=106047 Baidu has unveiled ERNIE X1 Turbo and 4.5 Turbo, two fast models that boast impressive performance alongside dramatic cost reductions. Developed as enhancements to the existing ERNIE X1 and 4.5 models, both new Turbo versions highlight multimodal processing, robust reasoning skills, and aggressive pricing strategies designed to capture developer interest and marketshare. Baidu ERNIE X1 […]

The post Baidu ERNIE X1 and 4.5 Turbo boast high performance at low cost appeared first on AI News.

]]>
Baidu has unveiled ERNIE X1 Turbo and 4.5 Turbo, two fast models that boast impressive performance alongside dramatic cost reductions.

Developed as enhancements to the existing ERNIE X1 and 4.5 models, both new Turbo versions highlight multimodal processing, robust reasoning skills, and aggressive pricing strategies designed to capture developer interest and marketshare.

Baidu ERNIE X1 Turbo: Deep reasoning meets cost efficiency

Positioned as a deep-thinking reasoning model, ERNIE X1 Turbo tackles complex tasks requiring sophisticated understanding. It enters a competitive field, claiming superior performance in some benchmarks against rivals like DeepSeek R1, V3, and OpenAI o1:

Benchmarks of Baidu ERNIE X1 Turbo compared to rival AI large language models like DeepSeek R1 and OpenAI o1.

Key to X1 Turbo’s enhanced capabilities is an advanced “chain of thought” process, enabling more structured and logical problem-solving.

Furthermore, ERNIE X1 Turbo boasts improved multimodal functions – the ability to understand and process information beyond just text, potentially including images or other data types – alongside refined tool utilisation abilities. This makes it particularly well-suited for nuanced applications such as literary creation, complex logical reasoning challenges, code generation, and intricate instruction following.

ERNIE X1 Turbo achieves this performance while undercutting competitor pricing. Input token costs start at $0.14 per million tokens, with output tokens priced at $0.55 per million. This pricing structure is approximately 25% of DeepSeek R1.

Baidu ERNIE 4.5 Turbo: Multimodal muscle at a fraction of the cost

Sharing the spotlight is ERNIE 4.5 Turbo, which focuses on delivering upgraded multimodal features and significantly faster response times compared to its non-Turbo counterpart. The emphasis here is on providing a versatile, responsive AI experience while slashing operational costs.

The model achieves an 80% price reduction compared to the original ERNIE 4.5 with input set at $0.11 per million tokens and output at $0.44 per million tokens. This represents roughly 40% of the cost of the latest version of DeepSeek V3, again highlighting a deliberate strategy to attract users through cost-effectiveness.

Performance benchmarks further bolster its credentials. In multiple tests evaluating both multimodal and text capabilities, Baidu ERNIE 4.5 Turbo outperforms OpenAI’s highly-regarded GPT-4o model. 

In multimodal capability assessments, ERNIE 4.5 Turbo achieved an average score of 77.68 to surpass GPT-4o’s score of 72.76 in the same tests.

Benchmarks of Baidu ERNIE 4.5 Turbo compared to rival AI large language models like DeepSeek R1 and OpenAI o1.

While benchmark results always require careful interpretation, this suggests ERNIE 4.5 Turbo is a serious contender for tasks involving an integrated understanding of different data types.

Baidu continues to shake up the AI marketplace

The launch of ERNIE X1 Turbo and 4.5 Turbo signifies a growing trend in the AI sector: the democratisation of high-end capabilities. While foundational models continue to push the boundaries of performance, there is increasing demand for models that balance power with accessibility and affordability.

By lowering the price points for models with sophisticated reasoning and multimodal features, the Baidu ERNIE Turbo series could enable a wider range of developers and businesses to integrate advanced AI into their applications.

This competitive pricing puts pressure on established players like OpenAI and Anthropic, as well as emerging competitors like DeepSeek, potentially leading to further price adjustments across the market.

(Image Credit: Alpha Photo under CC BY-NC 2.0 license)

See also: China’s MCP adoption: AI assistants that actually do things

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Baidu ERNIE X1 and 4.5 Turbo boast high performance at low cost appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/baidu-ernie-x1-and-4-5-turbo-high-performance-low-cost/feed/ 0
RAGEN: AI framework tackles LLM agent instability https://www.artificialintelligence-news.com/news/ragen-ai-framework-tackles-llm-agent-instability/ https://www.artificialintelligence-news.com/news/ragen-ai-framework-tackles-llm-agent-instability/#respond Thu, 24 Apr 2025 16:06:47 +0000 https://www.artificialintelligence-news.com/?p=106040 Researchers have introduced RAGEN, an AI framework designed to counter LLM agent instability when handling complex situations. Training these AI agents presents significant hurdles, particularly when decisions span multiple steps and involve unpredictable feedback from the environment. While reinforcement learning (RL) has shown promise in static tasks like solving maths problems or generating code, its […]

The post RAGEN: AI framework tackles LLM agent instability appeared first on AI News.

]]>
Researchers have introduced RAGEN, an AI framework designed to counter LLM agent instability when handling complex situations.

Training these AI agents presents significant hurdles, particularly when decisions span multiple steps and involve unpredictable feedback from the environment. While reinforcement learning (RL) has shown promise in static tasks like solving maths problems or generating code, its application to dynamic, multi-turn agent training has been less explored.   

Addressing this gap, a collaborative team from institutions including Northwestern University, Stanford University, Microsoft, and New York University has proposed StarPO (State-Thinking-Actions-Reward Policy Optimisation).

StarPO offers a generalised approach for training agents at the trajectory level (i.e. it optimises the entire sequence of interactions, not just individual actions.)

Accompanying this is RAGEN, a modular system built to implement StarPO. This enables the training and evaluation of LLM agents, particularly focusing on their reasoning capabilities under RL. RAGEN provides the necessary infrastructure for rollouts, reward assignment, and optimisation within multi-turn, stochastic (randomly determined) environments.

Minimalist environments, maximum insight

To isolate the core learning challenges from confounding factors like extensive pre-existing knowledge or task-specific engineering, the researchers tested LLMs using RAGEN in three deliberately minimalistic, controllable symbolic gaming environments:   

  1. Bandit: A single-turn, stochastic task testing risk-sensitive symbolic reasoning. The agent chooses between options (like ‘Phoenix’ or ‘Dragon’ arms) with different, initially unknown, reward profiles.
  2. Sokoban: A multi-turn, deterministic puzzle requiring foresight and planning, as actions (pushing boxes) are irreversible.
  3. Frozen Lake: A multi-turn, stochastic grid navigation task where movement attempts can randomly fail, demanding planning under uncertainty.

These environments allow for clear analysis of how agents learn decision-making policies purely through interaction.   

Key findings: Stability, rollouts, and reasoning

The study yielded three significant findings concerning the training of self-evolving LLM agents:

The ‘Echo Trap’ and the need for stability

A recurring problem observed during multi-turn RL training was dubbed the “Echo Trap”. Agents would initially improve but then suffer performance collapse, overfitting to locally rewarded reasoning patterns. 

This was marked by collapsing reward variance, falling entropy (a measure of randomness/exploration), and sudden spikes in gradients (indicating training instability). Early signs included drops in reward standard deviation and output entropy.   

To combat this, the team developed StarPO-S, a stabilised version of the framework. StarPO-S incorporates:   

  • Variance-based trajectory filtering: Focusing training on task instances where the agent’s behaviour shows higher uncertainty (higher reward variance), discarding low-variance, less informative rollouts. This improved stability and efficiency.   
  • Critic incorporation: Using methods like PPO (Proximal Policy Optimisation), which employ a ‘critic’ to estimate value, generally showed better stability than critic-free methods like GRPO (Group Relative Policy Optimisation) in most tests.   
  • Decoupled clipping and KL removal: Techniques adapted from other research (DAPO) involving asymmetric clipping (allowing more aggressive learning from positive rewards) and removing KL divergence penalties (encouraging exploration) further boosted stability and performance.   

StarPO-S consistently delayed collapse and improved final task performance compared to vanilla StarPO.   

Rollout quality is crucial

The characteristics of the ‘rollouts’ (simulated interaction trajectories used for training) significantly impact learning. Key factors identified include:   

  • Task diversity: Training with a diverse set of initial states (prompts), but with multiple responses generated per prompt, aids generalisation. The sweet spot seemed to be moderate diversity enabling contrast between different outcomes in similar scenarios.   
  • Interaction granularity: Allowing multiple actions per turn (around 5-6 proved optimal) enables better planning within a fixed turn limit, without introducing the noise associated with excessively long action sequences.   
  • Rollout frequency: Using fresh, up-to-date rollouts that reflect the agent’s current policy is vital. More frequent sampling (approaching an ‘online’ setting) leads to faster convergence and better generalisation by reducing policy-data mismatch.

Maintaining freshness, alongside appropriate action budgets and task diversity, is key for stable training.   

Reasoning requires careful reward design

Simply prompting models to ‘think’ doesn’t guarantee meaningful reasoning emerges, especially in multi-turn tasks. The study found:

  • Reasoning traces helped generalisation in the simpler, single-turn Bandit task, even when symbolic cues conflicted with rewards.   
  • In multi-turn tasks like Sokoban, reasoning benefits were limited, and the length of ‘thinking’ segments consistently declined during training. Agents often regressed to direct action selection or produced “hallucinated reasoning” if rewards only tracked task success, revealing a “mismatch between thoughts and environment states.”

This suggests that standard trajectory-level rewards (often sparse and outcome-based) are insufficient. 

“Without fine-grained, reasoning-aware reward signals, agent reasoning hardly emerge[s] through multi-turn RL.”

The researchers propose that future work should explore rewards that explicitly evaluate the quality of intermediate reasoning steps, perhaps using format-based penalties or rewarding explanation quality, rather than just final outcomes.   

RAGEN and StarPO: A step towards self-evolving AI

The RAGEN system and StarPO framework represent a step towards training LLM agents that can reason and adapt through interaction in complex, unpredictable environments.

This research highlights the unique stability challenges posed by multi-turn RL and offers concrete strategies – like StarPO-S’s filtering and stabilisation techniques – to mitigate them. It also underscores the critical role of rollout generation strategies and the need for more sophisticated reward mechanisms to cultivate genuine reasoning, rather than superficial strategies or hallucinations.

While acknowledging limitations – including the need to test on larger models and optimise for domains without easily verifiable rewards – the work opens “a scalable and principled path for building AI systems” in areas demanding complex interaction and verifiable outcomes, such as theorem proving, software engineering, and scientific discovery.

(Image by Gerd Altmann)

See also: How does AI judge? Anthropic studies the values of Claude

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post RAGEN: AI framework tackles LLM agent instability appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/ragen-ai-framework-tackles-llm-agent-instability/feed/ 0
Meta FAIR advances human-like AI with five major releases https://www.artificialintelligence-news.com/news/meta-fair-advances-human-like-ai-five-major-releases/ https://www.artificialintelligence-news.com/news/meta-fair-advances-human-like-ai-five-major-releases/#respond Thu, 17 Apr 2025 16:00:05 +0000 https://www.artificialintelligence-news.com/?p=105371 The Fundamental AI Research (FAIR) team at Meta has announced five projects advancing the company’s pursuit of advanced machine intelligence (AMI). The latest releases from Meta focus heavily on enhancing AI perception – the ability for machines to process and interpret sensory information – alongside advancements in language modelling, robotics, and collaborative AI agents. Meta […]

The post Meta FAIR advances human-like AI with five major releases appeared first on AI News.

]]>
The Fundamental AI Research (FAIR) team at Meta has announced five projects advancing the company’s pursuit of advanced machine intelligence (AMI).

The latest releases from Meta focus heavily on enhancing AI perception – the ability for machines to process and interpret sensory information – alongside advancements in language modelling, robotics, and collaborative AI agents.

Meta stated its goal involves creating machines “that are able to acquire, process, and interpret sensory information about the world around us and are able to use this information to make decisions with human-like intelligence and speed.”

The five new releases represent diverse but interconnected efforts towards achieving this ambitious goal.

Perception Encoder: Meta sharpens the ‘vision’ of AI

Central to the new releases is the Perception Encoder, described as a large-scale vision encoder designed to excel across various image and video tasks.

Vision encoders function as the “eyes” for AI systems, allowing them to understand visual data.

Meta highlights the increasing challenge of building encoders that meet the demands of advanced AI, requiring capabilities that bridge vision and language, handle both images and videos effectively, and remain robust under challenging conditions, including potential adversarial attacks.

The ideal encoder, according to Meta, should recognise a wide array of concepts while distinguishing subtle details—citing examples like spotting “a stingray burrowed under the sea floor, identifying a tiny goldfinch in the background of an image, or catching a scampering agouti on a night vision wildlife camera.”

Meta claims the Perception Encoder achieves “exceptional performance on image and video zero-shot classification and retrieval, surpassing all existing open source and proprietary models for such tasks.”

Furthermore, its perceptual strengths reportedly translate well to language tasks. 

When aligned with a large language model (LLM), the encoder is said to outperform other vision encoders in areas like visual question answering (VQA), captioning, document understanding, and grounding (linking text to specific image regions). It also reportedly boosts performance on tasks traditionally difficult for LLMs, such as understanding spatial relationships (e.g., “if one object is behind another”) or camera movement relative to an object.

“As Perception Encoder begins to be integrated into new applications, we’re excited to see how its advanced vision capabilities will enable even more capable AI systems,” Meta said.

Perception Language Model (PLM): Open research in vision-language

Complementing the encoder is the Perception Language Model (PLM), an open and reproducible vision-language model aimed at complex visual recognition tasks. 

PLM was trained using large-scale synthetic data combined with open vision-language datasets, explicitly without distilling knowledge from external proprietary models.

Recognising gaps in existing video understanding data, the FAIR team collected 2.5 million new, human-labelled samples focused on fine-grained video question answering and spatio-temporal captioning. Meta claims this forms the “largest dataset of its kind to date.”

PLM is offered in 1, 3, and 8 billion parameter versions, catering to academic research needs requiring transparency.

Alongside the models, Meta is releasing PLM-VideoBench, a new benchmark specifically designed to test capabilities often missed by existing benchmarks, namely “fine-grained activity understanding and spatiotemporally grounded reasoning.”

Meta hopes the combination of open models, the large dataset, and the challenging benchmark will empower the open-source community.

Meta Locate 3D: Giving robots situational awareness

Bridging the gap between language commands and physical action is Meta Locate 3D. This end-to-end model aims to allow robots to accurately localise objects in a 3D environment based on open-vocabulary natural language queries.

Meta Locate 3D processes 3D point clouds directly from RGB-D sensors (like those found on some robots or depth-sensing cameras). Given a textual prompt, such as “flower vase near TV console,” the system considers spatial relationships and context to pinpoint the correct object instance, distinguishing it from, say, a “vase on the table.”

The system comprises three main parts: a preprocessing step converting 2D features to 3D featurised point clouds; the 3D-JEPA encoder (a pretrained model creating a contextualised 3D world representation); and the Locate 3D decoder, which takes the 3D representation and the language query to output bounding boxes and masks for the specified objects.

Alongside the model, Meta is releasing a substantial new dataset for object localisation based on referring expressions. It includes 130,000 language annotations across 1,346 scenes from the ARKitScenes, ScanNet, and ScanNet++ datasets, effectively doubling existing annotated data in this area.

Meta sees this technology as crucial for developing more capable robotic systems, including its own PARTNR robot project, enabling more natural human-robot interaction and collaboration.

Dynamic Byte Latent Transformer: Efficient and robust language modelling

Following research published in late 2024, Meta is now releasing the model weights for its 8-billion parameter Dynamic Byte Latent Transformer.

This architecture represents a shift away from traditional tokenisation-based language models, operating instead at the byte level. Meta claims this approach achieves comparable performance at scale while offering significant improvements in inference efficiency and robustness.

Traditional LLMs break text into ‘tokens’, which can struggle with misspellings, novel words, or adversarial inputs. Byte-level models process raw bytes, potentially offering greater resilience.

Meta reports that the Dynamic Byte Latent Transformer “outperforms tokeniser-based models across various tasks, with an average robustness advantage of +7 points (on perturbed HellaSwag), and reaching as high as +55 points on tasks from the CUTE token-understanding benchmark.”

By releasing the weights alongside the previously shared codebase, Meta encourages the research community to explore this alternative approach to language modelling.

Collaborative Reasoner: Meta advances socially-intelligent AI agents

The final release, Collaborative Reasoner, tackles the complex challenge of creating AI agents that can effectively collaborate with humans or other AIs.

Meta notes that human collaboration often yields superior results, and aims to imbue AI with similar capabilities for tasks like helping with homework or job interview preparation.

Such collaboration requires not just problem-solving but also social skills like communication, empathy, providing feedback, and understanding others’ mental states (theory-of-mind), often unfolding over multiple conversational turns.

Current LLM training and evaluation methods often neglect these social and collaborative aspects. Furthermore, collecting relevant conversational data is expensive and difficult.

Collaborative Reasoner provides a framework to evaluate and enhance these skills. It includes goal-oriented tasks requiring multi-step reasoning achieved through conversation between two agents. The framework tests abilities like disagreeing constructively, persuading a partner, and reaching a shared best solution.

Meta’s evaluations revealed that current models struggle to consistently leverage collaboration for better outcomes. To address this, they propose a self-improvement technique using synthetic interaction data where an LLM agent collaborates with itself.

Generating this data at scale is enabled by a new high-performance model serving engine called Matrix. Using this approach on maths, scientific, and social reasoning tasks reportedly yielded improvements of up to 29.4% compared to the standard ‘chain-of-thought’ performance of a single LLM.

By open-sourcing the data generation and modelling pipeline, Meta aims to foster further research into creating truly “social agents that can partner with humans and other agents.”

These five releases collectively underscore Meta’s continued heavy investment in fundamental AI research, particularly focusing on building blocks for machines that can perceive, understand, and interact with the world in more human-like ways. 

See also: Meta will train AI models using EU user data

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Meta FAIR advances human-like AI with five major releases appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/meta-fair-advances-human-like-ai-five-major-releases/feed/ 0
Meta will train AI models using EU user data https://www.artificialintelligence-news.com/news/meta-will-train-ai-models-using-eu-user-data/ https://www.artificialintelligence-news.com/news/meta-will-train-ai-models-using-eu-user-data/#respond Tue, 15 Apr 2025 16:32:02 +0000 https://www.artificialintelligence-news.com/?p=105325 Meta has confirmed plans to utilise content shared by its adult users in the EU (European Union) to train its AI models. The announcement follows the recent launch of Meta AI features in Europe and aims to enhance the capabilities and cultural relevance of its AI systems for the region’s diverse population.    In a statement, […]

The post Meta will train AI models using EU user data appeared first on AI News.

]]>
Meta has confirmed plans to utilise content shared by its adult users in the EU (European Union) to train its AI models.

The announcement follows the recent launch of Meta AI features in Europe and aims to enhance the capabilities and cultural relevance of its AI systems for the region’s diverse population.   

In a statement, Meta wrote: “Today, we’re announcing our plans to train AI at Meta using public content – like public posts and comments – shared by adults on our products in the EU.

“People’s interactions with Meta AI – like questions and queries – will also be used to train and improve our models.”

Starting this week, users of Meta’s platforms (including Facebook, Instagram, WhatsApp, and Messenger) within the EU will receive notifications explaining the data usage. These notifications, delivered both in-app and via email, will detail the types of public data involved and link to an objection form.

“We have made this objection form easy to find, read, and use, and we’ll honor all objection forms we have already received, as well as newly submitted ones,” Meta explained.

Meta explicitly clarified that certain data types remain off-limits for AI training purposes.

The company says it will not “use people’s private messages with friends and family” to train its generative AI models. Furthermore, public data associated with accounts belonging to users under the age of 18 in the EU will not be included in the training datasets.

Meta wants to build AI tools designed for EU users

Meta positions this initiative as a necessary step towards creating AI tools designed for EU users. Meta launched its AI chatbot functionality across its messaging apps in Europe last month, framing this data usage as the next phase in improving the service.

“We believe we have a responsibility to build AI that’s not just available to Europeans, but is actually built for them,” the company explained. 

“That means everything from dialects and colloquialisms, to hyper-local knowledge and the distinct ways different countries use humor and sarcasm on our products.”

This becomes increasingly pertinent as AI models evolve with multi-modal capabilities spanning text, voice, video, and imagery.   

Meta also situated its actions in the EU within the broader industry landscape, pointing out that training AI on user data is common practice.

“It’s important to note that the kind of AI training we’re doing is not unique to Meta, nor will it be unique to Europe,” the statement reads. 

“We’re following the example set by others including Google and OpenAI, both of which have already used data from European users to train their AI models.”

Meta further claimed its approach surpasses others in openness, stating, “We’re proud that our approach is more transparent than many of our industry counterparts.”   

Regarding regulatory compliance, Meta referenced prior engagement with regulators, including a delay initiated last year while awaiting clarification on legal requirements. The company also cited a favourable opinion from the European Data Protection Board (EDPB) in December 2024.

“We welcome the opinion provided by the EDPB in December, which affirmed that our original approach met our legal obligations,” wrote Meta.

Broader concerns over AI training data

While Meta presents its approach in the EU as transparent and compliant, the practice of using vast swathes of public user data from social media platforms to train large language models (LLMs) and generative AI continues to raise significant concerns among privacy advocates.

Firstly, the definition of “public” data can be contentious. Content shared publicly on platforms like Facebook or Instagram may not have been posted with the expectation that it would become raw material for training commercial AI systems capable of generating entirely new content or insights. Users might share personal anecdotes, opinions, or creative works publicly within their perceived community, without envisaging its large-scale, automated analysis and repurposing by the platform owner.

Secondly, the effectiveness and fairness of an “opt-out” system versus an “opt-in” system remain debatable. Placing the onus on users to actively object, often after receiving notifications buried amongst countless others, raises questions about informed consent. Many users may not see, understand, or act upon the notification, potentially leading to their data being used by default rather than explicit permission.

Thirdly, the issue of inherent bias looms large. Social media platforms reflect and sometimes amplify societal biases, including racism, sexism, and misinformation. AI models trained on this data risk learning, replicating, and even scaling these biases. While companies employ filtering and fine-tuning techniques, eradicating bias absorbed from billions of data points is an immense challenge. An AI trained on European public data needs careful curation to avoid perpetuating stereotypes or harmful generalisations about the very cultures it aims to understand.   

Furthermore, questions surrounding copyright and intellectual property persist. Public posts often contain original text, images, and videos created by users. Using this content to train commercial AI models, which may then generate competing content or derive value from it, enters murky legal territory regarding ownership and fair compensation—issues currently being contested in courts worldwide involving various AI developers.

Finally, while Meta highlights its transparency relative to competitors, the actual mechanisms of data selection, filtering, and its specific impact on model behaviour often remain opaque. Truly meaningful transparency would involve deeper insights into how specific data influences AI outputs and the safeguards in place to prevent misuse or unintended consequences.

The approach taken by Meta in the EU underscores the immense value technology giants place on user-generated content as fuel for the burgeoning AI economy. As these practices become more widespread, the debate surrounding data privacy, informed consent, algorithmic bias, and the ethical responsibilities of AI developers will undoubtedly intensify across Europe and beyond.

(Photo by Julio Lopez)

See also: Apple AI stresses privacy with synthetic and anonymised data

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Meta will train AI models using EU user data appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/meta-will-train-ai-models-using-eu-user-data/feed/ 0
BCG: Analysing the geopolitics of generative AI https://www.artificialintelligence-news.com/news/bcg-analysing-the-geopolitics-of-generative-ai/ https://www.artificialintelligence-news.com/news/bcg-analysing-the-geopolitics-of-generative-ai/#respond Fri, 11 Apr 2025 16:11:17 +0000 https://www.artificialintelligence-news.com/?p=105294 Generative AI is reshaping global competition and geopolitics, presenting challenges and opportunities for nations and businesses alike. Senior figures from Boston Consulting Group (BCG) and its tech division, BCG X, discussed the intricate dynamics of the global AI race, the dominance of superpowers like the US and China, the role of emerging “middle powers,” and […]

The post BCG: Analysing the geopolitics of generative AI appeared first on AI News.

]]>
Generative AI is reshaping global competition and geopolitics, presenting challenges and opportunities for nations and businesses alike.

Senior figures from Boston Consulting Group (BCG) and its tech division, BCG X, discussed the intricate dynamics of the global AI race, the dominance of superpowers like the US and China, the role of emerging “middle powers,” and the implications for multinational corporations.

AI investments expose businesses to increasingly tense geopolitics

Sylvain Duranton, Global Leader at BCG X, noted the significant geopolitical risk companies face: “For large companies, close to half of them, 44%, have teams around the world, not just in one country where their headquarters are.”

Sylvain Duranton, Global Leader at BCG X

Many of these businesses operate across numerous countries, making them vulnerable to differing regulations and sovereignty issues. “They’ve built their AI teams and ecosystem far before there was such tension around the world.”

Duranton also pointed to the stark imbalance in the AI supply race, particularly in investment.

Comparing the market capitalisation of tech companies, the US dwarfs Europe by a factor of 20 and the Asia Pacific region by five. Investment figures paint a similar picture, showing a “completely disproportionate” imbalance compared to the relative sizes of the economies.

This AI race is fuelled by massive investments in compute power, frontier models, and the emergence of lighter, open-weight models changing the competitive dynamic.   

Benchmarking national AI capabilities

Nikolaus Lang, Global Leader at the BCG Henderson Institute – BCG’s think tank – detailed the extensive research undertaken to benchmark national GenAI capabilities objectively.

The team analysed the “upstream of GenAI,” focusing on large language model (LLM) development and its six key enablers: capital, computing power, intellectual property, talent, data, and energy.

Using hard data like AI researcher numbers, patents, data centre capacity, and VC investment, they created a comparative analysis. Unsurprisingly, the analysis revealed the US and China as the clear AI frontrunners and maintain leads in geopolitics.

Nikolaus Lang, Global Leader at the BCG Henderson Institute

The US boasts the largest pool of AI specialists (around half a million), immense capital power ($303bn in VC funding, $212bn in tech R&D), and leading compute power (45 GW).

Lang highlighted America’s historical dominance, noting, “the US has been the largest producer of notable AI models with 67%” since 1950, a lead reflected in today’s LLM landscape. This strength is reinforced by “outsized capital power” and strategic restrictions on advanced AI chip access through frameworks like the US AI Diffusion Framework.   

China, the second AI superpower, shows particular strength in data—ranking highly in e-governance and mobile broadband subscriptions, alongside significant data centre capacity (20 GW) and capital power. 

Despite restricted access to the latest chips, Chinese LLMs are rapidly closing the gap with US models. Lang mentioned the emergence of models like DeepSpeech as evidence of this trend, achieved with smaller teams, fewer GPU hours, and previous-generation chips.

China’s progress is also fuelled by heavy investment in AI academic institutions (hosting 45 of the world’s top 100), a leading position in AI patent applications, and significant government-backed VC funding. Lang predicts “governments will play an important role in funding AI work going forward.”

The middle powers: Europe, Middle East, and Asia

Beyond the superpowers, several “middle powers” are carving out niches.

  • EU: While trailing the US and China, the EU holds the third spot with significant data centre capacity (8 GW) and the world’s second-largest AI talent pool (275,000 specialists) when capabilities are combined. Europe also leads in top AI publications. Lang stressed the need for bundled capacities, suggesting AI, defence, and renewables are key areas for future EU momentum.
  • Middle East (UAE & Saudi Arabia): These nations leverage strong capital power via sovereign wealth funds and competitively low electricity prices to attract talent and build compute power, aiming to become AI drivers “from scratch”. They show positive dynamics in attracting AI specialists and are climbing the ranks in AI publications.   
  • Asia (Japan & South Korea): Leveraging strong existing tech ecosystems in hardware and gaming, these countries invest heavily in R&D (around $207bn combined by top tech firms). Government support, particularly in Japan, fosters both supply and demand. Local LLMs and strategic investments by companies like Samsung and SoftBank demonstrate significant activity.   
  • Singapore: Singapore is boosting its AI ecosystem by focusing on talent upskilling programmes, supporting Southeast Asia’s first LLM, ensuring data centre capacity, and fostering adoption through initiatives like establishing AI centres of excellence.   

The geopolitics of generative AI: Strategy and sovereignty

The geopolitics of generative AI is being shaped by four clear dynamics: the US retains its lead, driven by an unrivalled tech ecosystem; China is rapidly closing the gap; middle powers face a strategic choice between building supply or accelerating adoption; and government funding is set to play a pivotal role, particularly as R&D costs climb and commoditisation sets in.

As geopolitical tensions mount, businesses are likely to diversify their GenAI supply chains to spread risk. The race ahead will be defined by how nations and companies navigate the intersection of innovation, policy, and resilience.

(Photo by Markus Krisetya)

See also: OpenAI counter-sues Elon Musk for attempts to ‘take down’ AI rival

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post BCG: Analysing the geopolitics of generative AI appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/bcg-analysing-the-geopolitics-of-generative-ai/feed/ 0
OpenAI counter-sues Elon Musk for attempts to ‘take down’ AI rival https://www.artificialintelligence-news.com/news/openai-counter-sues-elon-musk-attempts-take-down-ai-rival/ https://www.artificialintelligence-news.com/news/openai-counter-sues-elon-musk-attempts-take-down-ai-rival/#respond Thu, 10 Apr 2025 12:05:31 +0000 https://www.artificialintelligence-news.com/?p=105285 OpenAI has launched a legal counteroffensive against one of its co-founders, Elon Musk, and his competing AI venture, xAI. In court documents filed yesterday, OpenAI accuses Musk of orchestrating a “relentless” and “malicious” campaign designed to “take down OpenAI” after he left the organisation years ago. The court filing, submitted to the US District Court […]

The post OpenAI counter-sues Elon Musk for attempts to ‘take down’ AI rival appeared first on AI News.

]]>
OpenAI has launched a legal counteroffensive against one of its co-founders, Elon Musk, and his competing AI venture, xAI.

In court documents filed yesterday, OpenAI accuses Musk of orchestrating a “relentless” and “malicious” campaign designed to “take down OpenAI” after he left the organisation years ago.

The court filing, submitted to the US District Court for the Northern District of California, alleges Musk could not tolerate OpenAI’s success after he had “abandoned and declared [it] doomed.”

OpenAI is now seeking legal remedies, including an injunction to stop Musk’s alleged “unlawful and unfair action” and compensation for damages already caused.   

Origin story of OpenAI and the departure of Elon Musk

The legal documents recount OpenAI’s origins in 2015, stemming from an idea discussed by current CEO Sam Altman and President Greg Brockman to create an AI lab focused on developing artificial general intelligence (AGI) – AI capable of outperforming humans – for the “benefit of all humanity.”

Musk was involved in the launch, serving on the initial non-profit board and pledging $1 billion in donations.   

However, the relationship fractured. OpenAI claims that between 2017 and 2018, Musk’s demands for “absolute control” of the enterprise – or its potential absorption into Tesla – were rebuffed by Altman, Brockman, and then-Chief Scientist Ilya Sutskever. The filing quotes Sutskever warning Musk against creating an “AGI dictatorship.”

Following this disagreement, OpenAI alleges Elon Musk quit in February 2018, declaring the venture would fail without him and that he would pursue AGI development at Tesla instead. Critically, OpenAI contends the pledged $1 billion “was never satisfied—not even close”.   

Restructuring, success, and Musk’s alleged ‘malicious’ campaign

Facing escalating costs for computing power and talent retention, OpenAI restructured and created a “capped-profit” entity in 2019 to attract investment while remaining controlled by the non-profit board and bound by its mission. This structure, OpenAI states, was announced publicly and Musk was offered equity in the new entity but declined and raised no objection at the time.   

OpenAI highlights its subsequent breakthroughs – including GPT-3, ChatGPT, and GPT-4 – achieved massive public adoption and critical acclaim. These successes, OpenAI emphasises, were made after the departure of Elon Musk and allegedly spurred his antagonism.

The filing details a chronology of alleged actions by Elon Musk aimed at harming OpenAI:   

  • Founding xAI: Musk “quietly created” his competitor, xAI, in March 2023.   
  • Moratorium call: Days later, Musk supported a call for a development moratorium on AI more advanced than GPT-4, a move OpenAI claims was intended “to stall OpenAI while all others, most notably Musk, caught up”.   
  • Records demand: Musk allegedly made a “pretextual demand” for confidential OpenAI documents, feigning concern while secretly building xAI.   
  • Public attacks: Using his social media platform X (formerly Twitter), Musk allegedly broadcast “press attacks” and “malicious campaigns” to his vast following, labelling OpenAI a “lie,” “evil,” and a “total scam”.   
  • Legal actions: Musk filed lawsuits, first in state court (later withdrawn) and then the current federal action, based on what OpenAI dismisses as meritless claims of a “Founding Agreement” breach.   
  • Regulatory pressure: Musk allegedly urged state Attorneys General to investigate OpenAI and force an asset auction.   
  • “Sham bid”: In February 2025, a Musk-led consortium made a purported $97.375 billion offer for OpenAI, Inc.’s assets. OpenAI derides this as a “sham bid” and a “stunt” lacking evidence of financing and designed purely to disrupt OpenAI’s operations, potential restructuring, fundraising, and relationships with investors and employees, particularly as OpenAI considers evolving its capped-profit arm into a Public Benefit Corporation (PBC). One investor involved allegedly admitted the bid’s aim was to gain “discovery”.   

Based on these allegations, OpenAI asserts two primary counterclaims against both Elon Musk and xAI:

  • Unfair competition: Alleging the “sham bid” constitutes an unfair and fraudulent business practice under California law, intended to disrupt OpenAI and gain an unfair advantage for xAI.   
  • Tortious interference with prospective economic advantage: Claiming the sham bid intentionally disrupted OpenAI’s existing and potential relationships with investors, employees, and customers. 

OpenAI argues Musk’s actions have forced it to divert resources and expend funds, causing harm. They claim his campaign threatens “irreparable harm” to their mission, governance, and crucial business relationships. The filing also touches upon concerns regarding xAI’s own safety record, citing reports of its AI Grok generating harmful content and misinformation.

The counterclaims mark a dramatic escalation in the legal battle between the AI pioneer and its departed co-founder. While Elon Musk initially sued OpenAI alleging a betrayal of its founding non-profit, open-source principles, OpenAI now contends Musk’s actions are a self-serving attempt to undermine a competitor he couldn’t control.

With billions at stake and the future direction of AGI in the balance, this dispute is far from over.

See also: Deep Cogito open LLMs use IDA to outperform same size models

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post OpenAI counter-sues Elon Musk for attempts to ‘take down’ AI rival appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/openai-counter-sues-elon-musk-attempts-take-down-ai-rival/feed/ 0
Deep Cogito open LLMs use IDA to outperform same size models https://www.artificialintelligence-news.com/news/deep-cogito-open-llms-use-ida-outperform-same-size-models/ https://www.artificialintelligence-news.com/news/deep-cogito-open-llms-use-ida-outperform-same-size-models/#respond Wed, 09 Apr 2025 08:03:15 +0000 https://www.artificialintelligence-news.com/?p=105246 Deep Cogito has released several open large language models (LLMs) that outperform competitors and claim to represent a step towards achieving general superintelligence. The San Francisco-based company, which states its mission is “building general superintelligence,” has launched preview versions of LLMs in 3B, 8B, 14B, 32B, and 70B parameter sizes. Deep Cogito asserts that “each […]

The post Deep Cogito open LLMs use IDA to outperform same size models appeared first on AI News.

]]>
Deep Cogito has released several open large language models (LLMs) that outperform competitors and claim to represent a step towards achieving general superintelligence.

The San Francisco-based company, which states its mission is “building general superintelligence,” has launched preview versions of LLMs in 3B, 8B, 14B, 32B, and 70B parameter sizes. Deep Cogito asserts that “each model outperforms the best available open models of the same size, including counterparts from LLAMA, DeepSeek, and Qwen, across most standard benchmarks”.

Impressively, the 70B model from Deep Cogito even surpasses the performance of the recently released Llama 4 109B Mixture-of-Experts (MoE) model.   

Iterated Distillation and Amplification (IDA)

Central to this release is a novel training methodology called Iterated Distillation and Amplification (IDA). 

Deep Cogito describes IDA as “a scalable and efficient alignment strategy for general superintelligence using iterative self-improvement”. This technique aims to overcome the inherent limitations of current LLM training paradigms, where model intelligence is often capped by the capabilities of larger “overseer” models or human curators.

The IDA process involves two key steps iterated repeatedly:

  • Amplification: Using more computation to enable the model to derive better solutions or capabilities, akin to advanced reasoning techniques.
  • Distillation: Internalising these amplified capabilities back into the model’s parameters.

Deep Cogito says this creates a “positive feedback loop” where model intelligence scales more directly with computational resources and the efficiency of the IDA process, rather than being strictly bounded by overseer intelligence.

“When we study superintelligent systems,” the research notes, referencing successes like AlphaGo, “we find two key ingredients enabled this breakthrough: Advanced Reasoning and Iterative Self-Improvement”. IDA is presented as a way to integrate both into LLM training.

Deep Cogito claims IDA is efficient, stating the new models were developed by a small team in approximately 75 days. They also highlight IDA’s potential scalability compared to methods like Reinforcement Learning from Human Feedback (RLHF) or standard distillation from larger models.

As evidence, the company points to their 70B model outperforming Llama 3.3 70B (distilled from a 405B model) and Llama 4 Scout 109B (distilled from a 2T parameter model).

Capabilities and performance of Deep Cogito models

The newly released Cogito models – based on Llama and Qwen checkpoints – are optimised for coding, function calling, and agentic use cases.

A key feature is their dual functionality: “Each model can answer directly (standard LLM), or self-reflect before answering (like reasoning models),” similar to capabilities seen in models like Claude 3.5. However, Deep Cogito notes they “have not optimised for very long reasoning chains,” citing user preference for faster answers and the efficiency of distilling shorter chains.

Extensive benchmark results are provided, comparing Cogito models against size-equivalent state-of-the-art open models in both direct (standard) and reasoning modes.

Across various benchmarks (MMLU, MMLU-Pro, ARC, GSM8K, MATH, etc.) and model sizes (3B, 8B, 14B, 32B, 70B,) the Cogito models generally show significant performance gains over counterparts like Llama 3.1/3.2/3.3 and Qwen 2.5, particularly in reasoning mode.

For instance, the Cogito 70B model achieves 91.73% on MMLU in standard mode (+6.40% vs Llama 3.3 70B) and 91.00% in thinking mode (+4.40% vs Deepseek R1 Distill 70B). Livebench scores also show improvements.

Here are benchmarks of 14B models for a medium-sized comparison:

Benchmark comparison of medium 14B size large language models from Deep Cogito compared to Alibaba Qwen and DeepSeek R1

While acknowledging benchmarks don’t fully capture real-world utility, Deep Cogito expresses confidence in practical performance.

This release is labelled a preview, with Deep Cogito stating they are “still in the early stages of this scaling curve”. They plan to release improved checkpoints for the current sizes and introduce larger MoE models (109B, 400B, 671B) “in the coming weeks / months”. All future models will also be open-source.

(Photo by Pietro Mattia)

See also: Alibaba Cloud targets global AI growth with new models and tools

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Deep Cogito open LLMs use IDA to outperform same size models appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/deep-cogito-open-llms-use-ida-outperform-same-size-models/feed/ 0
Alibaba Cloud targets global AI growth with new models and tools https://www.artificialintelligence-news.com/news/alibaba-cloud-global-ai-growth-new-models-and-tools/ https://www.artificialintelligence-news.com/news/alibaba-cloud-global-ai-growth-new-models-and-tools/#respond Tue, 08 Apr 2025 17:56:13 +0000 https://www.artificialintelligence-news.com/?p=105235 Alibaba Cloud has expanded its AI portfolio for global customers with a raft of new models, platform enhancements, and Software-as-a-Service (SaaS) tools. The announcements, made during its Spring Launch 2025 online event, underscore the drive by Alibaba to accelerate AI innovation and adoption on a global scale. The digital technology and intelligence arm of Alibaba […]

The post Alibaba Cloud targets global AI growth with new models and tools appeared first on AI News.

]]>
Alibaba Cloud has expanded its AI portfolio for global customers with a raft of new models, platform enhancements, and Software-as-a-Service (SaaS) tools.

The announcements, made during its Spring Launch 2025 online event, underscore the drive by Alibaba to accelerate AI innovation and adoption on a global scale.

The digital technology and intelligence arm of Alibaba is focusing on meeting increasing demand for AI-driven digital transformation worldwide.

Selina Yuan, President of International Business at Alibaba Cloud Intelligence, said: “We are launching a series of Platform-as-a-Service(PaaS) and AI capability updates to meet the growing demand for digital transformation from across the globe.

“These upgrades allow us to deliver even more secure and high-performance services that empower businesses to scale and innovate in an AI-driven world.”

Alibaba expands access to foundational AI models

Central to the announcement is the broadened availability of Alibaba Cloud’s proprietary Qwen large language model (LLM) series for international clients, initially accessible via its Singapore availability zones.

This includes several specialised models:

  • Qwen-Max: A large-scale Mixture of Experts (MoE) model.
  • QwQ-Plus: An advanced reasoning model designed for complex analytical tasks, sophisticated question answering, and expert-level mathematical problem-solving.
  • QVQ-Max: A visual reasoning model capable of handling complex multimodal problems, supporting visual input and chain-of-thought output for enhanced accuracy.
  • Qwen2.5-Omni-7b: An end-to-end multimodal model.

These additions provide international businesses with more powerful and diverse tools for developing sophisticated AI applications.

Platform enhancements power AI scale

To support these advanced models, Alibaba Cloud’s Platform for AI (PAI) received significant upgrades aimed at delivering scalable, cost-effective, and user-friendly generative AI solutions.

Key enhancements include the introduction of distributed inference capabilities within the PAI-Elastic Algorithm Service (EAS). Utilising a multi-node architecture, this addresses the computational demands of super-large models – particularly those employing MoE structures or requiring ultra-long-text processing – to overcome limitations inherent in traditional single-node setups.

Furthermore, PAI-EAS now features a prefill-decode disaggregation function designed to boost performance and reduce operational costs.

Alibaba Cloud reported impressive results when deploying this with the Qwen2.5-72B model, achieving a 92% increase in concurrency and a 91% boost in tokens per second (TPS).

The PAI-Model Gallery has also been refreshed, now offering nearly 300 open-source models—including the complete range of Alibaba Cloud’s own open-source Qwen and Wan series. These are accessible via a no-code deployment and management interface.

Additional new PAI-Model Gallery features – like model evaluation and model distillation (transferring knowledge from large to smaller, more cost-effective models) – further enhance its utility.

Alibaba integrates AI into data management

Alibaba Cloud’s flagship cloud-native relational database, PolarDB, now incorporates native AI inference powered by Qwen.

PolarDB’s in-database machine learning capability eliminates the need to move data for inference workflows, which significantly cuts processing latency while improving efficiency and data security.

The feature is optimised for text-centric tasks such as developing conversational Retrieval-Augmented Generation (RAG) agents, generating text embeddings, and performing semantic similarity searches.

Additionally, the company’s data warehouse, AnalyticDB, is now integrated into Alibaba Cloud’s generative AI development platform Model Studio.

This integration serves as the recommended vector database for RAG solutions. This allows organisations to connect their proprietary knowledge bases directly with AI models on the platform to streamline the creation of context-aware applications.

New SaaS tools for industry transformation

Beyond infrastructure and platform layers, Alibaba Cloud introduced two new SaaS AI tools:

  • AI Doc: An intelligent document processing tool using LLMs to parse diverse documents (reports, forms, manuals) efficiently. It extracts specific information and can generate tailored reports, such as ESG reports when integrated with Alibaba Cloud’s Energy Expert sustainability solution.
  • Smart Studio: An AI-powered content creation platform supporting text-to-image, image-to-image, and text-to-video generation. It aims to enhance marketing and creative outputs in sectors like e-commerce, gaming, and entertainment, enabling features like virtual try-ons or generating visuals from text descriptions.

All these developments follow Alibaba’s announcement in February of a $53 billion investment over the next three years dedicated to advancing its cloud computing and AI infrastructure.

This colossal investment, noted as exceeding the company’s total AI and cloud expenditure over the previous decade, highlights a deep commitment to AI-driven growth and solidifies its position as a major global cloud provider.

“As cloud and AI become essential for global growth, we are committed to enhancing our core product offerings to address our customers’ evolving needs,” concludes Yuan.

See also: Amazon Nova Act: A step towards smarter, web-native AI agents

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Alibaba Cloud targets global AI growth with new models and tools appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/alibaba-cloud-global-ai-growth-new-models-and-tools/feed/ 0
Kay Firth-Butterfield, formerly WEF: The future of AI, the metaverse and digital transformation https://www.artificialintelligence-news.com/news/kay-firth-butterfield-formerly-wef-the-future-of-ai-the-metaverse-and-digital-transformation/ https://www.artificialintelligence-news.com/news/kay-firth-butterfield-formerly-wef-the-future-of-ai-the-metaverse-and-digital-transformation/#respond Thu, 03 Apr 2025 06:48:00 +0000 https://www.artificialintelligence-news.com/?p=105112 Kay Firth-Butterfield is a globally recognised leader in ethical artificial intelligence and a distinguished AI ethics speaker. As the former head of AI and Machine Learning at the World Economic Forum (WEF) and one of the foremost voices in AI governance, she has spent her career advocating for technology that enhances, rather than harms, society. […]

The post Kay Firth-Butterfield, formerly WEF: The future of AI, the metaverse and digital transformation appeared first on AI News.

]]>
Kay Firth-Butterfield is a globally recognised leader in ethical artificial intelligence and a distinguished AI ethics speaker. As the former head of AI and Machine Learning at the World Economic Forum (WEF) and one of the foremost voices in AI governance, she has spent her career advocating for technology that enhances, rather than harms, society.

We spoke to Kay to discuss the promise and pitfalls of generative AI, the future of the Metaverse, and how organisations can prepare for a decade of unprecedented digital transformation.

Generative AI has captured global attention, but there’s still a great deal of misunderstanding around what it actually is. Could you walk us through what defines generative AI, how it works, and why it’s considered such a transformative evolution of artificial intelligence?

It’s very exciting because it represents the next iteration of artificial intelligence. What generative AI allows you to do is ask questions of the world’s data simply by typing a prompt. If we think back to science fiction, that’s essentially what we’ve always dreamed of — just being able to ask a computer a question and have it draw on all its knowledge to provide an answer.

How does it do that? Well, it predicts which word is likely to come next in a sequence. It does this by accessing enormous volumes of data. We refer to these as large language models. Essentially, the machine ‘reads’ — or at least accesses — all the data available on the open web. In some cases, and this is an area of legal contention, it also accesses IP-protected and copyrighted material. We can expect a great deal of legal debate in this space.

Once the model has ingested all this data, it begins to predict what word naturally follows another, enabling it to construct highly complex and nuanced responses. Anyone who has experimented with it knows that it can return some surprisingly eloquent and insightful content simply through this predictive capability.

Of course, sometimes it gets things wrong. In the AI community, we call this ‘hallucination’ — essentially, the system fabricates information. That’s a serious issue because in order to rely on AI-generated outputs, we need to reach a point where we can trust the responses. The problem is, once a hallucination enters the data pool, it can be repeated and reinforced by the model.

While much has been said about generative AI’s technical potential, what do you see as the most meaningful societal and business benefits it offers? And what challenges must we address to ensure these advantages are equitably realised?

AI is now accessible to everyone, and that’s incredibly powerful. It’s a hugely democratising tool. It means that small and medium-sized enterprises, which previously couldn’t afford to leverage AI, now can.

However, we also need to be aware that most of the world’s data is created in the United States first, followed by Europe and China. There are clear challenges regarding the datasets these large language models are trained on. They’re not truly using ‘global’ data. They’re working with a limited subset. That has led to discussions around digital colonisation, where content generated from American and European data is projected onto the rest of the world, with an implicit expectation that others will adopt and use it.

Different cultures, of course, require different responses. So, while there are countless benefits to generative AI, there are also significant challenges that we must address if we want to ensure fair and inclusive outcomes.

The Metaverse has seen both hype and hesitation in recent years. From your perspective, what is the current trajectory of the Metaverse, and how do you see its role evolving within business environments over the next five years?

It’s interesting. We went through a phase of huge excitement around the Metaverse, where everyone wanted to be involved. But now we’ve entered more of a Metaverse winter, or perhaps autumn, as it’s become clear just how difficult it is to create compelling content for these immersive spaces.

We’re seeing strong use cases in industrial applications, but we’re still far from achieving that Ready Player One vision — where we live, shop, buy property, and fully interact in 3D virtual environments. That’s largely because the level of compute power and creative resources needed to build truly immersive experiences is enormous.

In five years’ time, I think we’ll start to see the Metaverse delivering on more of its promises for business. Customers may enjoy exceptional shopping experiences—entering virtual stores rather than simply browsing online, where they can ‘feel’ fabrics virtually and make informed decisions in real time.

We may also see remote working evolve, where employees collaborate inside the Metaverse as if they were in the same room. One study found that younger workers often lack adequate supervision when working remotely. In a Metaverse setting, you could offer genuine, interactive supervision and mentorship. It may also help with fostering colleague relationships that are often missed in remote work settings.

Ultimately, the Metaverse removes physical constraints and offers new ways of working and interacting—but we’ll need balance. Many people may not want to spend all their time in fully immersive environments.

Looking ahead, which emerging technologies and AI-driven trends do you anticipate will have the most profound global impact over the next decade. And how should we be preparing for their implications, both economically and ethically?

That’s a great question. It’s a bit like pulling out a crystal ball. But without doubt, generative AI is one of the most significant shifts we’re seeing today. As the technology becomes more refined, it will increasingly power new AI applications through natural language interactions.

Natural Language Processing (NLP) is the AI term for the machine’s ability to understand and interpret human language. In the near future, only elite developers will need to code manually. The rest of us will interact with machines by typing or speaking requests. These systems will not only provide answers, but also write code on our behalf. It’s incredibly powerful, transformative technology.

But there are downsides. One major concern is that AI sometimes fabricates information. And as generative AI becomes more prolific, it’s generating massive volumes of data 24/7. Over time, machine-generated data may outnumber human data, which could distort the digital landscape. We must ensure the AI doesn’t perpetuate falsehoods it has previously generated.

Looking further ahead, this shift raises deep questions about the future of human work. If AI systems can outperform humans in many tasks without fatigue, what becomes of our role? There may be cost savings, but also the very real risk of widespread unemployment.

AI also powers the Metaverse, so progress there is tied to improvements in AI capabilities. I’m also very excited about synthetic biology, which could see huge advancements driven by AI. There’s also likely to be significant interplay between quantum computing and AI, which could bring both benefits and serious challenges.

We’ll see more Internet of Things (IoT) devices as well—but that introduces new issues around security and data protection.

It’s a time of extraordinary opportunity, but also serious risks. Some worry about artificial general intelligence becoming sentient, but I don’t see that as likely just yet. Current models lack causal reasoning. They’re still predictive tools. We would need to add something fundamentally different to reach human-level intelligence. But make no mistake—we are entering an incredibly exciting era.

Adopting new technologies can be both an opportunity and a risk for businesses. In your view, how can organisations strike the right balance between embracing digital transformation and making strategic, informed decisions about AI adoption?

I think it’s vital to adopt the latest technologies, just as it would have been important for Kodak to see the shift coming in the photography industry. Businesses that fail to even explore digital transformation risk being left behind.

However, a word of caution: it’s easy to jump in too quickly and end up with the wrong AI solution — or the wrong systems entirely — for your business. So, I would advise approaching digital transformation with careful thought. Keep your eyes open, and treat each step as a deliberate, strategic business decision.

When you decide that you’re ready to adopt AI, it’s crucial to hold your suppliers to account. Ask the hard questions. Ask detailed questions. Make sure you have someone in-house, or bring in a consultant, who knows enough to help you interrogate the technology properly.

As we all know, one of the greatest wastes of money in digital transformation happens when the right questions aren’t asked up front. Getting it wrong can be incredibly costly, so take the time to get it right.

Photo by petr sidorov on Unsplash

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation ConferenceBlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

The post Kay Firth-Butterfield, formerly WEF: The future of AI, the metaverse and digital transformation appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/kay-firth-butterfield-formerly-wef-the-future-of-ai-the-metaverse-and-digital-transformation/feed/ 0
Tony Blair Institute AI copyright report sparks backlash https://www.artificialintelligence-news.com/news/tony-blair-institute-ai-copyright-report-sparks-backlash/ https://www.artificialintelligence-news.com/news/tony-blair-institute-ai-copyright-report-sparks-backlash/#respond Wed, 02 Apr 2025 11:04:11 +0000 https://www.artificialintelligence-news.com/?p=105140 The Tony Blair Institute (TBI) has released a report calling for the UK to lead in navigating the complex intersection of arts and AI. According to the report, titled ‘Rebooting Copyright: How the UK Can Be a Global Leader in the Arts and AI,’ the global race for cultural and technological leadership is still up […]

The post Tony Blair Institute AI copyright report sparks backlash appeared first on AI News.

]]>
The Tony Blair Institute (TBI) has released a report calling for the UK to lead in navigating the complex intersection of arts and AI.

According to the report, titled ‘Rebooting Copyright: How the UK Can Be a Global Leader in the Arts and AI,’ the global race for cultural and technological leadership is still up for grabs, and the UK has a golden opportunity to take the lead.

The report emphasises that countries that “embrace change and harness the power of artificial intelligence in creative ways will set the technical, aesthetic, and regulatory standards for others to follow.”

Highlighting that we are in the midst of another revolution in media and communication, the report notes that AI is disrupting how textual, visual, and auditive content is created, distributed, and experienced, much like the printing press, gramophone, and camera did before it.

“AI will usher in a new era of interactive and bespoke works, as well as a counter-revolution that celebrates everything that AI can never be,” the report states.

However, far from signalling the end of human creativity, the TBI suggests AI will open up “new ways of being original.”

The AI revolution’s impact isn’t limited to the creative industries; it’s being felt across all areas of society. Scientists are using AI to accelerate discoveries, healthcare providers are employing it to analyse X-ray images, and emergency services utilise it to locate houses damaged by earthquakes.

The report stresses that these cross-industry advancements are just the beginning, with future AI systems set to become increasingly capable, fuelled by advancements in computing power, data, model architectures, and access to talent.

The UK government has expressed its ambition to be a global leader in AI through its AI Opportunities Action Plan, announced by Prime Minister Keir Starmer on 13 January 2025. For its part, the TBI welcomes the UK government’s ambition, stating that “if properly designed and deployed, AI can make human lives healthier, safer, and more prosperous.”

However, the rapid spread of AI across sectors raises urgent policy questions, particularly concerning the data used for AI training. The application of UK copyright law to the training of AI models is currently contested, with the debate often framed as a “zero-sum game” between AI developers and rights holders. The TBI argues that this framing “misrepresents the nature of the challenge and the opportunity before us.”

The report emphasises that “bold policy solutions are needed to provide all parties with legal clarity and unlock investments that spur innovation, job creation, and economic growth.”

According to the TBI, AI presents opportunities for creators—noting its use in various fields from podcasts to filmmaking. The report draws parallels with past technological innovations – such as the printing press and the internet – which were initially met with resistance, but ultimately led to societal adaptation and human ingenuity prevailing.

The TBI proposes that the solution lies not in clinging to outdated copyright laws but in allowing them to “co-evolve with technological change” to remain effective in the age of AI.

The UK government has proposed a text and data mining exception with an opt-out option for rights holders. While the TBI views this as a good starting point for balancing stakeholder interests, it acknowledges the “significant implementation and enforcement challenges” that come with it, spanning legal, technical, and geopolitical dimensions.

In the report, the Tony Blair Institute for Global Change “assesses the merits of the UK government’s proposal and outlines a holistic policy framework to make it work in practice.”

The report includes recommendations and examines novel forms of art that will emerge from AI. It also delves into the disagreement between rights holders and developers on copyright, the wider implications of copyright policy, and the serious hurdles the UK’s text and data mining proposal faces.

Furthermore, the Tony Blair Institute explores the challenges of governing an opt-out policy, implementation problems with opt-outs, making opt-outs useful and accessible, and tackling the diffusion problem. AI summaries and the problems they present regarding identity are also addressed, along with defensive tools as a partial solution and solving licensing problems.

The report also seeks to clarify the standards on human creativity, address digital watermarking, and discuss the uncertainty around the impact of generative AI on the industry. It proposes establishing a Centre for AI and the Creative Industries and discusses the risk of judicial review, the benefits of a remuneration scheme, and the advantages of a targeted levy on ISPs to raise funding for the Centre.

However, the report has faced strong criticism. Ed Newton-Rex, CEO of Fairly Trained, raised several concerns on Bluesky. These concerns include:

  • The report repeats the “misleading claim” that existing UK copyright law is uncertain, which Newton-Rex asserts is not the case.
  • The suggestion that an opt-out scheme would give rights holders more control over how their works are used is misleading. Newton-Rex argues that licensing is currently required by law, so moving to an opt-out system would actually decrease control, as some rights holders will inevitably miss the opt-out.
  • The report likens machine learning (ML) training to human learning, a comparison that Newton-Rex finds shocking, given the vastly different scalability of the two.
  • The report’s claim that AI developers won’t make long-term profits from training on people’s work is questioned, with Newton-Rex pointing to the significant funding raised by companies like OpenAI.
  • Newton-Rex suggests the report uses strawman arguments, such as stating that generative AI may not replace all human paid activities.
  • A key criticism is that the report omits data showing how generative AI replaces demand for human creative labour.
  • Newton-Rex also criticises the report’s proposed solutions, specifically the suggestion to set up an academic centre, which he notes “no one has asked for.”
  • Furthermore, he highlights the proposal to tax every household in the UK to fund this academic centre, arguing that this would place the financial burden on consumers rather than the AI companies themselves, and the revenue wouldn’t even go to creators.

Adding to these criticisms, British novelist and author Jonathan Coe noted that “the five co-authors of this report on copyright, AI, and the arts are all from the science and technology sectors. Not one artist or creator among them.”

While the report from Tony Blair Institute for Global Change supports the government’s ambition to be an AI leader, it also raises critical policy questions—particularly around copyright law and AI training data.

(Photo by Jez Timms)

See also: Amazon Nova Act: A step towards smarter, web-native AI agents

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Tony Blair Institute AI copyright report sparks backlash appeared first on AI News.

]]>
https://www.artificialintelligence-news.com/news/tony-blair-institute-ai-copyright-report-sparks-backlash/feed/ 0