Remember Dibsic? Two new modeling of artificial intelligence says they are better

Artificial intelligence companies are used to measure themselves against the Openai leader. no more. Now that Deepseek appeared in China as candidates, it has become sudden.

On Monday, Dibsic turned the artificial intelligence industry on its head, Billions of dollars cause losses At Wall Street while asking questions about the efficiency of some American startups - and investment capital - in fact.

Now, two forces of artificial intelligence entered the fenugreek intelligence: Allen International Institute in Seattle and Albaba in China; Both claim that their models are equal with or better than Deepseek V3.

Amnesty International Institute, a US -based research organization, is known for release more modest Vision model named molmoToday, it revealed a new version of Tülu 3, which is a large free language model and an open source 405 billion.

"We are pleased to announce the launch of Tülu 3 405b-the first post-training recipes for the fully open training for the largest open-weight models," said Paul Allen's non-profit organization. Blog post. "With this version, we show the ability to expand and effectively after training applied on the teacher scale 405B."

For those who like to compare sizes, the latest LLM in Meta, Llama-3.3, He has 70 billion teachers, and its largest model so far is Call-3.1 405B- The same size Tülu 3.

The model was so great that he demanded unusual mathematical resources, which requires 32 knots with 256 graphics processing units that work in parallel to training.

The Allen Institute struck several roads on the road while building its model. The tremendous size of Tülu 3 means that the team must divide the work burden across hundreds of specialized computer chips, with 240 chips that deal with the training process while 16 other operations run in actual time.

Even with this huge computer force, the system was repeatedly shattered and requested to oversee the clock to keep it in operation.

Tülu 3 penetration focuses on new reinforcement learning with the RLVR, which has shown special strength in sports thinking tasks.

Each RLVR recurred took about 35 minutes, with a conclusion required 550 seconds, the transfer of weight 25 seconds, and 1500 seconds, with Amnesty International improves problem solving with each tour.

Photo: AI2

Reinforcing learning with RLVR is a training approach that looks like an advanced educational system.

Artificial intelligence has received specific tasks, such as solving mathematics problems, and obtained immediate comments on whether their answers are correct.

However, unlike traditional artificial intelligence training (such as the training that Openai uses to train ChatgPT), where human comments can be self, RLVR is only rewarded of artificial intelligence when he clearly produced the correct answers, similar to how exactly the mathematics teacher knows when it is Solve the right or wrong student.

This is why the model is very good in mathematics and logic problems, but not the best in other tasks such as creative writing, playing roles or realistic analysis.

The model is available in Allen Ai StadiumFree location with a similar user interface for ChatGPT and other Chatbots Amnesty International.

Our tests confirmed the large model.

It is very good in solving problems and applying logic. We have provided various random problems from a number of mathematical standards and science, and was able to take good answers, so that it is easier to understand when compared to the sample answers Those criteria presented.

However, it failed in other logical tasks related to the language that did not include mathematics, such as writing sentences that end with a specific word.

Also, Tülu 3 is not a multimedia. Instead, hold what he knew better - select the text. There is no imaginary images or tricks of a series of ideas here.

On the upper side, the interface is free to use, and requires simple login, either through the Allen AI stadium or by downloading weights to operate locally.

The model is available for download via EmbroideryWith alternatives from 8 billion of the parameters to the huge teacher release 405 billion.

The Chinese technology giant enters the battle

Meanwhile, China is not resting on the glories of Depsic.

Amid all the robe, Ali Baba decreased QWEN 2.5-MaxA huge language model trained on more than 20 trillion symbol.

The Chinese technology giant has released the model during the new lunar year, a few days after Deepseek R1 disrupted the market.

Standard tests have shown that QWEN 2.5-Max outperformed Deepseek V3 in many major areas, including coding, mathematics, logic and general knowledge, and was also evaluated using criteria such as Arena-Hard, LiveBench, LiveCodebench and GPQA-Diamond.

The model showed competitive results against industrial leaders such as GPT-4O and Claude 3.5-Sonne, according to the model card.

QWEN3.5 Max leads to artificial intelligence standards
Photo: Ali Baba

Alibaba made the model through its cloud platform with Fire compatible with the openAllow developers to merge them using familiar tools and methods.

The company's documents showed detailed examples of implementation, indicating a widespread payment of adoption.

But Alibaba's QWEN Chat's web portal is the best option for public users and it seems very impressive - for those who are well in creating an account there. The Chatbot interface is likely to be the most diverse available.

QWEN Chat Users are allowed to create text, symbol and images without a defect. It also supports web search functions, antiques, and even a very good video generator, all in the same user interface - for free.

It also has a unique function in which users can choose two different models for "fighting" against each other to provide the best response.

In general, the QWEN user interface is more varied than Allen Ai.

In text responses, QWEN2.5-MAX has proven to be better than Tülu 3 in creative writing and thinking tasks that included language analysis. For example, he was able to create phrases ending in a specific word.

Its video generator is a nice addition and it can be said that it is equal with offers like Kling or Luma Labs - the best than the Surah can offer.

Also, his photo generator provides realistic and fun images, which indicates a clear feature on the Dall-E 3 from Openai, but it is clear behind the upper models like Flux or Midjourney.

The triple version of Deepseek, QWEN2.5-MAX and Tülu 3 gave the world of open source intelligence in its most important batch for some time.

Deepseek had already turned into his heads by building his R1 thinking model using earlier QWEN Technology for distillationAmnesty International Open source may be proportional to billions of dollars in technology giants in a small part of the cost.

Now QWEN2.5-MAX has increased the bet. If Deepseek Playbook Followed - QWEN's FC - can pack the next thinking model for larger punch.

However, this can be a good opportunity for the Allen Institute. Openai is racing to launch its O3 thinking form, which some industry analysts estimated to cost users to cost users 1000 dollars for each query.

If this is the case, Tülu 3 arrival may be a great open-source alternative-especially for developers cautious about building on Chinese technology due to security concerns or organizational requirements.

Edit Josh Ketner and Sebastian Senkler

Smart in general Newsletter

Weekly journey narrated GEN, AI Tawylidi model.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *