Google vs OpenAI: The AI “war” is on

Things are heating up in the field of artificial intelligence, as Google and OpenAI compete over who will ultimately win the “war” of new technologies. Last week, the two “giants” began a barrage of important announcements about upcoming changes to computers and mobile phones.

According to GZERO, both Google and OpenAI held major events last week focusing on their achievements and upcoming plans in the field of artificial intelligence.

At its developer conference, Google essentially announced that it plans to integrate AI into all of its products, even the search engine that bears its name. If you’ve Googled anything recently, you may have noticed that Gemini, Google’s chatbot, has started popping up and suggesting answers to your questions. Google also announced Veo, an AI app that creates videos, similar to OpenAI’s Sora, and Project Astra, an AI platform that describes, among other things, what a mobile phone camera captures.

OpenAI, for its part, introduced GPT-4o, the next evolution of ChatGPT. The new language model will act more like a voice assistant than a chatbot and may soon make products like “Alexa” or “Siri” obsolete.

The future of artificial intelligence, according to the company, is multimodal. This means that models will be able to quickly and easily process text, images, video and audio and provide responses to users.

Most importantly, this evolution of ChatGPT (on smartphones and desktops) will be free. Millions of people who do not pay for ChatGPT’s premium service will now have access to its cutting-edge model.

The “battle” for multimodal and emotional AI

As early as 2023, the tech “giants” were explaining to us how important large language models (LLMs) are in artificial intelligence: they can summarize documents, emails, and even poems.

With the new chatbots that Google and OpenAI have been developing this year, it seems we are heading into a new era of AI “warfare.” The big “prizes” in this battle will now be mastery of so-called Multimodal Artificial Intelligence (MAi) and Emotional Artificial Intelligence (EQ).

In terms of MAi, leading AI models can understand and analyze not only text but also sound, images and computer code, and generate responses in the same media.

In a simple example, OpenAI’s ChatGPT or Google’s Gemini could take in a visual image (perhaps via a smartphone camera) and verbally describe its content. “Multimodality radically expands the kind of questions we can ask and the answers we can get as feedback,” Google CEO Sundar Pichai said at the company’s I/O event.

On Monday (May 13), OpenAI unveiled an upgraded version of ChatGPT, powered by the new GPT-4o model (the ‘o’ stands for ‘omni’). What is most noticeable about the new ChatGPT is how “human” the interactions with the chatbot seem. This is mainly due to the sound and behavior of ChatGPT’s speaking voice. Its tone is strikingly human: it sounds natural and expressive, tells jokes, and stops immediately when it hears the user start talking. Audio represents another modality, just like the text or image modalities that the model understands.

ChatGPT adds another modality: emotional intelligence, or “EQ”. It appears to be able to detect emotion in a user’s voice (in a demo on Monday, the chatbot detected tension in an OpenAI researcher’s voice) and then inflect its responses with the appropriate emotion (for the researcher, empathy).

Meanwhile, Google will launch a similar voice interaction program called “Gemini Live” later this year.

Importantly, AI models have developed the ability to “reason” about these multimodal inputs. For example, Google showed at I/O how its Gemini chatbot can help a user plan an upcoming trip. It starts by extracting trip logistics (flights, hotels, etc.) from reservations emailed to the user’s Gmail account. Then, after gathering some information about the user’s interests, it decides which activities best fit the time available, given their location (based on Google Maps data) relative to the user’s hotel.

In their respective demos, ChatGPT and Gemini were shown math problems written on a whiteboard and asked for help solving them. Both companies also showed their chatbots reading computer code from a screen and analyzing it. In fact, computer code may be the “key” to understanding how these AI models are now gaining the ability to reason and make judgements.
