AI Cost War Is Making Devices Pricier

● AI Cost War Turns Devices Into Pricier Luxury

From GPT‑5.6 to video AI, voice agents, and on-device AI, the global AI market has now moved beyond a “model race” and into a “platform war.”

The core point this week is exactly this.

What the limited release of GPT‑5.6 means for the direction of U.S. AI regulation,

the real-time multimodal evolution shown by Bidi 1 and Wan Streamer,

the agent collaboration structure being reshaped by Claude Tag and Sakana AI,

the automation of video and game production triggered by Seedance 2.5, Krea 2, and Unity AI,

and the trend in which AI infrastructure costs are being passed on to consumer prices, as seen in Apple’s price increases and transition to M7,

are all covered here in one place.

1. GPT‑5.6 was released, but why was it a limited preview instead of a full public launch?

GPT‑5.6 was first opened only to select companies approved by the U.S. government.

In other words, the issue is no longer just that the model is powerful; the more important question has become “who gets to use it first.”

This is not simply news about a model launch. It should be seen as a signal that the U.S. AI industry has effectively entered a stage connected to national security.

OpenAI released GPT‑5.6 in three versions.

They are Sol, the highest-performance version; Tera, a balanced model for everyday work; and Luna, a fast and affordable option.

This structure shows that the AI market will be reorganized not around “one massive model,” but around “a portfolio by use case.”

The pricing is also interesting.

For Sol, the pricing is around $5 for input, $0.50 for cached input, and $30 for output.

It is expensive, but not completely unreasonable in the top-tier model market.

In other words, ultra-high-performance models still carry a premium, but for enterprises, the price has come down to a level where adoption is worth considering.
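
To make the pricing concrete, here is a rough sketch of how the cost of a single request could be estimated. It assumes the quoted figures are U.S. dollars per one million tokens, which the source does not state. The function name, the token counts, and that unit are illustrative assumptions, not OpenAI's published billing formula.

```python
# Rough cost estimate for one request at the Sol prices quoted above.
# ASSUMPTION: prices are USD per 1,000,000 tokens (the source gives no unit).
PRICE_PER_MILLION = {"input": 5.00, "cached_input": 0.50, "output": 30.00}


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request.

    cached_tokens is the part of input_tokens billed at the cached rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


# Hypothetical request: 20,000 input tokens (half of them cached), 2,000 output tokens.
cost = estimate_cost(input_tokens=20_000, cached_tokens=10_000, output_tokens=2_000)
print(f"${cost:.3f}")
```

Under these assumptions the request costs about $0.115, and the 2,000 output tokens account for just over half of it, which is why output-heavy workloads feel the premium most.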

In benchmarks, it also showed strong performance, scoring 91.9 on Terminal Bench at the Ultra level.

However, in areas such as vulnerability research or offensive use cases, there are signs that it was deliberately tuned conservatively.

This part is important because, going forward, high-performance AI will be evaluated not only by whether it is “smart,” but by whether it is “safely smart.”

2. What it means that the U.S. government has begun directly reviewing access rights to AI models

As reported by The Washington Post, the U.S. government is expanding policies that directly review which companies can access the latest AI technologies.

This means AI is no longer merely a matter of competition among private platforms.

Semiconductors, cloud infrastructure, model APIs, defense, and cybersecurity are being tied together into one large axis.

One point to watch here is that Google Gemini appears to be facing relatively little regulatory scrutiny.

By contrast, OpenAI-related systems are becoming much more sensitive review targets.

This gap will be an enormous variable in the future structure of global AI competition.

That is because the stronger regulation becomes, the more efforts to strengthen open-weight models and sovereign AI infrastructure are likely to accelerate.

The acceleration of sovereign AI development in Canada is also part of the same context.

The continued growth of China’s open-weight ecosystem is also connected to this trend.

Ultimately, the AI hegemony race is shifting from “who can build the smartest model” to “who can distribute more broadly with fewer restrictions.”

3. Bidi 1: the real next stage of voice AI is “interruption during conversation”

Bidi 1 is a next-generation bidirectional voice model, and its core point is that it can interrupt while someone is speaking.

This is a much bigger change than it may seem.

Until now, voice AI often had to wait until a person finished speaking before responding.

But Bidi 1 can naturally intervene in the middle of a conversation, continue the exchange, and quickly shift context.

In other words, it has become closer to a “conversational partner” than a “command-based voice assistant.”

Its emotional expression has also noticeably improved.

Sad tones, comforting tones, and even crying-like expressions are implemented more naturally.

This technology can be directly applied to counseling, education, game NPCs, live commerce, virtual humans, and customer support.

It will be especially noticeable in psychological counseling assistants and immersive game characters.

Going forward, the competitiveness of voice AI will come less from “accurate answers” and more from “how human-like it is in responding to the flow of someone’s speech.”

4. OpenAI’s Jalapeño chip: the era in which AI helps build AI hardware faster

OpenAI unveiled its own inference chip, Jalapeño.

It reportedly took only nine months from the start of design to tape-out.

That speed is highly unusual.

The core point is that OpenAI pushed forward software-hardware co-development with its engineering team, and OpenAI models directly helped with some design optimization.

This is a truly important point.

AI is entering a stage where it accelerates not only software creation, but also semiconductor design and manufacturing optimization.

OpenAI claimed inference costs roughly 50% lower than on AI GPUs, along with better performance per watt.

The goal is clear.

As AI usage explodes, profitability can only be secured if inference costs are reduced internally.

Going forward, the real competition among big tech companies is likely to shift from model performance to “who can reduce inference costs faster.”

5. Samsung Electronics has also begun fully adopting the latest AI as an internal productivity tool

Samsung Electronics has begun providing ChatGPT and Codex to all employees.

Until now, the prevailing approach had been to use only proprietary models running on internal GPUs because of security concerns.

But now, the company appears to have concluded that it cannot stay productive without using the latest AI.

Since this is available to Korean DX employees as well as global DX employees, it carries strong symbolic significance among major Korean corporations.

This is not just news of adoption.

It is a case that shows how much enterprise development, planning, documentation, and analysis work will be reorganized around AI going forward.

In other words, Korean companies have now moved from asking “should we try using AI?” to worrying “will we fall behind if we do not use AI?”

6. Claude Tag: Slack is becoming an operating system for workplace AI

Claude Tag is a feature that lets users invite Claude into Slack like a human teammate and assign it work.

This is huge.

That is because AI is no longer separate from work; it has entered the collaboration spaces where actual work takes place.

Users ask questions in Slack, Claude understands the context, and it carries out tasks.

At Anthropic, 65% of code is reportedly written using an internal version of Claude Tag.

This means AI collaboration has moved beyond the experimental stage and into the operational stage.

As Andrej Karpathy has said, this is a change on the level of redesigning the entire work structure of an organization.

Going forward, the competitiveness of collaboration tools will not come from chat features, but from how naturally they can attach powerful agents.

7. Sakana AI’s multimodel orchestration: combinations are stronger than a single model

Sakana AI is a representative case that challenges single-model-centered thinking.

Sakana Fugu works like an orchestration system that combines multiple models into one coordinated structure.

The core point is not “which model is the smartest,” but “which models should be combined for which problem.”

The method introduced in the paper is also interesting.

It is not simple routing. The system looks at a problem and generates a workflow, deciding how many agents to call, who will take which role, and who will make the final judgment.

It then continues to search for the optimal structure through reinforcement learning.

For example, GPT-style models may handle math, planning, and algorithms; Claude may handle debugging and security; and Gemini may handle scientific knowledge and biology.
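
The workflow-generation idea can be sketched as follows. This is a minimal illustration, not Sakana AI's implementation: the `Workflow` structure, the keyword-based `plan_workflow` planner, the role table, and the stub `call_model` function are all hypothetical, and the reinforcement-learning search over structures is omitted.

```python
from dataclasses import dataclass

# ASSUMPTION: illustrative role table based on the example split in the text;
# the model labels are placeholders, not real API identifiers.
SPECIALISTS = {
    "math": "gpt-style",
    "planning": "gpt-style",
    "debugging": "claude",
    "security": "claude",
    "biology": "gemini",
}


@dataclass
class Workflow:
    steps: list[tuple[str, str]]  # (role, model) pairs, run in order
    judge: str                    # model that makes the final judgment


def plan_workflow(problem: str, judge: str = "gpt-style") -> Workflow:
    """Hypothetical planner: pick one agent per role mentioned in the problem.

    A real system would generate this plan with a model and refine it with
    reinforcement learning; keyword matching is only a stand-in.
    """
    text = problem.lower()
    steps = [(role, model) for role, model in SPECIALISTS.items() if role in text]
    if not steps:  # fall back to a single generalist agent
        steps = [("general", judge)]
    return Workflow(steps=steps, judge=judge)


def call_model(model: str, role: str, problem: str) -> str:
    """Stub standing in for a real model call."""
    return f"[{model} as {role}] draft answer"


def run(problem: str) -> str:
    workflow = plan_workflow(problem)
    drafts = [call_model(model, role, problem) for role, model in workflow.steps]
    # The judge is called last, after every specialist has produced a draft.
    verdict = call_model(workflow.judge, "judge", problem)
    return f"{verdict} from {len(drafts)} draft(s)"


print(run("Fix the debugging and security issues in this planning service"))
```

The point of the sketch is the shape of the decision: the number of agents, their roles, and the final judge are chosen per problem instead of being fixed in advance.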

This structure provides a very important hint for how companies will adopt AI going forward.

In other words, multimodel operational capability will become more competitive than going all in on a single model.

8. Unite Seoul 2026: game development now centers on revealing AI pipelines

Unite Seoul 2026, Unity’s largest event, will be held in Korea.

The main significance of this event is that it will show, in practical terms, “how to make games with AI.”

Unity AI already understands game object composition, components, and project context in real time while working.

A demo was also revealed in which a museum map was created from a single photo and then converted into 3D.

In other words, the barrier to entry for level design and map creation has been significantly lowered.

This event is especially notable because AI development pipelines from real game companies such as Supercent and CyberAgent will be revealed.

This means attendees will be able to see actual production structures from the field, not just the general introductions commonly seen on YouTube.

It will also include hands-on training, monetization sessions, migration tips, and new demo releases.

For game developers in the AI era, this event is effectively a compressed version of the industry trend.

9. Google AI Studio and Gemini: development and learning UX are evolving together

Design Variations in Google AI Studio is a feature that lets users transform vibe-coded outputs into various design styles.

AI now helps rapidly experiment not only with feature implementation, but also with visual completeness.

Another notable feature is Study Notebook.

When lecture materials, notes, and class files are added, it generates study questions, organizes exam preparation, and analyzes strengths and weaknesses.

This is useful not only for students, but also for office workers.

It can be highly useful for adding meeting materials and receiving summaries, or for adding training content and creating study checklists.

10. Apple’s price increases and transition to M7: the AI memory war is ultimately reaching consumer prices

Apple has raised MacBook and iPad prices by nearly 15%.

This appears to be the result of rising RAM prices, higher component costs, and efforts to secure margins on premium product lines.

More importantly, Apple’s strategy is to skip the high-end M6 and move toward M7.

This means Apple now intends to jump straight to AI-centered hardware rather than prolonging an in-between generation.

M7 is expected to greatly strengthen on-device AI and significantly increase memory bandwidth.

Ultimately, this means that as AI features increase, consumer electronics prices are also likely to rise.

AI is a software innovation, but consumers feel its cost first in hardware prices.

11. Why video AI is evolving the fastest right now: Seedance 2.5, Dreamina, and Krea 2

Video generation is currently the hottest battlefield.

ByteDance has previewed Seedance 2.5.

Generation length will double to as long as 30 seconds, and support for reference images, audio, and video will expand to as many as 50 items.

This means advertising videos, short-form content, brand mood videos, and cinematic test production will become much easier.

Dreamina’s Seedance 2.0 Mini is moving in a direction that is faster and cheaper while maintaining quality.

In actual production, these types of models are used more often.

Rather than long and impressive demos, models that can be iterated quickly generate real revenue.

The model weights for Krea 2 have also been released.

Under affordable licensing terms, it shows fairly high quality in both animation and photorealistic styles.

Going forward, the video production market will be reorganized not around “who makes better content,” but around “who can iterate faster, cheaper, and more often.”

12. Wan Streamer: AI now holds real-time streaming conversations like a video call

Wan Streamer sets a new bar for real-time AI streaming interaction.

It processes text, audio, and video with a single transformer.

Its latency is around 200 milliseconds, making it feel almost like a real-time conversation.

Facial movements and conversational responses are also natural.

This could make strong inroads into virtual YouTuber (VTuber) streaming, live sales, online counseling, and real-time education.

It will be especially competitive in services where “human-like response speed” is important.

13. Unitree R1 and the robotics market: humanoids are now moving toward low-cost mass adoption

Unitree unveiled the $4,900 R1 robot.

This is a remarkably low price compared with existing robots on the market.

The important point is not just its flashy movements.

Humanoids are now gradually moving down from “expensive exhibition equipment” into product categories that can be used in practice.

When AI becomes the brain of robots and is combined with low-cost hardware, the pace of field automation can accelerate further.

14. Security and bug bounties: AI has become too powerful in both offense and defense

In one striking case, someone used AI to automatically find vulnerabilities in Google APIs and earned $500,000, or about 700 million won, in bug bounties over three months.

It shows how powerful AI has become at detecting security vulnerabilities.

Cases such as leaks of private YouTube videos were also mentioned.

This is a warning for both creators and companies.

Going forward, security must move beyond the era of relying only on human eyes and shift to a structure where AI is used to block AI-powered attacks.

15. The rise of open source and agentic coding: the trend of Onis 1.0 and GLM 5.2

Onis 1.0 has appeared as an open-source LLM family specialized for agentic coding.

It showed fairly strong benchmark results and appeared capable of competing with existing top-tier models.

GLM 5.2 is also strengthening its direction toward running on smaller devices through official quantization collaboration with NVIDIA.

This trend is clear.

We are moving from an era where only the cloud was strong to an era where local, edge, and open-source systems are becoming strong together.

16. The most important point that other news often does not cover

The real core point of this week’s news is not the performance of individual models.

The most important point is that “the value of AI has moved from the model itself to the operating structure.”

In other words, the winners going forward will not be single chatbots, but companies that combine models, chips, Slack, game engines, video tools, robots, and security tools to automate real workflows.

The second core point is the strengthening of U.S. government review.

AI is no longer just technology; it is geopolitics.

The third core point is the cost structure.

Inference chips, memory, GPU prices, and hardware prices are ultimately passed on to users.

The fourth core point is the multimodel strategy.

The era of believing in one model like a god is over, and the era of combining multiple models depending on the situation has arrived.

Those who understand this trend first will have a much greater advantage in practice, investment, and startups.

< Summary >

GPT‑5.6 has entered a limited release stage where “who gets to use it first” matters more than performance.

Bidi 1 and Wan Streamer show that voice and video AI have evolved to the level of real-time conversation.

Claude Tag and Sakana AI signal that collaborative agent structures have become more important than a single model.

Seedance 2.5, Krea 2, and Unity AI are making content production automation faster and cheaper.

Apple’s price increases and transition to M7 show the reality that AI costs are being reflected in hardware prices.

Ultimately, the core point of this week’s AI trend is not a “model race,” but an “operating system race.”

*Source: 조코딩 JoCoding