NVIDIA Unveils New AI Models for Robotics, Driving, and Virtual Agents at CVPR 2026

Three domains of physical AI

The models cover tasks that have long been difficult for robots and self-driving systems. One model focuses on grasping: the ability of a robot arm to pick up unfamiliar objects without crushing or dropping them. Another is built for autonomous driving, handling perception and decision-making in traffic. The third targets virtual agents, which companies train in simulated worlds before deploying them in real settings.

NVIDIA didn't release detailed performance benchmarks, but the announcement signals that it sees these three areas as the main bottlenecks for physical AI. The company has been investing heavily in robotics chips, simulation platforms like Isaac Sim, and in-car compute systems. These models tie those hardware efforts to a software layer.

Why scaling matters

Training a robot to pick up a water bottle is one thing. Teaching it to pick up any bottle, regardless of shape, lighting, or angle, at the speed a warehouse needs is another. The same goes for a self-driving car that has to handle a snowy night in Detroit or a chaotic intersection in Mumbai. NVIDIA's pitch is that its new models can scale across those variations without retraining from scratch.

The virtual agent model is aimed at companies building digital twins or training AI assistants. Instead of scripting every interaction, the model lets the agent learn by doing inside a simulated environment. That approach has become popular in logistics and gaming, but NVIDIA wants to push it into manufacturing and healthcare.

A conference focused on vision

CVPR (the Conference on Computer Vision and Pattern Recognition) is the biggest annual gathering for computer vision researchers. It's a natural venue for NVIDIA to present work on perception and control. The company has been a regular at the event, often using it to debut hardware or open-source tools. This year, the emphasis was on models that bridge the gap between seeing and doing.
