“Apple releases Depth Pro, a breakthrough AI model for 3D vision”

Apple has released Depth Pro, a new AI model that rewrites the rules of 3D vision. It is a major step forward in how machines perceive depth and could transform industries ranging from augmented reality to autonomous vehicles.

The system can generate detailed 3D depth maps from a single 2D image in a fraction of a second, without relying on the camera data traditionally needed to make such predictions. The technology, described in a research paper titled “Depth Pro: Sharp Monocular Metric Depth in Less Than a Second,” is a major advance in monocular depth estimation, a process that uses just one image to infer depth. It could have broad applications in industries where real-time spatial awareness is important, from e-commerce to autonomous vehicles.

With its open-source release, Depth Pro could soon become an important part of industries ranging from autonomous driving to augmented reality, transforming how machines and people interact with 3D environments.

Source: https://venturebeat.com/ai/apple-releases-depth-pro-an-ai-model-that-rewrites-the-rules-of-3d-vision/

Apple’s AI research team has developed a new model that could significantly advance how machines perceive depth, potentially transforming industries ranging from augmented reality to autonomous vehicles.

The system, called Depth Pro, is able to generate detailed 3D depth maps from single 2D images in a fraction of a second—without relying on the camera data traditionally needed to make such predictions.

The technology, detailed in a research paper titled “Depth Pro: Sharp Monocular Metric Depth in Less Than a Second,” is a major leap forward in the field of monocular depth estimation, a process that uses just one image to infer depth.

This could have far-reaching applications across sectors where real-time spatial awareness is key. The model’s creators, led by Aleksei Bochkovskii and Vladlen Koltun, describe Depth Pro as one of the fastest and most accurate systems of its kind.

A comparison of depth maps from Apple’s Depth Pro, Marigold, Depth Anything v2, and Metric3D v2. Depth Pro excels in capturing fine details like fur and birdcage wires, producing sharp, high-resolution depth maps in just 0.3 seconds, outperforming other models in accuracy and detail. (credit: arxiv.org)

Depth estimation has long been a challenging task, traditionally requiring either multiple images or camera metadata such as focal length to gauge depth accurately.

But Depth Pro bypasses these requirements, producing high-resolution depth maps in just 0.3 seconds on a standard GPU. The model can create 2.25-megapixel maps with exceptional sharpness, capturing even minute details like hair and vegetation that are often overlooked by other methods.

“These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision transformer for dense prediction,” the researchers explain in their paper. This architecture allows the model to process both the overall context of an image and its finer details simultaneously—an enormous leap from slower, less precise models that came before it.

A comparison of depth maps from Apple’s Depth Pro, Depth Anything v2, Marigold, and Metric3D v2. Depth Pro excels in capturing fine details like the deer’s fur, windmill blades, and zebra’s stripes, delivering sharp, high-resolution depth maps in 0.3 seconds. (credit: arxiv.org)

Metric depth, zero-shot learning

What truly sets Depth Pro apart is its ability to estimate absolute depth in real-world units, not just the relative ordering of near and far, a capability called “metric depth.”

This means that the model can provide real-world measurements, which is essential for applications like augmented reality (AR), where virtual objects need to be placed in precise locations within physical spaces.
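To make the idea of real-world measurements concrete, here is a minimal sketch of how a metric depth value and a focal length in pixels can be turned into a 3D point with the standard pinhole camera model, and how two such points give a physical distance. It is not taken from the paper or the repository: the function names, the assumption that the principal point sits at the image center, and all sample values are hypothetical.

```python
import math


def backproject(u, v, depth_m, f_px, width, height):
    """Map pixel (u, v) with metric depth to a camera-space point in meters."""
    # Assumption: the principal point is the image center.
    cx, cy = width / 2.0, height / 2.0
    x = (u - cx) * depth_m / f_px
    y = (v - cy) * depth_m / f_px
    return (x, y, depth_m)


def distance_m(p, q):
    """Euclidean distance in meters between two camera-space points."""
    return math.dist(p, q)


# Hypothetical example: two pixels on the same image row, both 2 m away,
# in a 1536 x 1536 image with a focal length of 1000 pixels.
left = backproject(400, 768, 2.0, 1000.0, 1536, 1536)
right = backproject(1136, 768, 2.0, 1000.0, 1536, 1536)
print(round(distance_m(left, right), 3))  # 1.472
```

With only relative depth, the same two pixels could be any distance apart. The absolute scale is what lets an AR application decide whether a virtual sofa actually fits in the gap.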

And Depth Pro doesn’t require extensive training on domain-specific datasets to make accurate predictions—a feature known as “zero-shot learning.” This makes the model highly versatile. It can be applied to a wide range of images, without the need for the camera-specific data usually required in depth estimation models.

“Depth Pro produces metric depth maps with absolute scale on arbitrary images ‘in the wild’ without requiring metadata such as camera intrinsics,” the authors explain. This flexibility opens up a world of possibilities, from enhancing AR experiences to improving autonomous vehicles’ ability to detect and navigate obstacles.

For those curious to experience Depth Pro firsthand, a live demo is available on the Hugging Face platform.

A comparison of depth estimation models across multiple datasets. Apple’s Depth Pro ranks highest overall with an average rank of 2.5, outperforming models like Depth Anything v2 and Metric3D in accuracy across diverse scenarios. (credit: arxiv.org)

Real-world applications: From e-commerce to autonomous vehicles

This versatility has significant implications for various industries. In e-commerce, for example, Depth Pro could allow consumers to see how furniture fits in their home by simply pointing their phone’s camera at the room. In the automotive industry, the ability to generate real-time, high-resolution depth maps from a single camera could improve how self-driving cars perceive their environment, boosting navigation and safety.

“The method should ideally produce metric depth maps in this zero-shot regime to accurately reproduce object shapes, scene layouts, and absolute scales,” the researchers write, emphasizing the model’s potential to reduce the time and cost associated with training more conventional AI models.

Tackling the challenges of depth estimation

One of the toughest challenges in depth estimation is handling what are known as “flying pixels”—pixels that appear to float in mid-air due to errors in depth mapping. Depth Pro tackles this issue head-on, making it particularly effective for applications like 3D reconstruction and virtual environments, where accuracy is paramount.
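As a rough illustration of what a flying pixel looks like in the data, the sketch below flags samples in a single row of a depth map that sit between a near surface and a far surface, far from both. This is a simple heuristic written for this article, not the method or metric used by the researchers. The function name and the 0.5 m threshold are assumptions.

```python
def flying_pixel_mask(depth_row, min_gap_m=0.5):
    """Flag samples that float between two surfaces in one depth-map row."""
    mask = [False] * len(depth_row)
    for i in range(1, len(depth_row) - 1):
        left, mid, right = depth_row[i - 1], depth_row[i], depth_row[i + 1]
        near, far = min(left, right), max(left, right)
        # A flying pixel lies strictly between its two neighbors...
        strictly_between = near < mid < far
        # ...and is separated from both by more than the threshold.
        isolated = (mid - near) > min_gap_m and (far - mid) > min_gap_m
        mask[i] = strictly_between and isolated
    return mask


# Hypothetical row: foreground at 1 m, background at 5 m, and one sample
# at 3 m that belongs to neither surface.
row = [1.0, 1.0, 1.0, 3.0, 5.0, 5.0, 5.0]
print(flying_pixel_mask(row))
# [False, False, False, True, False, False, False]
```

A sharp depth edge jumps directly from 1 m to 5 m and produces no flagged samples. The heuristic only catches isolated single-sample artifacts, so a blurred edge spread over several pixels would need a wider window.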

Additionally, Depth Pro excels in boundary tracing, outperforming previous models in sharply delineating objects and their edges. The researchers claim it surpasses other systems “by a multiplicative factor in boundary accuracy,” which is key for applications that require precise object segmentation, such as image matting and medical imaging.

Open-source and ready to scale

In a move that could accelerate its adoption, Apple has made Depth Pro open-source. The code, along with pre-trained model weights, is available on GitHub, allowing developers and researchers to experiment with and further refine the technology. The repository includes everything from the model’s architecture to pretrained checkpoints, making it easy for others to build on Apple’s work.

The research team is also encouraging further exploration of Depth Pro’s potential in fields like robotics, manufacturing, and healthcare. “We release code and weights at https://github.com/apple/ml-depth-pro,” the authors write, signaling this as just the beginning for the model.
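For readers who want to try the release, a minimal usage sketch might look like the following. The calls (create_model_and_transforms, load_rgb, infer) and the result keys follow the repository's README as of this writing; treat them as assumptions and check the repository before relying on them. The wrapper name estimate_depth is hypothetical.

```python
def estimate_depth(image_path):
    """Return (depth map in meters, focal length in pixels) for one image."""
    # Imported lazily: depth_pro is the package from the apple/ml-depth-pro
    # repository and must be installed, with weights downloaded, separately.
    import depth_pro

    # Load the pretrained model and its preprocessing transform.
    model, transform = depth_pro.create_model_and_transforms()
    model.eval()

    # f_px is the focal length read from image metadata, or None if absent,
    # in which case the model estimates it.
    image, _, f_px = depth_pro.load_rgb(image_path)
    prediction = model.infer(transform(image), f_px=f_px)
    return prediction["depth"], prediction["focallength_px"]
```

Calling estimate_depth("photo.jpg") on a machine with the package and weights installed should return a depth map and a focal length estimate. Running on a GPU is advisable if the sub-second timings reported in the article matter.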

What’s next for AI depth perception

As artificial intelligence continues to push the boundaries of what’s possible, Depth Pro sets a new standard in speed and accuracy for monocular depth estimation. Its ability to generate high-quality, real-time depth maps from a single image could have wide-ranging effects across industries that rely on spatial awareness.

In a world where AI is increasingly central to decision-making and product development, Depth Pro exemplifies how cutting-edge research can translate into practical, real-world solutions. Whether it’s improving how machines perceive their surroundings or enhancing consumer experiences, the potential uses for Depth Pro are broad and varied.

As the researchers conclude, “Depth Pro dramatically outperforms all prior work in sharp delineation of object boundaries, including fine structures such as hair, fur, and vegetation.” With its open-source release, Depth Pro could soon become integral to industries ranging from autonomous driving to augmented reality—transforming how machines and people interact with 3D environments.
