Inside Amazon's new 'Just Walk Out': AI transformers meet edge computing

On the first floor of an industrial building, we visit Amazon's secretive lab to explore the latest Just Walk Out (JWO) technology.

Now in over 170 locations globally, JWO lets customers shop and leave without stopping at a cashier, making shopping faster.

We are here to see Amazon's new AI-based system, which uses multi-modal models and transformer-based machine learning to analyze data from the store's various sensors. Just as GPT models generate text, these models generate receipts. The upgrade improves accuracy and makes deployment easier for retailers.

Our guide is Jon Jenkins (JJ), VP of JWO at Amazon. He takes us past groups of Amazon employees, through security, and into a replica of a local bodega filled with typical store items.

Apart from the electronic gates and Amazon's specialized cameras overhead, the lab looks like an ordinary store, minus the cashier.

Photo: No photos in the lab, but here's a real JWO store across the square.

How JWO works

JWO (pronounced "jay-woh") uses computer vision, sensor fusion, and machine learning to track items in the store. Setup starts with creating a 3D map of the store using an iPhone or iPad.

The store is divided into product areas called "polygons." Custom cameras and weight sensors are installed in these areas.
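To make the idea of a polygon concrete, here is a minimal sketch of how a product area might be represented and queried. It is an illustration only, not Amazon's implementation: the `ProductPolygon` class, the `locate` helper, the 2D floor-plan coordinates, and the shelf names are all assumptions introduced for this example. The containment check is the standard ray-casting test.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) on a 2D floor plan; units are an assumption


@dataclass
class ProductPolygon:
    """A named product area, stored as an ordered list of corner points."""
    name: str
    vertices: List[Point]

    def contains(self, p: Point) -> bool:
        """Ray-casting test: a point is inside if a ray cast to the right
        crosses the polygon boundary an odd number of times."""
        x, y = p
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            # Only edges that straddle the horizontal line through p can cross the ray.
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


def locate(p: Point, polygons: List[ProductPolygon]) -> Optional[str]:
    """Return the name of the first polygon containing p, or None."""
    for poly in polygons:
        if poly.contains(p):
            return poly.name
    return None


# Hypothetical shelves for illustration.
shelves = [
    ProductPolygon("chips", [(0, 0), (2, 0), (2, 1), (0, 1)]),
    ProductPolygon("soda", [(3, 0), (5, 0), (5, 1), (3, 1)]),
]
print(locate((1.0, 0.5), shelves))  # chips
print(locate((2.5, 0.5), shelves))  # None (the aisle between the two shelves)
```

A real deployment would work from the 3D map rather than a flat floor plan, but the question being answered, "which product area is this point in?", has the same shape.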

Photo: In a real JWO store, cameras and sensors are above the shopping areas.

JWO tracks head and hand movements to detect when a shopper interacts with a polygon. By combining data from the cameras and weight sensors, the models predict which items each shopper keeps.
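The value of combining cameras with weight sensors can be shown with a toy rule-based sketch. This is not how JWO's learned models work; it only illustrates the intuition that a shelf's weight change can confirm or overrule what the camera thinks was taken. The function `fuse_pick`, the score and weight dictionaries, the item names, and the 5-gram tolerance are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def fuse_pick(
    camera_scores: Dict[str, float],
    weight_delta_g: float,
    unit_weights_g: Dict[str, float],
    tolerance_g: float = 5.0,
) -> Optional[Tuple[str, int]]:
    """Return (item, quantity) for the camera candidate whose unit weight
    explains the shelf weight drop, or None if no candidate fits."""
    removed_g = -weight_delta_g  # a pick shows up as a negative shelf delta
    best = None
    for item, score in camera_scores.items():
        unit = unit_weights_g[item]
        qty = round(removed_g / unit)
        if qty < 1:
            continue  # weight change too small to be this item
        residual = abs(removed_g - qty * unit)
        if residual > tolerance_g:
            continue  # weight change is not a whole multiple of this item
        # Prefer the higher camera score, then the smaller weight residual.
        candidate = (score, -residual, item, qty)
        if best is None or candidate > best:
            best = candidate
    if best is None:
        return None
    return best[2], best[3]


# Hypothetical catalog weights (grams) and camera confidences.
weights = {"cola": 370.0, "water": 520.0}
scores = {"cola": 0.6, "water": 0.4}

print(fuse_pick(scores, -520.0, weights))  # ('water', 1)
print(fuse_pick(scores, -740.0, weights))  # ('cola', 2)
```

In the first call the camera leans toward cola, but the 520 g drop only matches a bottle of water, so the weight sensor settles the question.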

JJ explains, "We used to run multiple models in sequence, which was slower and costlier."

Now, a single transformer model processes all the sensor data and generates the receipt in one pass, much as a GPT model generates text.

Image: JWO Architecture courtesy Amazon.

The improved model handles complex scenarios and minimizes receipt delays, making deployment easier for retailers.

The AI adapts to store layout changes and identifies items even when misplaced, offering a reliable shopping experience.

JWO is powered by edge computing

Amazon uses edge computing in JWO. All model inference happens on local hardware, managed by Amazon and included in the service cost.

"We built edge devices for on-site reasoning, which is faster and needs less bandwidth," said JJ.

The new edge computing hardware features a rail-mounted enclosure with a large air intake. Amazon hasn't revealed its exact components, but we speculate that it includes Amazon's custom GPUs.

Edge computing processes and fuses data from multiple sensors in real time, which is essential for JWO's efficiency.

Scaling up with RFID

In another lab, we see JWO integrated with RFID technology. Here, items have unique RFID tags, simplifying the setup.

This lab resembles a retail clothier, with racks of tagged apparel. The AI architecture remains the same, but it runs without the multiple cameras and weight sensors.
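A short sketch shows why uniquely tagged items simplify the problem: once every item identifies itself, building a receipt reduces to de-duplicating tag reads and looking up prices. This is a simplified illustration rather than Amazon's system, which the article says keeps the same AI architecture. The function `receipt_from_tag_reads`, the tag IDs, the SKUs, and the prices are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def receipt_from_tag_reads(
    tag_reads: Iterable[str],
    tag_to_sku: Dict[str, str],
    prices_cents: Dict[str, int],
) -> Tuple[List[Tuple[str, int, int]], int]:
    """Turn raw RFID reads into (sku, quantity, line_total_cents) lines
    plus a grand total in cents."""
    # A tag may be reported more than once as it passes the reader,
    # so count each unique tag only once.
    unique_tags = set(tag_reads)
    # Ignore tags that are not in this store's catalog.
    counts = Counter(tag_to_sku[t] for t in unique_tags if t in tag_to_sku)
    lines = [
        (sku, qty, qty * prices_cents[sku])
        for sku, qty in sorted(counts.items())
    ]
    total = sum(line_total for _, _, line_total in lines)
    return lines, total


# Hypothetical tags and prices.
tag_to_sku = {"T1": "tee", "T2": "tee", "T3": "cap"}
prices_cents = {"tee": 1500, "cap": 1200}
reads = ["T1", "T2", "T1", "T3", "T9"]  # T1 seen twice, T9 unknown

lines, total = receipt_from_tag_reads(reads, tag_to_sku, prices_cents)
print(lines)  # [('cap', 1, 1200), ('tee', 2, 3000)]
print(total)  # 4200
```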

RFID integration reduces infrastructure needs, making JWO suitable for temporary retail settings like festivals.

Building JWO

JWO was announced publicly in 2018, and its R&D likely began earlier. JJ didn't disclose team size or investment costs.

However, LinkedIn suggests the team has 250 to 1,000 members. Based on median compensation, the R&D cost could be between $250M and $800M.
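An estimate of this kind is a back-of-envelope product of headcount, cost per person, and time. The sketch below shows that arithmetic. The article does not state the compensation figure or the number of years behind its range, so the $200,000 per person per year and the 4- and 5-year spans are placeholder assumptions, chosen only to show which combinations land on the quoted figures.

```python
def rd_cost_usd(headcount: int, annual_cost_per_person_usd: float, years: float) -> float:
    """Rough R&D cost: people x yearly cost per person x years."""
    return headcount * annual_cost_per_person_usd * years


# Placeholder inputs, not figures from Amazon or the article.
low = rd_cost_usd(250, 200_000, 5)     # small team, longer span
high = rd_cost_usd(1000, 200_000, 4)   # large team, shorter span

print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M")  # $250M to $800M
```

Changing any one input shifts the result proportionally, which is why estimates built this way span such a wide range.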

This highlights the significant investment needed to develop a similar system.

The build-vs-buy dilemma in AI

Developing an AI system like JWO is costly and risky. Many enterprises opt for pre-integrated systems to avoid rapid obsolescence.

Hyperscale providers like Amazon dominate AI due to high complexity and costs. Retailers aim for efficiency and ROI, likely choosing ready-made solutions like JWO.

Customization through services like Amazon Bedrock could expand JWO's capabilities.

Toward widespread adoption of AI

Advances in JWO's AI models reflect the impact of transformers across AI. These models excel at processing multi-modal data in real time.

Amazon's strategy leverages its strengths, offering JWO as a service through AWS, solving retail pain points and expanding market dominance.

RFID integration requires minimal infrastructure and opens the door to mass adoption.

As AI and edge computing evolve, JWO highlights how hyperscalers shape retail's future. Easily deployable AI services can drive broader adoption in everyday businesses.