Why pick-by-vision will disrupt voice picking

Over the past decades, the logistics industry has shown that it is willing to adopt groundbreaking technologies rapidly. With many new technologies around the corner, it is a good time to zoom in on emerging pick-by-vision technology, which has the potential to deliver the biggest performance breakthrough and the highest return on investment. Although pick-by-vision and voice picking share similar DNA, pick-by-vision offers several features that will ultimately allow it to displace voice picking.

The situation today in warehousing

Today, warehousing operations account for about 20% of all logistics costs, and picking accounts for 55-65% of total warehousing costs. Taken together, picking represents roughly 11-13% of all logistics costs. To be deemed viable, any new technology should aim to cut this cost. It currently appears that robotics will not be able to do the trick. A high variety of goods doesn’t suit machines, as they simply do not have the same flexibility and fine motor skills as humans. Humans will therefore remain the centerpiece of picking and warehousing operations in the next decade.

Two trends will further decide the fate of new technologies: a sector-specific seasonal demand fluctuation and an ageing workforce. They challenge European distribution centers to efficiently train and deploy both interim workers and new operators. Any new technology will need to be easy to use for new employees.

Although logistics is at the forefront in adopting new technologies, the vast majority of warehouses in the world still rely on the pick-by-paper approach. Using paper in warehouse operations is typically error-prone and slow. RF scanners, pick to light and voice picking have all tried to eliminate paper, with varying degrees of success.

RF scanners are handheld devices that read barcodes, enabling real-time data collection while operators perform tasks in a warehouse. The technology has been widely adopted (60% market penetration) over the past three decades and mostly complements a paper-based picking process. RF scanners have dramatically improved the productivity of a paper-based picking process (100-200 lines per hour). There is, however, substantial room for further improvement (>50%), because the operator has to hold the scanner and cannot work hands-free. The two other technologies have embraced hands-free working.

Pick to light systems consist of light displays that are installed per location in shelving units, case flow racks and storage racks. Order picking tasks make the display units light up one at a time as operators pick each order line. Pick to light technology is tied to a fixed location configuration. It is rather inflexible and 40-80% more expensive than voice picking solutions. While it offers the highest productivity (up to 350 lines per hour), this technology does not enable data capture, such as confirmations or checks. Pick to light is designed for high reach density picking of a limited number of fast-moving items and has less than 20% market penetration in large warehouse operations.

Voice picking systems allow operators to communicate orally with a software platform to receive and confirm picking tasks. The solution consists of headsets, voice-only-wearable (VOW) terminals and a software platform.

Because of their hands-free approach, voice picking systems outperform RF scanners by over 50% in terms of productivity, but they can cost up to 100% more than RF scanning solutions. The payback period of voice picking investments in low reach density warehouses (few reaches relative to travel distance) can range from a few months to a year. However, the data capture capability of voice picking is far more limited than that of RF scanning. Over the past decade, voice solutions have continued to grow, reaching up to 30% market share in larger warehouses with a complex product range.
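The payback reasoning above can be made concrete with a small calculation. The sketch below is only an illustration: the function name `payback_months` and all input figures in the usage example (investment, picker hours, hourly cost) are hypothetical assumptions, not data from any benchmark. Only the 50% productivity gain is taken from the text.

```python
def payback_months(extra_investment, picker_hours_per_month,
                   productivity_gain, cost_per_hour):
    """Estimate how many months of labour savings repay an investment.

    All parameters are illustrative assumptions:
    extra_investment       -- additional spend on the new picking solution
    picker_hours_per_month -- picking hours currently worked per month
    productivity_gain      -- fractional gain, e.g. 0.5 for 50% more lines/hour
    cost_per_hour          -- fully loaded labour cost per picking hour
    """
    if productivity_gain <= 0:
        raise ValueError("productivity_gain must be positive")
    # The same order volume needs fewer hours once each hour yields more lines.
    hours_after = picker_hours_per_month / (1 + productivity_gain)
    monthly_saving = (picker_hours_per_month - hours_after) * cost_per_hour
    return extra_investment / monthly_saving


# Hypothetical example: 10 pickers x 160 hours, a 50% gain, 30 per hour.
months = payback_months(60_000, 1_600, 0.5, 30)
print(f"Estimated payback: {months:.2f} months")
```

With these assumed inputs the estimate comes out at just under four months. A smaller gain or lower labour cost lengthens the payback period accordingly, and the sketch ignores licences, maintenance and training costs.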

Pick-by-vision: the new kid on the block

Voice picking in particular is expected to be challenged by an emerging technology: pick-by-vision. It relies on augmented or assisted reality smart eyewear (e.g. Glass Enterprise, Iristick.Z1, Vuzix M300) to display picking instructions to the operator. Pick-by-vision solutions and voice picking technology target the same low reach density picking operations. Both solutions offer hands-free audio instructions to the picker. While benchmark studies have demonstrated a similar productivity level for both technologies, assisted reality glasses promise higher picking accuracy.

In addition, smart eyewear solutions have four distinctive features that will disrupt and replace voice-only technology over time:

  1. The head-mounted display of smart glasses provides visual instructions or information directly in the wearer’s field of view. As vision is the most important human sense, workers can orient themselves better in the warehouse when receiving optical support. 
  2. Cameras embedded in the smart glasses make it possible to directly scan product and warehouse location barcodes in the picker’s field of view, improving the transaction accuracy. 
  3. Compared with voice-only systems, smart glasses are far less affected by the typical voice recognition challenges: background noise, additional software complexity and the training required. The power of visual information is captured by the saying, often cited as a Chinese proverb, “One picture is worth ten thousand words”.
  4. Benchmarking the various picking solutions in a typical warehouse configuration highlights a distinct cost and ROI advantage for pick-by-vision. As smart glasses have a much broader market potential (e.g. remote assistance, work instructions) than dedicated voice picking solutions, economies of scale and Moore’s Law are expected to increase their cost advantage.

In the coming years, we expect voice picking to be fully replaced by pick-by-vision. The combination of audio and visual support gives pick-by-vision a clear edge over voice picking.