One of the big new features of Windows Phone 7.5 (code named Mango) is direct camera access. Having access to the camera stream enables several new scenarios, such as in-app QR and barcode scanning, augmented reality applications, or a custom camera UI. There are some good resources on how to use the new APIs to access the camera in Mango, but I haven’t found any complete examples of how to implement a QR code scanner, so that is what I am going to build in this blog post.
Mango adds two APIs for accessing the camera. You can either use the webcam APIs introduced in Silverlight 4, or you can use the new PhotoCamera class introduced in Mango. I’m not going to cover all the details of the two APIs; instead I will focus on the scenario of implementing a QR scanner. If you want to learn more about the Silverlight 4 APIs for accessing the camera I would highly recommend Rene Schulte’s detailed blog post. For information on the PhotoCamera API I would recommend this MSDN article showing how to build a camera application for Windows Phone.
When I started building the QR scanner, one of the first things I did was research the open source libraries available for the image recognition. Thankfully there are multiple libraries available, and the ZXing library from Google seems to be one of the most popular. ZXing has been ported to multiple programming languages, and supports a wide variety of 1D/2D codes (QR, Code128, Code39, EAN and many more). There is a C# port of the library, which has later been ported to Silverlight and Windows Phone 7. So the actual task of writing the QR scanner is to integrate the ZXing library with the Mango camera API.
The XAML code for the scanner consists of four main components. A Rectangle is used to display the viewfinder video stream. You hook the camera up to the Rectangle through a VideoBrush. A VideoBrush lets you paint any XAML element using a video source of some kind. To control orientation (landscape or portrait) a CompositeTransform is added to the VideoBrush. This transform lets us programmatically set the rotation of the video brush to match the rotation of the camera. The final component is a ListBox used to display the text decoded from the QR codes.
In the code-behind file we need to add some initialization code. In the constructor we instantiate an ObservableCollection which will be the ItemsSource of the ListBox. Whenever a QR code is successfully scanned the result is added to the collection, which in turn updates the items in the ListBox. We also create a DispatcherTimer that ticks every 250 milliseconds. On each tick we try to decode any QR code in the current image. In the OnNavigatedTo method we instantiate the PhotoCamera class and hook the Initialized event for the camera. We also set the source of the VideoBrush created in XAML to the photo camera. This starts drawing whatever the camera sees on the Rectangle. Finally we hook the ShutterKeyHalfPressed event so that a half press of the camera button triggers focus.
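The initialization described above can be sketched roughly as follows. This is a reconstruction from the description, not the original listing: the element names `_matchesList` and `_previewVideo`, the field names, and the page name `MainPage` are all assumptions, and `ScanPreviewBuffer` and `OnPhotoCameraInitialized` stand for the methods described in the paragraphs that follow.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Windows.Navigation;
using System.Windows.Threading;
using Microsoft.Devices;
using Microsoft.Phone.Controls;

public partial class MainPage : PhoneApplicationPage
{
    // Hypothetical field names; the post does not name them.
    private readonly ObservableCollection<string> _matches;
    private readonly DispatcherTimer _timer;
    private PhotoCamera _photoCamera;

    public MainPage()
    {
        InitializeComponent();

        // Decoded texts end up here; the ListBox (assumed to be
        // named _matchesList in XAML) observes the collection.
        _matches = new ObservableCollection<string>();
        _matchesList.ItemsSource = _matches;

        // Try to decode the preview image four times per second.
        _timer = new DispatcherTimer();
        _timer.Interval = TimeSpan.FromMilliseconds(250);
        _timer.Tick += (sender, args) => ScanPreviewBuffer();
    }

    protected override void OnNavigatedTo(NavigationEventArgs e)
    {
        _photoCamera = new PhotoCamera();
        _photoCamera.Initialized += OnPhotoCameraInitialized;

        // _previewVideo is assumed to be the VideoBrush declared in XAML.
        // SetSource is an extension method from Microsoft.Devices.
        _previewVideo.SetSource(_photoCamera);

        // Half-pressing the hardware camera button triggers autofocus.
        CameraButtons.ShutterKeyHalfPressed += OnShutterKeyHalfPressed;

        base.OnNavigatedTo(e);
    }

    private void OnShutterKeyHalfPressed(object sender, EventArgs e)
    {
        _photoCamera.Focus();
    }
}
```

A production page would probably also stop the timer, unhook the button event, and dispose the camera in OnNavigatedFrom; the description does not cover that, so it is left out of the sketch.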
When the photo camera is initialized we run some more setup code. The code needs to access the width and height of the preview resolution, so it cannot run before the camera is initialized. Next we create an instance of PhotoCameraLuminanceSource. This is a custom class extending the LuminanceSource class from the ZXing library, which is responsible for extracting luminance data from an image. Since images can come in different formats, ZXing has made the framework extensible by providing a base class you can extend with the code needed to get luminance data from the specific image format you are working with. Next we create an instance of the QRCodeReader. This is the class responsible for scanning the image for a QR code and decoding it. The ZXing library comes with multiple readers, so if you want to read Code128 bar codes you simply create a Code128Reader instead. Finally we set the rotation of the preview VideoBrush and start the timer.
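A sketch of what that Initialized handler might look like is shown below. The `com.google.zxing` namespaces are those I would expect from the C# port of ZXing and should be checked against the build you use; `_previewTransform` (the CompositeTransform from the XAML), the field names, and the `_timer` and `_photoCamera` fields set up during page initialization are assumed names. The Initialized event is not raised on the UI thread, so the UI work is marshalled through the Dispatcher.

```csharp
using System;
using Microsoft.Devices;
using Microsoft.Phone.Controls;
using com.google.zxing.qrcode;   // assumed namespace of the C# port

public partial class MainPage : PhoneApplicationPage
{
    // Hypothetical field names.
    private PhotoCameraLuminanceSource _luminance;
    private QRCodeReader _reader;

    private void OnPhotoCameraInitialized(
        object sender, CameraOperationCompletedEventArgs e)
    {
        if (!e.Succeeded)
        {
            return; // camera could not be initialized, nothing to scan
        }

        // PreviewResolution is only valid once the camera is initialized.
        int width = Convert.ToInt32(_photoCamera.PreviewResolution.Width);
        int height = Convert.ToInt32(_photoCamera.PreviewResolution.Height);

        _luminance = new PhotoCameraLuminanceSource(width, height);
        _reader = new QRCodeReader();

        Dispatcher.BeginInvoke(() =>
        {
            // Rotate the brush to match how the sensor is mounted.
            _previewTransform.Rotation = _photoCamera.Orientation;
            _timer.Start();
        });
    }
}
```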
Every time the timer ticks the ScanPreviewBuffer method is executed. This method is responsible for getting the image data from the photo camera and trying to decode it. The photo camera offers multiple ways to get images. You can either capture a full resolution image, or you can get the preview buffer. Capturing a full resolution image would be too slow for a computer vision application, but thankfully we have methods to access the preview buffer. You can get the preview buffer in either ARGB or YCbCr format. The camera sensor uses YCbCr internally, so getting the buffer as ARGB requires a conversion. If you only care about luminance (the Y in YCbCr) there is a convenient method that gives you just the Y component. This is perfect, as it is exactly what we need to pass to the ZXing library. Rene Schulte has a good blog post going into the details of ARGB vs YCbCr.
The GetPreviewBufferY method takes a byte array as its parameter and populates it with the luminance data from the preview buffer. We pass in the PreviewBufferY property from our luminance source class (which will be explained later). Once we have captured the luminance data we create a HybridBinarizer and a BinaryBitmap. Both are part of the ZXing library. I haven’t worked enough with ZXing yet to fully understand the architecture, but these classes are the steps the luminance data passes through before being handed to the decode method of the QRCodeReader. If the decode is successful the decoded text is added to the ObservableCollection, which in turn updates the ListBox. The QRCodeReader decode method throws an exception if it is not able to decode the image, so we need to wrap the code in a try-catch block.
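Put together, the scan step might look something like the sketch below. The type and member names (`HybridBinarizer`, `BinaryBitmap`, `decode`, `Result.Text`) are written as I would expect them in the C# port of ZXing and should be verified against the version in use; the fields `_photoCamera`, `_luminance`, `_reader` and `_matches` are assumed names for the objects created during initialization.

```csharp
using System;
using Microsoft.Phone.Controls;
using com.google.zxing;          // assumed namespaces of the C# port
using com.google.zxing.common;

public partial class MainPage : PhoneApplicationPage
{
    private void ScanPreviewBuffer()
    {
        try
        {
            // Fill our buffer with the Y (luminance) plane of the preview.
            _photoCamera.GetPreviewBufferY(_luminance.PreviewBufferY);

            // Luminance -> black/white bitmap -> decoder.
            var binarizer = new HybridBinarizer(_luminance);
            var bitmap = new BinaryBitmap(binarizer);
            Result result = _reader.decode(bitmap);

            // DispatcherTimer ticks on the UI thread, so the bound
            // collection can be updated directly.
            _matches.Add(result.Text);
        }
        catch (Exception)
        {
            // No QR code in this frame; try again on the next tick.
            // Assumption: a narrower type such as ReaderException
            // may be sufficient, depending on the port.
        }
    }
}
```

Because the timer fires four times a second, the same code will be added repeatedly for as long as it stays in view; checking `_matches.Contains(result.Text)` before adding is one simple way to avoid that.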
The PhotoCameraLuminanceSource class implements the LuminanceSource base class used by the ZXing library. The class is responsible for exposing the luminance data for the image format you are working with. Conveniently the PhotoCamera class gives us the luminance data directly, so implementing the class does not take much code. There is basically one property and one method you need to implement. The Matrix property simply returns the luminance data for the complete image. The getRow method returns the luminance data for a given row. Both the property and the method simply read the PreviewBufferY byte array, which is filled every time the timer tick calls the GetPreviewBufferY method.
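As a rough illustration, a luminance source along these lines could be written as below. It assumes the base class from the C# port exposes a `(width, height)` constructor, `Width` and `Height` properties, an `sbyte[] Matrix` property and an `sbyte[] getRow(int, sbyte[])` method (the port mirrors Java's signed bytes); check those signatures against your copy of the library. The buffer is assumed to be stored row by row, one byte per pixel.

```csharp
using System;
using com.google.zxing;   // assumed namespace of the C# port

public class PhotoCameraLuminanceSource : LuminanceSource
{
    // Filled by PhotoCamera.GetPreviewBufferY on every timer tick.
    public byte[] PreviewBufferY { get; private set; }

    public PhotoCameraLuminanceSource(int width, int height)
        : base(width, height)
    {
        // One luminance byte per preview pixel.
        PreviewBufferY = new byte[width * height];
    }

    // Luminance data for the complete image.
    public override sbyte[] Matrix
    {
        get
        {
            // byte[] and sbyte[] share a memory layout, so the CLR
            // allows this reinterpreting cast without copying.
            return (sbyte[])(Array)PreviewBufferY;
        }
    }

    // Luminance data for row y; reuses the caller's array when it fits.
    public override sbyte[] getRow(int y, sbyte[] row)
    {
        if (row == null || row.Length < Width)
        {
            row = new sbyte[Width];
        }

        int offset = y * Width;
        for (int x = 0; x < Width; x++)
        {
            row[x] = (sbyte)PreviewBufferY[offset + x];
        }

        return row;
    }
}
```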
So how does this work in practice? Below is a YouTube video demonstrating the QR scanner. As you can see it’s fairly quick and accurate (well, at least on bright images on my computer screen). I have not yet tested it in the wild, and the ZXing library offers many options for fine-tuning how the scanner behaves, but at least this demonstrates how you can combine ZXing and the Mango PhotoCamera API to implement a QR scanner in less than 150 lines of user code!
The complete QR code scanner example code is available on GitHub.