Jrobot Self Drive is another self-drive experiment based on machine learning, it is not a simulator, it is not a road vehicle, it is a footpath traveler. We built NVIDIA CNN self-drive model using Keras, collected training data, trained the model, and converted the trained model to TensorFlow Lite.
TensorFlow Lite allows us to do inference on a mobile device and is the key part of this project. We added TensorFlow Lite to Jrobot Android app. When running, TensorFlow Lite is able to load the trained model, take a camera image as input and give a steering angle as output. Jrobot app runs on an Android phone (Xiaomi Mi5) sitting in the phone box on Jrobot car and control the movement of the Jrobot car through bluetooth connection with Arduino on the car.
We did road test in 2 places in the neighborhood and the results show us the trained model works well. Even though it is not full self-drive, it makes human control so much easier, and opens up so many new options, which means there is so much more to do. Thank you!
Take the example from tensorflow document. Since the current iteration depends on the last iteration through the `i` in the code. I am wondering if the loop body executes one by one as a pure sequential loop?
i = tf.constant(0)
c = lambda i: tf.less(i, 10)
b = lambda i: tf.add(i, 1)
r = tf.while_loop(c, b, [i])
Two op instances from different iterations can be computed simultaneously, as long as there is no direct or indirect data dependency or control dependency between them. The iterations in this while_loop can be paralleled. In some cases, iterations in a while_loop cannot be paralleled, typically when there are multiple iteration variable.
The success of deep learning in computer vision can be partially attributed to the availability of large amounts of labeled training data — a model’s performance typically improves as you increase the quality, diversity and the amount of training data. However, collecting enough quality data to train a model to perform well is often prohibitively difficult. One way around this is to hardcode image symmetries into neural network architectures so they perform better or have experts manually design data augmentation methods, like rotation and flipping, that are commonly used to train well-performing vision models. However, until recently, less attention has been paid to finding ways to automatically augment existing data using machine learning. Inspired by the results of our AutoML efforts to design neural network architectures and optimizers to replace components of systems that were previously human designed, we asked ourselves: can we also automate the procedure of data augmentation?
n “AutoAugment: Learning Augmentation Policies from Data”, we explore a reinforcement learning algorithm which increases both the amount and diversity of data in an existing training dataset. Intuitively, data augmentation is used to teach a model about image invariances in the data domain in a way that makes a neural network invariant to these important symmetries, thus improving its performance. Unlike previous state-of-the-art deep learning models that used hand-designed data augmentation policies, we used reinforcement learning to find the optimal image transformation policies from the data itself. The result improved performance of computer vision models without relying on the production of new and ever expanding datasets.
Augmenting Training Data The idea behind data augmentation is simple: images have many symmetries that don’t change the information present in the image. For example, the mirror reflection of a dog is still a dog. While some of these “invariances” are obvious to humans, many are not. For example, the mixup method augments data by placing images on top of each other during training, resulting in data which improves neural network performance.
AutoAugment is an automatic way to design custom data augmentation policies for computer vision datasets, e.g., guiding the selection of basic image transformation operations, such as flipping an image horizontally/vertically, rotating an image, changing the color of an image, etc. AutoAugment not only predicts what image transformations to combine, but also the per-image probability and magnitude of the transformation used, so that the image is not always manipulated in the same way. AutoAugment is able to select an optimal policy from a search space of 2.9 x 1032 image transformation possibilities.
AutoAugment是能為電腦視覺資料集設計自定義的資料擴增策略的一種自動化方法。例如，導引如何選擇基本圖像的轉換作業，像是水平或垂直翻轉圖像、旋轉圖像及改變圖像的顏色⋯⋯等等。AutoAugment不僅能預測到要結合哪一種圖像轉換法，還能預測每張圖像的概率，還有圖像轉換的強度，所以圖像不再使用同一種方法來調整。自動擴增技術能從高達2.9 x 1032 種圖像轉換可能性的搜尋空間中找到一個最佳策略。
AutoAugment learns different transformations depending on what dataset it is run on. For example, for images involving street view of house numbers (SVHN) which include natural scene images of digits, AutoAugment focuses on geometric transforms like shearing and translation, which represent distortions commonly observed in this dataset. In addition, AutoAugment has learned to completely invert colors which naturally occur in the original SVHN dataset, given the diversity of different building and house numbers materials in the world.
AutoAugment能根據正在處理的資料集，學習到不同的圖像轉換方式。例如，針對street view of house numbers（SVHN）這套資料集中的圖像，因其中包含真實場景圖像中的各種數字，自動擴增技術便聚焦於幾何轉換，像是歪斜和位移，代表這個資料集中常見的資料失真情況。此外，自動擴增技術已完整學會如何將顏色反轉，而這樣的情況在原始的SVHN資料集中相當常見，原因是考慮到真實世界中的各種不同的建築物和房屋門牌號碼材質。
On CIFAR-10 and ImageNet, AutoAugment does not use shearing because these datasets generally do not include images of sheared objects, nor does it invert colors completely as these transformations would lead to unrealistic images. Instead, AutoAugment focuses on slightly adjusting the color and hue distribution, while preserving the general color properties. This suggests that the actual colors of objects in CIFAR-10 and ImageNet are important, whereas on SVHN only the relative colors are important.
Results Our AutoAugment algorithm found augmentation policies for some of the most well-known computer vision datasets that, when incorporated into the training of the neural network, led to state-of-the-art accuracies. By augmenting ImageNet data we obtain a new state-of-the-art accuracy of 83.54% top1 accuracy and on CIFAR10 we achieve an error rate of 1.48%, which is a 0.83% improvement over the default data augmentation designed by scientists. On SVHN, we improved the state-of-the-art error from 1.30% to 1.02%. Importantly, AutoAugment policies are found to be transferable — the policy found for the ImageNet dataset could also be applied to other vision datasets (Stanford Cars, FGVC-Aircraft, etc.), which in turn improves neural network performance.
We are pleased to see that our AutoAugment algorithm achieved this level of performance on many different competitive computer vision datasets and look forward to seeing future applications of this technology across more computer vision tasks and even in other domains such as audio processing or language models. The policies with the best performance are included in the appendix of the paper, so that researchers can use them to improve their models on relevant vision tasks.