Tuesday, November 19, 2019

OpenCV

OpenCV 3.4.3 (an open source computer vision framework) works with Python 3.7.

OpenCV supports C/C++, Python, and Java, and it can be used to build computer vision applications for desktop and mobile operating systems alike, including Windows, Linux, macOS, Android, and iOS.

OpenCV started at Intel Research Lab during an initiative to advance approaches for building CPU-intensive applications.

How to Get OpenCV Working with Python
Install Python 3.7 x64, then install the NumPy and OpenCV wheels:
pip install "numpy-1.14.6+mkl-cp37-cp37m-win_amd64.whl"
pip install "opencv_python-3.4.3+contrib-cp37-cp37m-win_amd64.whl"

Validate the OpenCV installation by running the import command in a Python shell:
>>> import cv2
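
If the import succeeds, printing the version is a quick extra check. Here is a minimal sketch; the helper name opencv_version is made up for this post, and it simply returns None when cv2 cannot be imported.

```python
def opencv_version():
    """Return the installed OpenCV version string, or None if cv2 is missing."""
    try:
        import cv2
    except ImportError:
        # OpenCV is not installed (or its native libraries failed to load)
        return None
    return cv2.__version__


print("OpenCV version:", opencv_version())
```

With the wheels above, the printed version should start with 3.4.3.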

Install OpenCV on macOS
brew install python
pip install numpy
brew install opencv --with-tbb --with-opengl

OpenCV consists of two types of modules:
- Main modules: Provide the core functionalities such as image processing tasks, filtering, transformation, and others.

- Extra modules: Not included in the default OpenCV distribution. They provide additional computer vision functionality such as text recognition.
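
In the Python bindings, extra modules from a contrib build usually show up as attributes of cv2 (for example cv2.xfeatures2d, cv2.face, or cv2.text). The sketch below checks for them under that assumption; has_extra_module is a hypothetical helper, not part of OpenCV.

```python
def has_extra_module(name):
    """Return True if the installed cv2 build exposes the given submodule."""
    try:
        import cv2
    except ImportError:
        return False  # OpenCV is not installed at all
    return hasattr(cv2, name)


# xfeatures2d, face and text are examples of extra (contrib) modules
for module_name in ("xfeatures2d", "face", "text"):
    print(module_name, has_extra_module(module_name))
```

If all three print False, the installed build probably contains only the main modules.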



List of OpenCV Main Modules
core Includes all core OpenCV functionalities such as basic structures, Mat classes, and so on.

imgproc Includes image-processing features such as transformations, manipulations, filtering, and so on.

imgcodecs Includes functions for reading and writing images.

videoio Includes functions for reading and writing videos.

highgui Includes functions for GUI creation to visualize results.

video Includes video analysis functions such as motion detection and tracking, the Kalman filter, and the well-known CamShift algorithm (used for object tracking).

calib3d Includes calibration and 3D reconstruction functions that are used for the estimation of transformation between two images.

features2d Includes functions for keypoint-detection and descriptor-extraction algorithms that are used in object detection and categorization algorithms.

objdetect Supports object detection.

dnn Used for object detection and classification purposes, among others. The dnn module is relatively new in the list of main modules and has support for deep learning.

ml Includes functions for classification and regression and covers most of the machine learning capabilities.

flann Supports optimized algorithms for nearest neighbor search of high-dimensional features in large data sets. FLANN stands for Fast Library for Approximate Nearest Neighbors.

photo Includes functions for photography-related computer vision such as removing noise, creating HDR images, and so on.

stitching Includes functions for image stitching that further uses concepts such as rotation estimation and image warping.

shape Includes functions that deal with shape transformation, matching, and distance-related topics.

superres Includes super-resolution algorithms that enhance image resolution.

videostab Includes algorithms used for video stabilization.

viz Displays widgets in a 3D visualization window.


OpenCV Sample Code

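Task 1: Read an image, convert it to grayscale, show both images, and save the grayscale image to disk

import cv2

# imread returns None when the file cannot be read
original_image = cv2.imread("./images/panda.jpg")
if original_image is None:
    raise FileNotFoundError("./images/panda.jpg")
gray_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)

cv2.imshow("Gray panda", gray_image)
cv2.imshow("Color panda", original_image)

# imwrite picks the image format from the file extension
cv2.imwrite("gray_panda.jpg", gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()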
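
To see what the BGR to gray conversion does per pixel, here is a plain-Python sketch of the weighted sum that the OpenCV color conversion documentation gives (Y = 0.299 R + 0.587 G + 0.114 B). The function name bgr_to_gray is hypothetical, and OpenCV's own arithmetic may differ by a rounding step, so treat it as an illustration only.

```python
def bgr_to_gray(pixel):
    """Convert one (blue, green, red) pixel to a gray level in 0..255."""
    blue, green, red = pixel
    # Weighted sum documented for OpenCV's RGB/BGR to gray conversion
    luma = 0.299 * red + 0.587 * green + 0.114 * blue
    return int(round(luma))


print(bgr_to_gray((255, 0, 0)))      # pure blue
print(bgr_to_gray((0, 255, 0)))      # pure green
print(bgr_to_gray((255, 255, 255)))  # white
```

Green contributes most to the gray level and blue the least, which is why a pure blue pixel comes out very dark.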



Task 2: Open the user's camera, read it frame by frame, and show each frame on screen; exit when the user presses Esc

import cv2
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # no frame available (camera missing or disconnected)
        break
    cv2.imshow("frame", frame)
    key = cv2.waitKey(1)
    if key == 27:  # 27 is the Esc key
        break

cap.release()
cv2.destroyAllWindows()
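
The key check can be pulled into a small helper. This is a sketch, not an OpenCV function: is_exit_key and ESC_KEY are names invented here. It assumes that cv2.waitKey returns -1 when no key was pressed and that some platforms set extra high bits in the result, which is why the low byte is masked out.

```python
ESC_KEY = 27


def is_exit_key(key_code, exit_keys=(ESC_KEY, ord("q"))):
    """Return True if a cv2.waitKey result matches one of exit_keys."""
    if key_code < 0:
        return False  # waitKey returns -1 when no key was pressed
    # Keep only the low byte; some platforms set higher bits in the result
    return (key_code & 0xFF) in exit_keys


print(is_exit_key(27))        # Esc
print(is_exit_key(ord("a")))  # some other key
print(is_exit_key(-1))        # no key pressed
```

Inside the loop it would be used as: if is_exit_key(cv2.waitKey(1)): break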


Task 3: Open a video file and show it frame by frame until the user presses Esc

import cv2
mountains_video = cv2.VideoCapture("mountains.mp4")
while True:
    ret, frame = mountains_video.read()
    if not ret:  # end of the video or read error
        break
    cv2.imshow("frame", frame)
    key = cv2.waitKey(25)  # wait 25 ms between frames
    if key == 27:  # 27 is the Esc key
        break
mountains_video.release()
cv2.destroyAllWindows()
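
The 25 ms delay works out to roughly 40 frames per second, which may not match the file. A rough sketch of deriving the delay from the frame rate follows; frame_delay_ms is a hypothetical helper. In a real script the fps value would come from mountains_video.get(cv2.CAP_PROP_FPS), which can report 0 when the rate is unknown.

```python
def frame_delay_ms(fps, fallback_ms=25):
    """Return the waitKey delay in milliseconds for a given frame rate."""
    if fps is None or fps <= 0:
        return fallback_ms  # frame rate unknown, use the fallback
    # waitKey treats 0 as "wait forever", so never go below 1 ms
    return max(1, int(round(1000.0 / fps)))


print(frame_delay_ms(25))
print(frame_delay_ms(30))
print(frame_delay_ms(0))
```

Decoding and drawing each frame also take time, so actual playback will be somewhat slower than the nominal rate.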


Task 4: Flip the camera stream vertically, save it to disk, and show the flipped frames until the user presses q

import cv2
cap = cv2.VideoCapture(0)

# The writer's frame size must match the frames passed to write()
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (width, height))
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # flip code 0 flips the frame vertically (around the x-axis)
    frame = cv2.flip(frame, 0)
    # write the flipped frame
    out.write(frame)
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()
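
The fourcc value is just the four codec characters packed into one 32-bit integer, with the first character in the lowest byte. The sketch below mirrors that packing in plain Python, as I understand cv2.VideoWriter_fourcc to work; the function here is a stand-in for illustration, not the OpenCV implementation.

```python
def fourcc(code):
    """Pack a four-character codec code into a 32-bit integer."""
    if len(code) != 4:
        raise ValueError("a FOURCC code needs exactly four characters")
    value = 0
    for position, char in enumerate(code):
        # The first character goes into the lowest byte
        value |= (ord(char) & 0xFF) << (8 * position)
    return value


print(hex(fourcc("XVID")))
```

Reading the hex digits from right to left in pairs gives back the ASCII codes of X, V, I, and D.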






Notes

OpenCV does not provide any way to train a DNN. However, you can train a DNN model using frameworks like TensorFlow, MXNet, or Caffe and import it into OpenCV for your application.
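
Importing a trained model can be sketched as follows. cv2.dnn.readNet picks the importer from the file extension; the extension table below follows the OpenCV documentation as I recall it, and which formats actually load depends on the OpenCV version. The file name frozen_graph.pb and the helpers guess_framework and load_network are hypothetical.

```python
import os

# Extensions that cv2.dnn.readNet is documented to recognize
FRAMEWORK_BY_EXTENSION = {
    ".caffemodel": "caffe",
    ".pb": "tensorflow",
    ".t7": "torch",
    ".net": "torch",
    ".weights": "darknet",
    ".bin": "dldt",
    ".onnx": "onnx",
}


def guess_framework(model_path):
    """Return the framework name for a model file, or None if unknown."""
    extension = os.path.splitext(model_path)[1].lower()
    return FRAMEWORK_BY_EXTENSION.get(extension)


def load_network(model_path, config_path=""):
    """Load a pretrained network with OpenCV's dnn module."""
    import cv2  # imported here so guess_framework works without OpenCV

    if guess_framework(model_path) is None:
        raise ValueError("unrecognized model file: " + model_path)
    return cv2.dnn.readNet(model_path, config_path)


print(guess_framework("frozen_graph.pb"))
```

Some formats also need a config file (for example a .prototxt for Caffe), which is what config_path is for.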

OpenVINO is specifically designed to speed up networks used in visual tasks like image classification and object detection.


When we think of AI, we usually think of companies like IBM, Google, and Facebook.
They are indeed leading the way in algorithms, but AI is computationally expensive during both training and inference.
Therefore, it is equally important to understand the role of hardware companies in the rise of AI.

NVIDIA provides the best GPUs as well as the best software support using CUDA and cuDNN for Deep Learning.
NVIDIA pretty much owns the market for Deep Learning when it comes to training a neural network.

However, GPUs are expensive and not always necessary for inference (inference means using a trained model in production).
In fact, most of the inference in the world is done on CPUs!

In the inference space, Intel is a big player. It manufactures Vision Processing Units (VPUs), integrated GPUs, and FPGAs, all of which can be used for inference.

To avoid confusing developers about how to write code that makes the best use of this hardware, Intel provides the OpenVINO framework.

OpenVINO enables CNN-based deep learning inference on the edge, supports heterogeneous execution across computer vision accelerators, speeds time to market via a library of functions and pre-optimized kernels and includes optimized calls for OpenCV and OpenVX.


How to use OpenVINO?

1) Neither OpenCV nor OpenVINO provides tools to train a neural network, so train your model using TensorFlow or PyTorch.
2) The model obtained in the previous step is usually not optimized for performance.
   OpenVINO requires us to create an optimized model, called an Intermediate Representation (IR), using the Model Optimizer tool it provides.

   The result of the optimization process is an IR model, which is split into two files:

   model.xml : This XML file contains the network architecture.
   model.bin : This binary file contains the weights and biases.
3) OpenVINO Inference Engine plugin: OpenVINO optimizes running this model on specific hardware through the Inference Engine plugin.
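
The last step can also be reached from OpenCV itself. The sketch below makes two assumptions: the .bin file sits next to the .xml file with the same base name, and the installed OpenCV build has Inference Engine support. The helpers ir_files and load_ir_network and the path models/model.xml are hypothetical.

```python
from pathlib import Path


def ir_files(xml_path):
    """Return the (.xml, .bin) pair that makes up an IR model."""
    xml_file = Path(xml_path)
    if xml_file.suffix != ".xml":
        raise ValueError("expected the .xml file of an IR model")
    # Assumption: the weights file has the same base name as the .xml file
    return xml_file, xml_file.with_suffix(".bin")


def load_ir_network(xml_path):
    """Load an IR model and request the Inference Engine backend."""
    import cv2  # needs an OpenCV build with Inference Engine support

    xml_file, bin_file = ir_files(xml_path)
    net = cv2.dnn.readNetFromModelOptimizer(str(xml_file), str(bin_file))
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_INFERENCE_ENGINE)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    return net


xml_file, bin_file = ir_files("models/model.xml")
print(xml_file.name, bin_file.name)
```

Other targets, such as cv2.dnn.DNN_TARGET_MYRIAD for a VPU, can be requested the same way when the hardware is present.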
