CV1 Introduction


Computer Vision

-- Introduction

钟凡 (Zhong Fan)
[email protected]
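Course Plan
 Lectures on the fundamental principles, which remain relatively stable (40%)

 Survey a frontier technical direction and complete a report (30%)
   Each student independently writes a written report (it need not be comprehensive, but it should have some depth and include test analysis on real-world scenarios)
   All students working on the same direction jointly give one oral presentation (survey, technical roadmap and representative methods, test and analysis results, etc.)

 Labs: choose either 10 small experiments or one large project (30%)
   Completed individually
   Topics are assigned in both cases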
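Reference Book

 Computer Vision: Algorithms and Applications
 Author: Richard Szeliski
 Publisher: Tsinghua University Press (清华大学出版社)
Introduction to Computer Vision
Image Representation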
Image File Formats
 Vector images ( .ai, .eps, .ps, … )
 No aliasing or blur when scaling;
 Difficult to obtain; limited applications in practice;

draw circle
center 0.5, 0.5
radius 0.4
fill-color yellow
stroke-color black
stroke-width 0.05
draw circle
center 0.35, 0.4
radius 0.05
fill-color black
…………
Image File Formats

 Bitmap images ( .bmp, .jpg, .png, .gif, … )
 Easy to obtain; widely used;
 Become blurred and aliased when scaled;

Rasterization (光栅化): Vector -> Bitmap
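Rasterization can be illustrated with the circle from the vector example (center 0.5, 0.5 and radius 0.4). The following is a minimal sketch, assuming a unit-square canvas and a one-channel output; `rasterize_circle` is a hypothetical helper written for this illustration, not a function of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rasterize a filled circle, given in unit-square
// coordinates, into a size x size one-channel bitmap (1 = inside, 0 = outside).
std::vector<unsigned char> rasterize_circle(int size, double cx, double cy, double r)
{
    assert(size > 0);
    std::vector<unsigned char> bitmap(static_cast<std::size_t>(size) * size, 0);
    for (int y = 0; y < size; ++y)
    {
        for (int x = 0; x < size; ++x)
        {
            // Sample at the pixel center, mapped into [0, 1] x [0, 1].
            double u = (x + 0.5) / size;
            double v = (y + 0.5) / size;
            double dx = u - cx, dy = v - cy;
            if (dx * dx + dy * dy <= r * r)
                bitmap[static_cast<std::size_t>(y) * size + x] = 1;
        }
    }
    return bitmap;
}
```

Calling the helper with a larger `size` re-samples the same description on a finer grid, which is why a vector image scales without blur, while enlarging the resulting bitmap afterwards would not recover that detail.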


GIF (Graphics Interchange Format)
 8-bit indexed color: an image can be saved with a maximum of 256 colors;
 Optional dithering, which mixes pixels of two available colors to suggest a color that is not in the palette;
 Supports animation and transparency.

Advantage: small file size with good quality for images that use few colors.
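Indexed color means that each pixel stores a one-byte index into a palette of at most 256 colors instead of a full RGB triple. The sketch below shows one simple way to build such indices by picking the nearest palette entry for each pixel, without dithering. The names `Rgb`, `nearest_index`, and `to_indexed` are hypothetical, and real GIF encoders may choose palettes and indices differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Rgb = std::array<unsigned char, 3>;

// Return the index of the palette entry closest to c (squared RGB distance).
unsigned char nearest_index(const std::vector<Rgb>& palette, const Rgb& c)
{
    assert(!palette.empty() && palette.size() <= 256);
    std::size_t best = 0;
    long best_dist = -1;
    for (std::size_t i = 0; i < palette.size(); ++i)
    {
        long d = 0;
        for (int k = 0; k < 3; ++k)
        {
            long diff = static_cast<long>(palette[i][k]) - c[k];
            d += diff * diff;
        }
        if (best_dist < 0 || d < best_dist) { best_dist = d; best = i; }
    }
    return static_cast<unsigned char>(best);
}

// Convert true-color pixels to 8-bit palette indices (no dithering).
std::vector<unsigned char> to_indexed(const std::vector<Rgb>& pixels,
                                      const std::vector<Rgb>& palette)
{
    std::vector<unsigned char> indices;
    indices.reserve(pixels.size());
    for (const Rgb& p : pixels)
        indices.push_back(nearest_index(palette, p));
    return indices;
}
```

Dithering would replace the plain nearest-color choice with a pattern that alternates between two palette entries, so that their average approximates the missing color.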
JPEG (Joint Photographic Experts Group)
 24-bit color (8 bits per channel): capable of displaying millions of colors at once without dithering.
 Lossy compression: a quality setting of about 60% usually gives a good balance between image quality and file size.
PNG (Portable Network Graphics)
 Lossless compression based on DEFLATE (the algorithm used by ZIP)
 Supports transparency (4-channel images)
BMP (Windows Bitmap)
 Simple, uncompressed
 Can be either indexed or not
 DIB (Device Independent Bitmap) / DDB (Device Dependent Bitmap)
Image (bitmap) representation
 Image
 Function defined over 2D domain, f(x, y)

[Figure: the Lena image shown as a function f(x, y) over the x and y axes]
Image Representation
 Digital Image
   x, y, and f(x, y) take only discrete values
   Formed from a finite number of elements
   Each element is called a pixel (像素); other names:
     picture elements
     image elements
     pels
     pixels (most widely used)
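The step from a continuous image f(x, y) to a digital one can be sketched as sampling on a grid (discrete x and y) followed by quantization (discrete f). The example below is only an illustration under stated assumptions: `continuous_f` is a made-up brightness ramp, and `digitize` is a hypothetical helper that rounds each sample to one of 256 levels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical continuous image: a horizontal brightness ramp with
// f(x, y) in [0, 1] for x, y in [0, 1).
double continuous_f(double x, double /*y*/) { return x; }

// Sample f on a width x height grid and quantize each sample to 8 bits.
std::vector<unsigned char> digitize(int width, int height)
{
    assert(width > 0 && height > 0);
    std::vector<unsigned char> pixels(static_cast<std::size_t>(width) * height);
    for (int yi = 0; yi < height; ++yi)
    {
        for (int xi = 0; xi < width; ++xi)
        {
            // Discrete coordinates (xi, yi) map to the center of each pixel.
            double v = continuous_f((xi + 0.5) / width, (yi + 0.5) / height);
            // Discrete values: round to one of 256 levels.
            pixels[static_cast<std::size_t>(yi) * width + xi] =
                static_cast<unsigned char>(std::lround(v * 255.0));
        }
    }
    return pixels;
}
```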
Image in Memory
 2D or 3D array

Interleaved storage (交叉存贮): the channels of each pixel are stored together

[B,G,R] [B,G,R] … [B,G,R]
[B,G,R] [B,G,R] … [B,G,R]
……………………………
[B,G,R] [B,G,R] … [B,G,R]

Sequential (planar) storage (顺序存贮): each channel is stored as a separate plane

[B] [B] [B] [B] … [B]
[B] [B] [B] [B] … [B]
……………………………
[G] [G] [G] [G] … [G]
[G] [G] [G] [G] … [G]
……………………………
[R] [R] [R] [R] … [R]
[R] [R] [R] [R] … [R]
……………………………
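The difference between the two layouts shows up in how the position of one channel value is computed. A minimal sketch, assuming tightly packed data with no row padding; the function names are hypothetical and the offsets are counted in elements, not bytes.

```cpp
#include <cassert>
#include <cstddef>

// Offset (in elements) of channel c of pixel (x, y) in an interleaved
// layout: B,G,R,B,G,R,...
std::size_t interleaved_offset(int x, int y, int c, int width, int nc)
{
    return (static_cast<std::size_t>(y) * width + x) * nc + c;
}

// Offset of the same value in a planar layout: all B, then all G, then all R.
std::size_t planar_offset(int x, int y, int c, int width, int height)
{
    return static_cast<std::size_t>(c) * width * height
         + static_cast<std::size_t>(y) * width + x;
}
```

In the interleaved layout the three values of a pixel are adjacent, which suits per-pixel operations; in the planar layout each channel is a contiguous single-channel image.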


Image in Memory

 Size (resolution, dpi, width*height, number of pixels)
 Color Space (RGB, CMYK, YUV, Lab, …)
 Channels (1, 2, 3, 4; gray & color)
 Bit Depth (number of bits for each channel: 8 bits, 12 bits, …; LDR & HDR)
 Coordinate system:

[Figure: two image coordinate systems, each with origin (0, 0) and x and y axes, one labeled Left Handed and the other Right Handed]
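Two of these attributes translate directly into arithmetic: the memory footprint follows from size, channel count, and bit depth, and switching between a top-left and a bottom-left origin only flips the row index. The sketch below assumes tightly packed rows and bit depths that are whole bytes (so it does not cover packed 12-bit data); both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a tightly packed image (no row padding).
// Assumes each channel value occupies a whole number of bytes.
std::size_t image_bytes(int width, int height, int channels, int bits_per_channel)
{
    assert(bits_per_channel % 8 == 0);
    return static_cast<std::size_t>(width) * height * channels
         * (bits_per_channel / 8);
}

// Convert a row index between a top-left origin (y grows downward)
// and a bottom-left origin (y grows upward).
int flip_y(int y, int height) { return height - 1 - y; }
```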
Image in Program

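struct MyImage
{
    int width, height; // size in pixels

    int type; // element type; encodes the channel count and bit depth

    /* CV_8UC3  : unsigned char [3]
       CV_32SC1 : int [1]
       CV_32UC1 : uint [1]  (as listed on the slide; OpenCV 2.x to 4.x define no 32-bit unsigned depth)
       CV_32FC4 : float [4]
    */

    void* data; // pixel data

    int step; // stride: number of bytes occupied by each row
};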
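One way a single `type` integer can carry both the channel count and the bit depth is to pack them into bit fields. The sketch below uses hypothetical constants and helpers that are modeled on the layout of OpenCV's classic type codes (low 3 bits for the depth id, the remaining bits for channels minus one); it is an illustration, not OpenCV's own header.

```cpp
#include <cassert>

// Hypothetical depth ids, chosen to match the values OpenCV 2.x to 4.x use.
enum Depth { DEPTH_8U = 0, DEPTH_32S = 4, DEPTH_32F = 5 };

// Pack and unpack: low 3 bits = depth id, higher bits = (channels - 1).
constexpr int make_type(int depth, int channels) { return depth + ((channels - 1) << 3); }
constexpr int type_depth(int type)    { return type & 7; }
constexpr int type_channels(int type) { return (type >> 3) + 1; }

// Bytes per channel value for the depths covered by this sketch.
inline int depth_bytes(int depth)
{
    switch (depth)
    {
    case DEPTH_8U:  return 1;
    case DEPTH_32S: return 4;
    case DEPTH_32F: return 4;
    default:        assert(false && "depth not covered by this sketch"); return 0;
    }
}

// Bytes per pixel = bytes per channel * number of channels.
inline int pixel_bytes(int type)
{
    return depth_bytes(type_depth(type)) * type_channels(type);
}
```

With this packing, an 8-bit unsigned 3-channel type is `make_type(DEPTH_8U, 3)`, and `pixel_bytes` gives the factor that the pixel accessors need when they convert a column index to a byte offset.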
step? (stride)
 For data alignment: make each row start at an address that is a multiple of 4, 8, or 16.

 For representing a sub-region (ROI):

[Figure: a full image described by (data0, W, H, step0) and a sub-region inside it described by (data1, w, h, step0); both share the same step0]
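Both uses of step can be sketched in a few lines. The example assumes the `MyImage` struct from the slides; `aligned_step` and `make_roi` are hypothetical helpers, and `bytes_per_pixel` is passed in explicitly because the slides do not show how it is derived from `type`.

```cpp
#include <cassert>
#include <cstddef>

struct MyImage
{
    int width, height;
    int type;
    void* data;
    int step;
};

// Round a row size up to the next multiple of `alignment` bytes.
int aligned_step(int width, int bytes_per_pixel, int alignment)
{
    assert(alignment > 0);
    int row_bytes = width * bytes_per_pixel;
    return (row_bytes + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: describe the w x h sub-region whose top-left pixel
// is (x, y). No pixels are copied; the ROI points into the parent's buffer
// and reuses the parent's step.
MyImage make_roi(const MyImage& img, int x, int y, int w, int h, int bytes_per_pixel)
{
    assert(x >= 0 && y >= 0 && x + w <= img.width && y + h <= img.height);
    MyImage roi = img;
    roi.width = w;
    roi.height = h;
    roi.data = static_cast<unsigned char*>(img.data)
             + static_cast<std::size_t>(y) * img.step
             + static_cast<std::size_t>(x) * bytes_per_pixel;
    return roi;
}
```

Because the ROI keeps the parent's step, moving down one row inside the ROI still skips a full parent row, which is exactly why step cannot be assumed to equal width times the pixel size.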
Access Pixels

 img.type=CV_8UC3: 8-bit unsigned, 3-channel data

uchar* get_pixel(const MyImage &img, int x, int y)
{
    ????????????
}

 img.type=CV_32SC3: 32-bit signed, 3-channel data

int* get_pixel(const MyImage &img, int x, int y)
{
    ??????????????
}
Access Pixels

 img.type=CV_8UC3: 8-bit unsigned, 3-channel data

uchar* get_pixel(const MyImage &img, int x, int y)
{
    // Wrong: return (uchar*)img.data+y*img.width*3+x*3;
    // In general step != width*nc (rows may be padded, or img may be an ROI).
    return (uchar*)img.data+y*img.step+x*3;
}

 img.type=CV_32SC3: 32-bit signed, 3-channel data

int* get_pixel(const MyImage &img, int x, int y)
{
    // Wrong: return (int*)( (char*)img.data+y*img.step*4+x*3*4 );
    // step is always a byte count, so it must not be multiplied by the element size.
    return (int*)( (char*)img.data+y*img.step+x*3*4 );
}
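The two accessors differ only in the channel type, so they can be folded into one template. This is a sketch under the assumption that the caller knows the channel type and count; `get_pixel_as` is a hypothetical name, and the `MyImage` struct is repeated so the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>

struct MyImage { int width, height; int type; void* data; int step; };

// Hypothetical generic accessor: T is the channel type, nc the channel count.
template <typename T>
T* get_pixel_as(const MyImage& img, int x, int y, int nc)
{
    assert(x >= 0 && x < img.width && y >= 0 && y < img.height);
    // Do the row arithmetic in bytes, because step is a byte count.
    unsigned char* row = static_cast<unsigned char*>(img.data)
                       + static_cast<std::size_t>(y) * img.step;
    // Within the row, index in units of T: pixel x starts at element x*nc.
    return reinterpret_cast<T*>(row) + static_cast<std::size_t>(x) * nc;
}
```

For example, `get_pixel_as<int>(img, x, y, 3)` plays the role of the CV_32SC3 accessor, and `get_pixel_as<unsigned char>(img, x, y, 3)` that of the CV_8UC3 one.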
 Scan Pixels
void scan_pixels(uchar *data, int width, int height, int step, int nc)
{
    uchar *row=data;
    for(int yi=0; yi<height; ++yi, row+=step)
    {
        uchar *px=row;
        for(int xi=0; xi<width; ++xi, px+=nc)
        {
            // px now points to the pixel (xi, yi)
        }
    }
}

 Scan Pixels in ROI

void scan_roi_pixels(MyImage &img, int x, int y, int roi_width, int roi_height)
{
    // number of channels: nc=img.nc();
    ???????????????????????????
}
 Scan Pixels
void scan_pixels(uchar *data, int width, int height, int step, int nc)
{
    //…….
}

 Scan Pixels in ROI

void scan_roi_pixels(MyImage &img, int x, int y, int roi_width, int roi_height)
{
    // number of channels: nc=img.nc();
    scan_pixels( get_pixel(img, x, y), roi_width, roi_height, img.step, img.nc() );
}
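To see the ROI trick do real work, the scan can be given a concrete body. The sketch below sums one channel of 8-bit data, first over a whole block and then over an ROI by moving the start pointer and keeping the parent's step. The function names are hypothetical, and the row pointer is recomputed from the row index rather than incremented, which is a small variation on the loop shown on the slide.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char uchar;

// Sum channel c over a width x height block of 8-bit pixels whose rows
// are `step` bytes apart.
long sum_channel(const uchar* data, int width, int height, int step, int nc, int c)
{
    assert(c >= 0 && c < nc);
    long sum = 0;
    for (int yi = 0; yi < height; ++yi)
    {
        const uchar* row = data + static_cast<std::size_t>(yi) * step;
        for (int xi = 0; xi < width; ++xi)
            sum += row[xi * nc + c];
    }
    return sum;
}

// Same sum restricted to an ROI: start at pixel (x, y), keep the parent's step.
long sum_channel_roi(const uchar* data, int step, int nc, int c,
                     int x, int y, int roi_width, int roi_height)
{
    const uchar* start = data + static_cast<std::size_t>(y) * step
                       + static_cast<std::size_t>(x) * nc;
    return sum_channel(start, roi_width, roi_height, step, nc, c);
}
```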
OpenCV

 CV = Computer Vision
 Created by Intel and maintained by Willow Garage.
 Available for C, C++, and Python
 Cross-platform: Windows, Linux/Mac, Android, iOS
 Open source and free
 Plenty of features: more than 500 functions for image processing and computer vision
 Actively developed and updated
 …
 Google for more
OpenCV API Reference
 Introduction
 core. The Core Functionality
 imgproc. Image Processing
 highgui. High-level GUI and Media I/O
 video. Video Analysis
 calib3d. Camera Calibration and 3D Reconstruction
 features2d. 2D feature detection and matching
 objdetect. Object Detection
 ml. Machine Learning
 flann. Clustering and Search in Multi-Dimensional Spaces
 gpu. GPU-accelerated Computer Vision
 stitching. Images stitching
