Implementing OpenCV-style object detection

I am trying to implement the Viola-Jones object detection algorithm using Haar cascades (as in the OpenCV implementation) in C to detect faces. I am writing the C code in a Vivado HLS-compatible way so that I can port the implementation to an FPGA. My main goal is to learn as much as possible, not just to get it working. I would also appreciate any help improving my question.

I started by reading G. Bradski's Learning OpenCV, went through several online tutorials, and began writing code. Unsurprisingly, it does not detect faces, and I don't know why. At this point I care more about understanding my mistakes than about being able to detect faces.

My implementation steps

I'm not sure how much detail is appropriate here, but in short:

Extracting the Haar cascade data from haarcascade_frontalface_default.xml into C-readable structures (huge arrays).
Writing a function that builds the integral image of any given 8-bit grayscale image of size 24x24 (the same size specified in the cascade).
Applying what I learned from this excellent post to perform the necessary calculations.
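
To make the integral-image step concrete, here is a minimal sketch of what such a function might look like. It is an illustration, not the code from my project: the name integral_image, the WIN constant, and the inclusive convention (each entry holds the sum of all pixels above and to the left, including the pixel itself) are assumptions.

```c
#include <stdint.h>

#define WIN 24 /* window width and height, matching the cascade (assumed) */

/* Hypothetical helper: ii[r * WIN + c] holds the sum of img over
 * rows 0..r and columns 0..c, inclusive. Both arrays are row-major. */
static void integral_image(const uint8_t img[WIN * WIN], int ii[WIN * WIN])
{
    for (int r = 0; r < WIN; r++) {
        int row_sum = 0; /* running sum of the current row */
        for (int c = 0; c < WIN; c++) {
            row_sum += img[r * WIN + c];
            /* sum of everything above this row (same column) plus this row's prefix */
            ii[r * WIN + c] = row_sum + (r > 0 ? ii[(r - 1) * WIN + c] : 0);
        }
    }
}
```

With this convention the last entry, ii[WIN * WIN - 1], equals the sum of all pixels in the window, which is an easy sanity check against a known image.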

My test setup

A Python script that detects faces using the OpenCV library with the same Haar cascade as above produces the golden data: each detected face is cropped from the image (ensuring a size of 24x24) and saved.
The saved images are converted to one-dimensional C arrays containing the pixel values row by row: img = {row0col0, row0col1, row1col0, row1col1, ... }
The integral image is computed and face detection is applied.
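
The row-by-row layout described above can be sketched as follows. The helper names flatten_row_major and pixel_at are hypothetical and only illustrate the indexing rule index = WIN * row + column; they are not part of my conversion tooling.

```c
#include <stdint.h>

#define WIN 24 /* window width and height (assumed) */

/* Hypothetical helper: copy a 2-D image into a 1-D array, row by row. */
static void flatten_row_major(uint8_t img2d[WIN][WIN], uint8_t flat[WIN * WIN])
{
    for (int row = 0; row < WIN; row++)
        for (int col = 0; col < WIN; col++)
            flat[row * WIN + col] = img2d[row][col];
}

/* Hypothetical helper: read back the pixel at (row, col) from the flat array. */
static uint8_t pixel_at(const uint8_t flat[WIN * WIN], int row, int col)
{
    return flat[row * WIN + col];
}
```

Under this layout the element right after row0col23 is row1col0, so a rectangle's x coordinate maps to the column and its y coordinate to the row.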

Result

The faces pass only 6 of the 25 stages of the Haar cascade and are therefore not detected by my implementation, even though I know they should be, since the Python script using OpenCV and the same Haar cascade did detect them.

My code

 /*
 * This is detectFace.c
 */

#include <stdio.h>
#include "detectFace.h"

// define constants based on Haar cascade in use
// Each feature is made of max 3 rects
//#define FEAT_NO 1     // max no. of features (= 2912 for face_default.xml)
#define RECTS_IN_FEAT 3 // max no. of rect's per feature
//#define INTS_IN_RECT 5    // no. of int's needed to describe a rect
// each node has one feature (bijective relation) and three doubles
#define STAGE_NO 25 // no. of stages
#define NODE_NO 211 // no of nodes per stage, corresponds to FEAT_NO since each Node has always one feature in haarcascade_frontalface_default.xml
//#define ELMNT_IN_NODE 3   // no. of doubles needed to describe a node

// constants for frame size
#define WIN_WIDTH 24 // width = height =24

//int detectFace(int features[FEAT_NO][RECTS_IN_FEAT][INTS_IN_RECT], double stages[STAGE_NO][NODE_NO][ELMNT_IN_NODE], double stageThresh[STAGE_NO], int ii[24][24]){
int detectFace(
    int ii[576],
    int stageNum,
    int stageOrga[25],
    float stageThresholds[25],
    float nodes[8739],
    int featOrga[2913],
    int rectangles[6383][5])
{
    int passedStages = 0; // number of stages passed in this run
    int faceDetected = 0; // turns to 1 if a face is detected and to 0 if it's not detected
    // Debug:
    int nodesUsed = 0; // number of floats out of nodes[] processed, use to skip to the unprocessed floats
    int rectsUsed = 0; // number of rects processed
    int droppedInStage0 = 0;

    // loop through all stages
    int i;
detectFace_label1:
    for (i = 0; i < STAGE_NO; i++)
    {
        double tmp = 0.0;           //variable to accumulate node-values, to then compare to stage threshold
        int nodeNum = stageOrga[i]; // get number of nodes for this stage from stageOrga using stage index i
        // loop through nodes inside each stage
        // NOTE: it is assumed that each node maps to one corresponding feature. Ex: node[0] has feat[0] and node[1] has feat[1]
        // because this is how it is written in the haarcascade_frontalface_default.xml
        int j;
    detectFace_label0:
        for (j = 0; j < NODE_NO; j++)
        {
            // a node is defined by 3 values:
            double nodeThresh = nodes[nodesUsed]; // the first value is the node threshold
            double lValue = nodes[nodesUsed + 1]; // the second value is the left value
            double rValue = nodes[nodesUsed + 2]; // the third value is the right value
            int sum = 0;                          // contains the weighted value of rectangles in one Haar feature
            // loop through rect's in a feature, some have 2 and some have 3 rect's.
            // Each node always refers to one feature in a way that node0 maps to feature0 and node1 to feature1 (The XML file is build like that)
            //int rectNum = featOrga[j]; // get number of rects for current feature using current node index j
            int k;
        detectFace_label2:
            for (k = 0; k < RECTS_IN_FEAT; k++)
            {
                int x = 0, y = 0, width = 0, height = 0, weight = 0, coordUpL = 0, coordUpR = 0, coordDownL = 0, coordDownR = 0;

                // a rect is defined by 5 values:
                x = rectangles[rectsUsed][0];      // the first value is the x coordinate of the top left corner pixel
                y = rectangles[rectsUsed][1];      // the second value is the y coordinate of the top left corner pixel
                width = rectangles[rectsUsed][2];  // the third value is the width of the current rectangle
                height = rectangles[rectsUsed][3]; // the fourth value is the height of this rectangle
                weight = rectangles[rectsUsed][4]; // the fifth value is the weight of this rectangle

                // calculating 1-Dim index for points of interest. Formula: index = width * row + column, assuming values are stored in row order
                coordUpL = ((WIN_WIDTH * y) - WIN_WIDTH) + (x - 1);
                coordUpR = coordUpL + width;
                coordDownL = coordUpL + (height * WIN_WIDTH);
                coordDownR = coordDownL + width;

                // calculate the area sum according to Viola-Jones
                //sum += (ii[x][y] + ii[x+width][y+height] - ii[x][y+height] - ii[x+width][y]) * weight;
                sum += (ii[coordUpL] + ii[coordDownR] - ii[coordUpR] - ii[coordDownL]) * weight;
                // Debug: counting the number of actual rectangles used
                rectsUsed++; //
            }
            // decide whether the result of the feature calculation reaches the node threshold
            if (sum < nodeThresh)
            {
                tmp += lValue; // add left value to tmp if node threshold was not reached
            }
            else
            {
                tmp += rValue; // add right value to tmp if node threshold was reached
            }
            nodesUsed = nodesUsed + 3; // one node is processed, increase nodesUsed by number of floats needed to represent a node (3)
        }
        //########  at this point we went through each node in the current stage #######
        // check if threshold of current stage was reached
        if (tmp < stageThresholds[i])
        {
            faceDetected = 0; // if any stage threshold is not reached the operation is done and no face is present
            // Debug: show in which stage the frame was dropped
            printf("Face detection failed in stage %d \n", i);
            //i = stageNum;         // breaks out this loop, because i is supposed to stay smaller than STAGE_NO
        }
        else
        {
            passedStages++; // stage threshold is reached, therefore passedStages will count up
        }
    }
    //########  at this point we went through all stages ###############################
    //----------------------------------------------------------------------------------
    // if the number of passed stages reaches the total number of stages, a face is detected
    if (passedStages == stageNum)
    {
        faceDetected = 1; // one symbolizes that the input is a face
    }
    else
    {
        faceDetected = 0; // zero symbolizes that the input is not a face
    }
    return faceDetected;
}
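
One spot in the code above that I am unsure about is the corner lookup: coordUpL addresses the pixel at (x - 1, y - 1), which falls outside the 24x24 array whenever a rectangle starts at x = 0 or y = 0. A common way around this is an integral image padded with a zero row and a zero column. The sketch below shows that variant under my own assumptions (the names padded_integral and rect_sum and the 25x25 layout are illustrative); I have not verified that it matches what OpenCV does internally.

```c
#include <stdint.h>

#define WIN 24        /* window width and height (assumed) */
#define PAD (WIN + 1) /* padded integral image is 25x25 */

/* Hypothetical helper: iip[r * PAD + c] holds the sum of img over
 * rows 0..r-1 and columns 0..c-1. Row 0 and column 0 are all zeros. */
static void padded_integral(const uint8_t img[WIN * WIN], int iip[PAD * PAD])
{
    for (int c = 0; c < PAD; c++)
        iip[c] = 0; /* zero top row */
    for (int r = 1; r < PAD; r++) {
        int row_sum = 0;
        iip[r * PAD] = 0; /* zero left column */
        for (int c = 1; c < PAD; c++) {
            row_sum += img[(r - 1) * WIN + (c - 1)];
            iip[r * PAD + c] = iip[(r - 1) * PAD + c] + row_sum;
        }
    }
}

/* Sum of the pixels in the rectangle with top-left corner (x, y),
 * width w and height h. No index is negative, even for x = 0 or y = 0. */
static int rect_sum(const int iip[PAD * PAD], int x, int y, int w, int h)
{
    int up_l = y * PAD + x;
    int up_r = up_l + w;
    int down_l = up_l + h * PAD;
    int down_r = down_l + w;
    return iip[up_l] + iip[down_r] - iip[up_r] - iip[down_l];
}
```

For an image of all ones, rect_sum should return w * h for every rectangle inside the window, which makes it easy to test the indexing separately from the cascade data.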

c opencv haar-classifier vivado-hls

Human, 13.09.2017
