
How do I get the bounding box coordinates in YOLO object detection?


I need to get the coordinates of the bounding boxes that YOLO object detection draws on the output image.


A quick solution is to modify the image.c file so that it prints the bounding box information:

if(bot > im.h-1) bot = im.h-1;

// Print bounding box values 
printf("Bounding Box: Left=%d, Top=%d, Right=%d, Bottom=%d\n", left, top, right, bot); 
draw_box_width(im, left, top, right, bot, width, red, green, blue);
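
Once darknet prints these lines, a script can read them back from the program's output. The following is only a minimal Python sketch, assuming the exact `Bounding Box: Left=..., Top=..., Right=..., Bottom=...` format of the printf above; the name `parse_bounding_boxes` and the sample text are made up for illustration.

```python
import re

# Matches the line produced by the printf added to image.c above.
BOX_PATTERN = re.compile(
    r"Bounding Box: Left=(-?\d+), Top=(-?\d+), Right=(-?\d+), Bottom=(-?\d+)"
)

def parse_bounding_boxes(output_text):
    """Return a list of (left, top, right, bottom) integer tuples found in the text."""
    return [tuple(int(v) for v in m.groups()) for m in BOX_PATTERN.finditer(output_text)]

# Hypothetical captured output; only the "Bounding Box" line format comes from the printf.
sample = "dog: 99%\nBounding Box: Left=10, Top=20, Right=110, Bottom=220\n"
print(parse_bounding_boxes(sample))
```

Lines that do not match the pattern are simply ignored, so the rest of darknet's output can stay in the captured text.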
Brian O'Donnell

There is a nice little Python program (written for Python 2, but it works in Python 3 with small modifications: just change the print statements and turn the strings in main into byte strings) that you can use in the main repository: https://github.com/pjreddie/darknet/blob/master/python/darknet.py

NOTE! The coordinates it returns are the center point of the box, plus its width and height.
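
If you need the corners instead, the conversion is simple arithmetic. A minimal sketch, assuming the box comes as (center x, center y, width, height) as noted above; `center_to_corners` is a hypothetical helper, not part of darknet.py.

```python
def center_to_corners(x_center, y_center, width, height):
    """Convert a (center x, center y, width, height) box to (left, top, right, bottom)."""
    left = x_center - width / 2
    top = y_center - height / 2
    return (left, top, left + width, top + height)

print(center_to_corners(100, 50, 40, 20))  # (80.0, 40.0, 120.0, 60.0)
```

The result is left as floats here; round it if you need integer pixel indices.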


For Python users on Windows:

First, do some setup work:

  1. Add the path of your darknet folder to the Python path in the environment variables:


  2. Add PYTHONPATH to the Path value by appending:


  3. Edit the coco.data file in the cfg folder, changing the names variable to the path of your coco.names file; in my case:

    names = D:/core/darknetAB/data/coco.names

With this setup, you can call darknet.py (from the alexeyAB\darknet repository) as a Python module from any folder.
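
To check that the setup worked before running a detection, you can ask Python whether it can find the module. This is just a sketch using the standard library; `module_is_importable` and its `extra_dir` parameter are hypothetical names, and adding a directory to `sys.path` only affects the current process, unlike the environment variables above.

```python
import importlib.util
import sys

def module_is_importable(name, extra_dir=None):
    """Return True if the top-level module `name` can be found on sys.path.

    If extra_dir is given, it is appended to sys.path first (current process only).
    """
    if extra_dir is not None and extra_dir not in sys.path:
        sys.path.append(extra_dir)
    return importlib.util.find_spec(name) is not None

# True only if the darknet folder is on the Python path.
print(module_is_importable("darknet"))
```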

Start scripting:

from darknet import performDetect as scan  # performDetect is defined in AlexeyAB's darknet.py

def detect(picpath):
    '''Use this if you only want the coordinates.'''
    cfg = 'D:/core/darknetAB/cfg/yolov3.cfg'  # change this to use a different config
    coco = 'D:/core/darknetAB/cfg/coco.data'  # you can change this too
    weights = 'D:/core/darknetAB/yolov3.weights'  # and this one as well
    # Default call; I prefer to get only the results without producing an image, for better performance
    test = scan(imagePath=picpath, thresh=0.25, configPath=cfg, weightPath=weights, metaPath=coco, showImage=False, makeImageOnly=False, initOnly=False)

    # At this point test holds the default AlexeyAB result; see help(scan).
    # Its format is [(item_name, confidence_rate, (x_center, y_center, box_width, box_height))].
    # To convert it to the corner form commonly used by PIL/OpenCV (still inside detect):

    newdata = []
    if len(test) >= 2:
        for x in test:
            item, confidence_rate, imagedata = x
            x1, y1, w_size, h_size = imagedata
            x_start = round(x1 - (w_size/2))
            y_start = round(y1 - (h_size/2))
            x_end = round(x_start + w_size)
            y_end = round(y_start + h_size)
            data = (item, confidence_rate, (x_start, y_start, x_end, y_end), w_size, h_size)
            newdata.append(data)

    elif len(test) == 1:
        item, confidence_rate, imagedata = test[0]
        x1, y1, w_size, h_size = imagedata
        x_start = round(x1 - (w_size/2))
        y_start = round(y1 - (h_size/2))
        x_end = round(x_start + w_size)
        y_end = round(y_start + h_size)
        data = (item, confidence_rate, (x_start, y_start, x_end, y_end), w_size, h_size)
        newdata.append(data)

    else:
        newdata = False  # nothing was detected

    return newdata

How to use it:

table = 'D:/test/image/test1.jpg'
checking = detect(table)

To get the coordinates:

If there is only one result:

x1, y1, x2, y2 = checking[0][2]

If there are multiple results:

for x in checking:
    item = x[0]
    x1, y1, x2, y2 = x[2]
    print(x1, y1, x2, y2)
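
A box that extends past the image border can end up with negative or out-of-range corners after this conversion, which matters if you later crop with PIL or OpenCV. A minimal sketch of clipping the corners to the image, in the same spirit as the `if(bot > im.h-1) bot = im.h-1;` check in image.c; `clamp_box` is a hypothetical helper and the image size must be supplied by you.

```python
def clamp_box(box, image_width, image_height):
    """Clip (x1, y1, x2, y2) so that every corner lies inside the image."""
    x1, y1, x2, y2 = box
    x1 = max(0, min(x1, image_width - 1))
    y1 = max(0, min(y1, image_height - 1))
    x2 = max(0, min(x2, image_width - 1))
    y2 = max(0, min(y2, image_height - 1))
    return (x1, y1, x2, y2)

print(clamp_box((-5, 10, 700, 300), 640, 480))  # (0, 10, 639, 300)
```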
Wahyu Bram

If you are going to implement this in Python, there is a small Python wrapper, pyyolo, that I created. Follow the ReadMe file to install it; installation is very easy.

Then follow the example code to see how to detect objects.
If your detection is det:

top_left_x = det.bbox.x
top_left_y = det.bbox.y
width = det.bbox.w
height = det.bbox.h

If you need it, you can get the midpoint with:

mid_x, mid_y = det.bbox.get_point(pyyolo.BBox.Location.MID)
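
For reference, the arithmetic behind these values is shown below as a plain Python sketch. It does not call pyyolo; it assumes that `bbox.x` and `bbox.y` are the top-left corner, as the variable names above indicate, and `box_points` is a made-up helper.

```python
def box_points(top_left_x, top_left_y, width, height):
    """Return the midpoint and bottom-right corner of a top-left/width/height box."""
    mid = (top_left_x + width / 2, top_left_y + height / 2)
    bottom_right = (top_left_x + width, top_left_y + height)
    return mid, bottom_right

mid, bottom_right = box_points(10, 20, 100, 50)
print(mid, bottom_right)  # (60.0, 45.0) (110, 70)
```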

Hope this helps.


Inspired by @Wahyu's answer above. This version has a few changes, modifications and bug fixes, and was tested with both single-object and multiple-object detection.

# calling the 'performDetect' function from darknet.py
from darknet import performDetect as scan
import math

def detect(img_path):
    '''Use this if you only want the coordinates.'''
    picpath = img_path
    # change this to use a different config
    cfg = '/home/saggi/Documents/saggi/prabin/darknet/cfg/yolo-obj.cfg'
    coco = '/home/saggi/Documents/saggi/prabin/darknet/obj.data'  # you can change this too
    # and this one as well
    weights = '/home/saggi/Documents/saggi/prabin/darknet/backup/yolo-obj_last.weights'
    # default call; only the results are requested, without producing an image, for better performance
    test = scan(imagePath=picpath, thresh=0.25, configPath=cfg, weightPath=weights, metaPath=coco, showImage=False, makeImageOnly=False,
                initOnly=False)

    # At this point test holds the default AlexeyAB result; see help(scan).
    # Its format is [(item_name, confidence_rate, (x_center, y_center, box_width, box_height))].
    # To convert it to the corner form commonly used by PIL/OpenCV (still inside detect):

    newdata = []

    # For multiple detections
    if len(test) >= 2:
        for x in test:
            item, confidence_rate, imagedata = x
            x1, y1, w_size, h_size = imagedata
            x_start = round(x1 - (w_size/2))
            y_start = round(y1 - (h_size/2))
            x_end = round(x_start + w_size)
            y_end = round(y_start + h_size)
            data = (item, confidence_rate,
                    (x_start, y_start, x_end, y_end), (w_size, h_size))
            newdata.append(data)

    # For a single detection
    elif len(test) == 1:
        item, confidence_rate, imagedata = test[0]
        x1, y1, w_size, h_size = imagedata
        x_start = round(x1 - (w_size/2))
        y_start = round(y1 - (h_size/2))
        x_end = round(x_start + w_size)
        y_end = round(y_start + h_size)
        data = (item, confidence_rate,
                (x_start, y_start, x_end, y_end), (w_size, h_size))
        newdata.append(data)

    # Nothing detected
    else:
        newdata = False

    return newdata

if __name__ == "__main__":
    # Multiple detection image test
    # table = '/home/saggi/Documents/saggi/prabin/darknet/data/26.jpg'
    # Single detection image test
    table = '/home/saggi/Documents/saggi/prabin/darknet/data/1.jpg'
    detections = detect(table)

    # Multiple detection
    if detections and len(detections) > 1:
        for detection in detections:
            print(' ')
            print(' ')
            print('All Parameter of Detection: ', detection)

            print(' ')
            print(' ')
            print('Detected label: ', detection[0])

            print(' ')
            print(' ')
            print('Detected object Confidence: ', detection[1])

            x1, y1, x2, y2 = detection[2]
            print(' ')
            print(' ')
            print(
                'Detected object top left and bottom right coordinates (x1,y1,x2,y2):  x1, y1, x2, y2')
            print('x1: ', x1)
            print('y1: ', y1)
            print('x2: ', x2)
            print('y2: ', y2)

            print(' ')
            print(' ')
            print('Detected object width and height: ', detection[3])
            b_width, b_height = detection[3]
            print('Width of bounding box: ', math.ceil(b_width))
            print('Height of bounding box: ', math.ceil(b_height))
            print(' ')

    # Single detection
    elif detections:
        print(' ')
        print(' ')
        print('All Parameter of Detection: ', detections)

        print(' ')
        print(' ')
        print('Detected label: ', detections[0][0])

        print(' ')
        print(' ')
        print('Detected object Confidence: ', detections[0][1])

        x1, y1, x2, y2 = detections[0][2]
        print(' ')
        print(' ')
        print(
            'Detected object top left and bottom right coordinates (x1,y1,x2,y2):  x1, y1, x2, y2')
        print('x1: ', x1)
        print('y1: ', y1)
        print('x2: ', x2)
        print('y2: ', y2)

        print(' ')
        print(' ')
        print('Detected object width and height: ', detections[0][3])
        b_width, b_height = detections[0][3]
        print('Width of bounding box: ', math.ceil(b_width))
        print('Height of bounding box: ', math.ceil(b_height))
        print(' ')

# Single detections output:
# test value  [('movie_name', 0.9223029017448425, (206.79859924316406, 245.4672393798828, 384.83673095703125, 72.8630142211914))]

# Multiple detections output:
# test value  [('movie_name', 0.9225175976753235, (92.47076416015625, 224.9121551513672, 147.2491912841797, 42.063255310058594)),
#  ('movie_name', 0.4900225102901459, (90.5261459350586, 12.4061279296875, 182.5990447998047, 21.261077880859375))]
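
The second box in the multiple-detection output above has a confidence of only about 0.49, so you may want to drop weak detections before using the coordinates. A minimal sketch working on the raw result format shown above; `filter_detections` and the 0.5 cut-off are assumptions for illustration, not part of darknet.py.

```python
def filter_detections(detections, min_confidence=0.5):
    """Keep detections at or above min_confidence, highest confidence first."""
    kept = [d for d in detections if d[1] >= min_confidence]
    return sorted(kept, key=lambda d: d[1], reverse=True)

# The multiple-detection sample output quoted above.
sample = [
    ('movie_name', 0.9225175976753235,
     (92.47076416015625, 224.9121551513672, 147.2491912841797, 42.063255310058594)),
    ('movie_name', 0.4900225102901459,
     (90.5261459350586, 12.4061279296875, 182.5990447998047, 21.261077880859375)),
]

for label, confidence, box in filter_detections(sample):
    print(label, round(confidence, 2), box)
```

With the 0.5 cut-off only the first detection is kept; passing a lower `min_confidence` keeps both.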
Saugat Bhattarai