
REU Student Project Abstract

Diandre White
Faculty Mentor(s): Dr. Angela Green-Miller and Dr. Vikram Adve
Using Image Segmentation to Develop a Dataset for Building Computer Vision Models
Animal welfare in the pig production industry could be improved with automated detection tools on commercial farms. One step in developing a computer vision tool for behavior recognition is tracking pigs automatically. Behavior is traditionally recorded manually: a researcher observes recordings or images of the animals and labels the behaviors. The long-term goal of this research is to create a technology that can successfully monitor and label the behavior of pigs.

For this project, a dataset of approximately 800 images of group-housed, weaned piglets was selected for image segmentation. Segmentation is the partitioning of an image into labeled regions, which can then be used to train a learning model. Using a tool called 'Robots,' the contours of each pig were drawn, and each animal was consistently labeled by number (1, 2, 3…) so that individuals could be easily identified between images. If a pig was partially blocked, only the visible portion of the animal was outlined, avoiding the occluding object. If a pig was blocked by an object or another animal in a way that could not be avoided, straight lines were drawn to connect the visible portions of the animal. Together, the annotated images composed a high-quality labeled ground-truth dataset.

This dataset can then be used to build and train a computer vision model, a tool powered by artificial intelligence that enables computers to extract specific information from digital images or video. A high-quality labeled dataset is essential because the accuracy of the annotations directly translates to the accuracy of the model trained on them. Developing the segmentation dataset further would allow each annotation to be paired with the corresponding behavior shown. While the computer vision model is very helpful, arguably its greatest pitfall is its potential to introduce bias: a model trained on annotations from a single researcher learns only one perspective on the behavior shown.
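To make the annotation scheme concrete, here is a minimal, hypothetical sketch (not the project's actual code or file format) of how one annotated frame might be represented: each pig keeps the same integer ID across images, and its visible outline is stored as a closed polygon that can be rasterized into a per-pixel ground-truth mask. The function and field names are illustrative assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: True if point (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges whose span crosses the horizontal ray from (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(annotations, width, height):
    """Turn polygon annotations into a per-pixel ID mask (0 = background)."""
    mask = [[0] * width for _ in range(height)]
    for ann in annotations:
        for row in range(height):
            for col in range(width):
                # Sample at the pixel center.
                if point_in_polygon(col + 0.5, row + 0.5, ann["polygon"]):
                    mask[row][col] = ann["pig_id"]
    return mask

# One frame: pig 1's visible contour, with any occluded stretch of the
# outline joined by straight edges, as described above.
frame = [{"pig_id": 1, "polygon": [(2, 2), (9, 2), (9, 7), (2, 7)]}]
mask = rasterize(frame, 12, 10)
```

Storing outlines as polygons keeps the annotations compact, while rasterizing them on demand produces the pixel-level masks a segmentation model trains against.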
For future research, several improvements have been proposed, including applying convolutional neural networks (CNNs) to extract spatial features and long short-term memory networks (LSTMs) to extract temporal features. Combining the two would allow spatial and temporal information to be learned simultaneously, enabling activity to be recognized across sequences of frames.
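The CNN-then-LSTM idea can be illustrated with a deliberately tiny, pure-Python sketch (an assumption for illustration, not the proposed architecture): a small convolution stands in for the CNN's spatial feature extraction on each frame, and a simple decayed running state stands in for the LSTM's temporal memory over the frame sequence.

```python
def conv2d(frame, kernel):
    """Valid-mode 2D convolution over a single-channel frame (list of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(frame), len(frame[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out

def spatial_feature(frame):
    """CNN stand-in: one edge-detecting filter, then global average pooling."""
    kernel = [[-1, 1]]  # horizontal difference filter
    fmap = conv2d(frame, kernel)
    vals = [v for r in fmap for v in r]
    return sum(vals) / len(vals)

def temporal_summary(frames, decay=0.5):
    """LSTM stand-in: exponentially decayed memory of per-frame features."""
    state = 0.0
    for frame in frames:
        state = decay * state + (1 - decay) * spatial_feature(frame)
    return state
```

A real CNN-LSTM would learn both the filters and the recurrent gates from data, but the shape of the computation is the same: a spatial summary per frame, folded into a state that carries information across time.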