A method for quantitative analysis and pattern visualization of eye blinking using high-frame-rate video

The proposed analysis method not only calculates the various parameters measured in previous studies that analyzed eye blinks using bio-signals, but also quantitatively analyzes eyelid movements during blinking that were difficult to measure with bio-signals.

One approach to locating the eyes and eyelids with high accuracy is to use pattern recognition or deep learning, but these methods require considerable computation time. Recently, studies have analyzed eye blinking in 30 fps videos using deep learning [18]. Nevertheless, processing high-frame-rate videos exceeding 200 fps with deep learning remains challenging, so the eye blink datasets used in research are generally short videos captured at 30 fps [19]. This makes long-term continuous analysis difficult and limits the expansion of application areas.

In this study, we propose an analysis method suited to high-frame-rate video that quantitatively evaluates eyelid movements with significantly fewer computations than deep learning approaches.

This experiment was performed with approval and supervision by the institutional review board (IRB) of Seoul National University Hospital (IRB approval number: 1810-112-982), and the subject gave informed consent to their inclusion in the study as required.

2.1 High-frame-rate videos for eye blink analysis

For precise eye blink analysis, a higher video frame rate is better. However, an excessively high frame rate acquires more information than necessary and requires a long computation time for analysis. In the present study, 240 fps videos were mainly used; this frame rate was sufficient for eye blink analysis because it is similar to the sampling frequency commonly used in EOG analysis [10]. It was also twice the illumination flicker frequency and four times the power-line frequency at the site where the experiment was conducted. Using a frame rate that is a multiple of the illumination flicker frequency makes the flicker effect easy to remove, simplifying image processing and analysis.

The videos used for eye blink analysis were acquired using a mobile phone (SM-G965X S9+, Samsung Electronics, Korea) and a low-cost high-frame-rate camera (EX-ZR200, Casio, Japan). Both cameras can acquire 240 fps video for an extended period, but their resolutions differ: the resolution of the mobile phone was full HD (1920 × 1080 pixels), and that of the camera was 512 × 384 pixels.

In this experiment, additional equipment that could affect eye blinking was excluded as far as possible. The experimental setup for high-frame-rate video consists simply of a camera and a monitor, with the camera mounted above the monitor. While an examinee views a video clip on the monitor, the camera captures the examinee’s face and the surrounding area. This setup, consisting of only simple equipment, can be easily applied in a variety of environments, such as driving, working on a computer, or watching TV.

2.2 Extraction of eye blinking from the video

The high-frame-rate video clips contain a great deal of information, so for efficient analysis the eye blink sequences must first be extracted from the entire video clip. To extract the eye blink sequences, the color video frames were first converted into gray-scale images. The average pixel intensity calculated from the gray-scale images varies continuously during blinking due to the color differences among the skin, eyeball, and pupil. This intensity profile exhibits a periodic change corresponding to each eye blink.
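As a minimal sketch of this step, each color frame can be converted to gray scale and reduced to a per-frame mean intensity (assuming the frames arrive as RGB NumPy arrays; the function name is illustrative, and standard luma weights are used for the conversion):

```python
import numpy as np

def intensity_profile(frames):
    """Mean gray-scale intensity for each frame of a blink video.

    Each color frame (H x W x 3, RGB) is converted to gray scale with
    standard luma weights and reduced to its mean pixel intensity; the
    resulting 1-D profile dips periodically at each eye blink.
    """
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights
    return np.array([(frame @ weights).mean() for frame in frames])
```

Scanning this profile for its periodic dips yields the candidate blink intervals used in the extraction step.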

Event signals were additionally generated from the gray-scale images. To simulate the event signals, each pixel stored a reference brightness level and continuously compared it to the current brightness level; if the difference exceeded a threshold, that pixel reset its reference level and generated an event. The number of dynamic vision events changes twice for each eye blink (Fig. 1b): the first change represents the eye-closing process and the second the eye-opening process. In general, eye-closing movements are more intense and shorter than eye-opening movements, so the intensity profiles and dynamic event signals are asymmetric.
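The per-pixel event simulation described above can be sketched as follows (a minimal sketch assuming a stack of gray-scale frames as a NumPy array; the threshold value and function name are illustrative, not values from the paper):

```python
import numpy as np

def simulate_events(gray_frames, threshold=15):
    """Simulate dynamic-vision event counts from gray-scale frames.

    Each pixel keeps a reference brightness; when the current brightness
    differs from it by more than `threshold`, the pixel emits an event
    and resets its reference. Returns the number of events per frame.
    """
    gray_frames = np.asarray(gray_frames, dtype=np.int16)
    reference = gray_frames[0].copy()
    counts = [0]  # the first frame only initializes the references
    for frame in gray_frames[1:]:
        fired = np.abs(frame - reference) > threshold
        reference[fired] = frame[fired]  # reset reference where events fired
        counts.append(int(fired.sum()))
    return np.array(counts)
```

Plotting these counts over time produces the two-burst signature per blink shown in Fig. 1b.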

Using these characteristics, eye blinking in the entire video can be estimated by finding the values of “x1 ~ x3” and calculating “y2 − y1” and “y2 − y3”. If the estimated blink duration, “x3 − x1”, is shorter than 50 ms or longer than 1500 ms, it is considered non-blinking behavior, such as flutter or micro-sleep, and is excluded from the analysis. Likewise, if the difference between “y2 − y1” and “y2 − y3” is large (> 20%), the signal change is assumed to be due to a change in the environment and is also excluded. Then, all frames corresponding to each eye blink are extracted from the entire video, and each blink is saved as an individual video clip. Basic parameters such as blinking frequency and duration can be calculated from this extraction step.
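The two rejection rules can be sketched as a simple validity check (the text does not state what the 20% difference is relative to, so the larger amplitude is assumed here; the function name is illustrative):

```python
def is_valid_blink(x1, x2, x3, y1, y2, y3, fps=240):
    """Apply the rejection rules to one candidate blink.

    x1..x3 are the frame indices of blink start, peak, and end; y1..y3
    the corresponding signal values. A candidate is kept only if its
    duration lies within 50-1500 ms and its rise (y2 - y1) and fall
    (y2 - y3) amplitudes agree within 20%.
    """
    duration_ms = (x3 - x1) / fps * 1000.0
    if duration_ms < 50 or duration_ms > 1500:
        return False  # flutter or micro-sleep, not a normal blink
    rise, fall = y2 - y1, y2 - y3
    # Assumption: the 20% tolerance is taken relative to the larger amplitude.
    if abs(rise - fall) > 0.2 * max(abs(rise), abs(fall)):
        return False  # asymmetric change, likely an environmental effect
    return True
```

Candidates that pass both checks are then cut out of the full video and saved as individual blink clips.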

This process allows calculation of the blinking rate and duration, which have been the main indicators in most previous blink-analysis studies regardless of the measurement method [9-13].

Fig. 1

Ideal signal changes during eye blinking and the results of extracting eye blink sequences. a The ideal gray-scale intensity signal during eye blinking, b the ideal number of dynamic vision events during eye blinking, c the gray-scale intensity graph during eye blinking in a real video clip with extraction results, d the number of dynamic vision events during eye blinking in a real video clip with extraction results

2.3 Visualization of eye blinking pattern

The eye blinking rate and blink duration indicate only the overall blinking tendency and do not provide detailed information about the pattern of each blink. For a more detailed analysis, this study aimed to precisely analyze eyelid movements during blinking by finding the position and shape of the eyelids in the extracted eye blink video clips, dividing the blink sequence according to eyelid movement, and classifying the blink patterns. Furthermore, a method was proposed to visualize the pattern of eye blinking as a single image so that the movement of the eyelids can be easily recognized.

2.3.1 Evaluation of the shape and position of the upper eyelid

To visualize eye blinking patterns, the position and shape of the eyelid must be estimated from all frames of each blink. The first step is to extract the region of interest (ROI) around the eye from the full frame. To achieve this, all color frames of the extracted eye blink video clip are converted to binary images using a brightness threshold, which is the local minimum point in the histogram of the first frame. Then, the differences between consecutive frames are calculated to generate differential images, and the generated differential images are accumulated into a single dynamic vision image, as shown in Fig. 2a. The dynamic vision image represents the frequency with which pixels change during a blink, with brighter pixels indicating more frequent changes. A rectangle surrounding the brightest blob in the dynamic vision image is designated as the ROI; the brightest blob represents the area that changes most frequently during blinking, i.e., the eye.
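The accumulation and ROI steps can be sketched as follows, assuming the binarized frames are stacked in a NumPy array (the quantile cut-off used to isolate the brightest blob is an illustrative choice, not a value from the paper):

```python
import numpy as np

def dynamic_vision_image(binary_frames):
    """Accumulate frame-to-frame differences of the binarized frames.

    Brighter pixels in the returned image changed more often during the
    blink, so the brightest blob marks the eye region.
    """
    frames = np.asarray(binary_frames, dtype=np.int16)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)

def roi_from_dynamic_image(dyn, quantile=0.9):
    """Bounding box (top, bottom, left, right) of the most frequently
    changing pixels. The quantile cut-off is an illustrative choice."""
    hot = dyn >= np.quantile(dyn[dyn > 0], quantile)
    ys, xs = np.nonzero(hot)
    return ys.min(), ys.max(), xs.min(), xs.max()
```

The returned bounding box is then used to crop every frame of the blink clip to the eye region.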

From the image cropped at the ROI location in the full frame, the upper eyelid is roughly estimated by finding the uppermost black pixel of each column. This rough estimate may include outlier points that are not on the eyelid, so the outliers must be removed to obtain an accurate detection result.

An iterative outlier-removal algorithm, a modified version of the Random Sample Consensus (RANSAC) algorithm, is used to remove the outlier points. RANSAC is an iterative method that estimates model parameters by removing outliers through repeated random sampling from a set of observations containing outliers [20, 21]. The basic RANSAC algorithm has the disadvantage of long computation time due to its iterative operations [22]. To reduce computation time, instead of performing random sampling from the start, we iteratively remove outlier points whose positions differ significantly from adjacent pixels by applying a polynomial curve-fitting algorithm to every pixel.

After discarding outliers through the iterative operations, the upper eyelid is estimated by interpolating the eyelid position of empty columns from nearby pixels. Table 1 describes the removal algorithm [23], and the process and result of each step are shown in Fig. 2.

Table 1 Iterative outlier removal algorithm for eyelid estimation

Fig. 2

Process of evaluating the shape and position of the upper eyelid. a Event-data accumulation image during eye blinking, b cropped ROI image, c binarization of (b), d estimated upper edge with outliers, e result of polynomial curve fitting without outliers, f result image

2.3.2 Visualization graph of eye blinking

It is very difficult to compare and analyze all the eyelid positions and shapes evaluated from each frame. Therefore, a method of capturing all eyelid changes during a blink is needed for easier and more detailed analysis of blinking patterns. In this study, a method for visualizing eye blinking patterns is proposed: a three-dimensional graph representing eyelid changes during the blink is plotted, and a single image is produced by projecting the graph.

To generate a matrix for visualization, the horizontal (x-axis) domain size of the evaluated eyelid data from each frame is made constant. The domain size is set to the mode of the pixel lengths of the eyelids estimated in each frame. If the estimated eyelid length in a frame is longer than the domain length, some data on both sides are removed; conversely, if it is shorter, the empty positions are estimated using a polynomial curve equation. The resized data are then recorded in one matrix. For easy comparison between blinks, the horizontal length of all matrices is resized to 500 pixels, and the vertical values are normalized so that the y-axis displacement of the first frame spans 0.2 to 0.7.
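The per-frame resizing to a fixed domain length can be sketched as follows (the polynomial degree and function name are illustrative choices, not values from the paper):

```python
import numpy as np

def resize_eyelid_curve(ys, domain_len):
    """Force one frame's eyelid curve to a fixed horizontal length.

    Longer curves are trimmed evenly on both sides; shorter ones are
    extended by extrapolating a fitted polynomial over the empty
    positions, as described in the text.
    """
    ys = np.asarray(ys, dtype=float)
    n = len(ys)
    if n >= domain_len:
        # Trim the surplus symmetrically from both ends.
        left = (n - domain_len) // 2
        return ys[left:left + domain_len]
    # Fit a polynomial to the measured part, then extrapolate outward.
    coeffs = np.polyfit(np.arange(n), ys, deg=min(3, n - 1))
    left = (domain_len - n) // 2
    out = np.polyval(coeffs, np.arange(domain_len) - left)
    out[left:left + n] = ys  # keep the measured data untouched
    return out
```

Stacking the resized curves of all frames row by row yields the matrix that is later normalized and plotted.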

The normalized matrix is plotted as a three-dimensional graph rendered with a jet color map. The x-axis of the graph shows the change over time, and the YZ plane represents the shape of the upper eyelid, so the graph shows how the shape of the evaluated upper eyelid changes over time during a blink. These changes can be recognized even more easily in a projection onto the two-dimensional XY plane. Red dotted lines are added to the projection image at 0.1 s intervals to ease comparison of durations between different blink images. The pattern of a single eye blink can then be analyzed from the color changes of the graph (Fig. 3).

Fig. 3

Visualization graph of an eye blink. a Three-dimensional graph, b projection image

2.3.3 Eyelid displacement graph

The intensity profile shown in Fig. 1 does not accurately represent the displacement of the eyelid movement. To analyze eyelid movements accurately, a displacement graph of the eyelid was generated by collecting the center-position data extracted from the matrix created for visualization. According to the change of the displacement graph, one blink cycle is divided into three phases: ‘Closing phase’, ‘Complete Closed phase’, and ‘Opening phase’ [24]. The ‘Complete Closed phase’ is defined as the section corresponding to the lower 10% (less than 0.25) of the initial eyelid displacement. The section before the ‘Complete Closed phase’ is defined as the ‘Closing phase’, and the section after it as the ‘Opening phase’. However, if the minimum displacement of the eyelid position graph is greater than 0.3, the blink is considered to have no ‘Complete Closed phase’. This case is defined as ‘Incomplete blinking’, meaning the eyes are not completely closed during the blink. Incomplete blinking has been cited as one of the causes of dry eye, and studies have analyzed it [13]. However, it has been very difficult to measure the eyelid position from signal graphs of bio-signal or pixel-intensity changes, so it has been impossible to accurately define and analyze this blinking pattern.
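The phase division can be sketched as follows, using the thresholds quoted in the text (0.25 for the complete-closed cut-off on the normalized trace, 0.3 for incomplete blinks); the dictionary layout and function name are illustrative choices:

```python
import numpy as np

def divide_phases(disp, closed_threshold=0.25, incomplete_min=0.3):
    """Split one normalized eyelid-displacement trace into phases.

    'Complete Closed' covers the frames at or below `closed_threshold`;
    if the trace never drops to `incomplete_min` or below, the blink is
    labeled incomplete. Returns index ranges for the three phases.
    """
    disp = np.asarray(disp, dtype=float)
    if disp.min() > incomplete_min:
        return {"incomplete": True}  # eyes never fully closed
    closed = np.nonzero(disp <= closed_threshold)[0]
    if closed.size == 0:
        # The text leaves the 0.25-0.3 band ambiguous; treated as
        # incomplete here as a conservative assumption.
        return {"incomplete": True}
    return {
        "incomplete": False,
        "closing": (0, closed[0]),                # frames before full closure
        "complete_closed": (closed[0], closed[-1] + 1),
        "opening": (closed[-1] + 1, len(disp)),   # frames after reopening starts
    }
```

The phase boundaries returned here are what the duration, closed-time, and speed parameters below are computed from.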

Various parameters of eye blinking can be calculated from the displacement graph, including blinking duration, closed time, closing and opening speed, ratio of the complete closed phase, and total displacement of the upper eyelid (Fig. 4).

Fig. 4

Displacement of eyelid position and selected ‘Complete Closed phase’
