Deep learning methods for solving non-uniqueness of inverse design in photonics

Device design is a significant area of research in photonics. As optical components become more highly integrated and functionally complex, primitive design methods based on intuition and prior knowledge can no longer keep pace [1,2]. Today, supported by high computational power, iterative optimization algorithms such as genetic algorithms, particle swarm optimization, and adjoint-gradient methods can efficiently search the design space guided by target responses [3–6]. However, iterative computation becomes time-consuming as the dimensionality of the design space increases. Moreover, iterative algorithms do not yield an intrinsic mapping between the target response and the physical parameters, so many simulations must be repeated even for similar optical responses. Fortunately, emerging neural networks are able to learn complex functions [7–10], and a well-trained network can predict outcomes accurately and rapidly [11–13]. Research on deep learning applied to inverse design is therefore attractive and significant.

It is a general phenomenon that a single target response can be achieved by several different physical designs. This non-uniqueness can hinder the convergence of training: fundamentally, a mean-squared-error loss is evaluated against two or more equally valid targets, which confuses the direction of convergence [14,15]. To tackle this obstacle, several novel models have been reported. Liu et al. [16] proposed the tandem neural network (TNN), which combines an inverse network with a pre-trained forward network. The loss function is changed to the error between the input and output responses, so the non-uniqueness in the design parameters disappears. Although this approach solves the convergence problem caused by non-uniqueness, the different feasible solutions degenerate into a single one, which cannot be guaranteed to meet the requirements of practical production. Later, Unni et al. [17] proposed the mixture density network (MDN), whose outputs are the parameters of a mixture probability density function describing the possible sampling probabilities for each physical design parameter. The training loss is based on maximum likelihood estimation, which is immune to non-uniqueness: different feasible design parameters for the same input response appear as multiple peaks in the output probability density curves. Recently, Dai et al. [18] realized the inverse design of color filters using a conditional generative adversarial network (CGAN). The input is a synthetic vector containing the conditional target responses together with samples from a normal distribution. The model is composed of a tandem structure and a discriminator, trained so that the predicted design parameters follow a distribution similar to that of the true design parameters in the dataset. Because of the random input component, additional optimization methods are needed to select the best result among the many predictions produced by the generator. Moreover, the CGAN is harder to train and its hyperparameters are harder to tune.
In addition to the above network models, other methods exist, such as multi-branch networks [19]; these may resolve simple cases of non-uniqueness, but they struggle in complex, high-dimensional situations.
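The tandem idea described above can be made concrete with a short sketch. The snippet below is a minimal, illustrative PyTorch implementation, not the architecture of Ref. [16]: the network sizes (a 100-point spectrum, 5 design parameters, small MLPs) are assumptions chosen only to show how the loss is moved from parameter space to response space.

```python
import torch
import torch.nn as nn

# Hypothetical toy dimensions: 100-point transmission spectrum, 5 layer thicknesses.
SPEC_DIM, PARAM_DIM = 100, 5

def mlp(d_in, d_out, hidden=64):
    # Small fully connected network used for both sub-models in this sketch.
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, d_out))

forward_net = mlp(PARAM_DIM, SPEC_DIM)   # pre-trained: thicknesses -> spectrum
inverse_net = mlp(SPEC_DIM, PARAM_DIM)   # to be trained: spectrum -> thicknesses

# Freeze the forward model so only the inverse network receives gradients.
for p in forward_net.parameters():
    p.requires_grad_(False)

def tandem_loss(target_spectrum):
    # The loss is computed in response space: several valid designs that map
    # to the same spectrum no longer produce conflicting gradient directions.
    design = inverse_net(target_spectrum)
    predicted_spectrum = forward_net(design)
    return nn.functional.mse_loss(predicted_spectrum, target_spectrum)
```

In training, `tandem_loss` is minimized over batches of target spectra; because the forward model is frozen, gradients flow only into the inverse network.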

In this paper, we focus on three effective network models: the TNN, the MDN, and the CGAN. We apply them to the inverse design of multilayer structures, where the thickness of each layer is a design parameter and the transmission spectrum is the target response. We implemented different architectures of these models in PyTorch, trained and tested them in simulation, and compared and analyzed their performance. In particular, the MDN yields a Gaussian mixture distribution over the design parameters rather than fixed values, so we propose an improved technique that samples every Gaussian component with a non-zero weight, which is effective and well suited to our design task. As a result, potential values corresponding to different optima are rarely missed, and we obtained multiple distinct designs with low MSE in our tests.
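The sampling idea above can be sketched as follows. This is a minimal, illustrative MDN in PyTorch, not the paper's actual architecture: the number of components (K = 8), layer sizes, and the weight threshold `eps` are assumptions for demonstration. The key points are that the negative log-likelihood loss tolerates multi-valued targets, and that drawing one candidate from every non-negligible component preserves all potential optima.

```python
import torch
import torch.nn as nn

K, SPEC_DIM, PARAM_DIM = 8, 100, 5  # hypothetical: 8 mixture components

class MDN(nn.Module):
    """Maps a target spectrum to the parameters of a Gaussian mixture
    over layer thicknesses (illustrative sizes, not the paper's)."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(SPEC_DIM, 64), nn.ReLU())
        self.pi = nn.Linear(64, K)                  # mixture weights (logits)
        self.mu = nn.Linear(64, K * PARAM_DIM)      # component means
        self.log_sigma = nn.Linear(64, K * PARAM_DIM)  # component log-std

    def forward(self, x):
        h = self.trunk(x)
        pi = torch.softmax(self.pi(h), dim=-1)
        mu = self.mu(h).view(-1, K, PARAM_DIM)
        sigma = self.log_sigma(h).view(-1, K, PARAM_DIM).exp()
        return pi, mu, sigma

def mdn_nll(pi, mu, sigma, y):
    # Negative log-likelihood of the true design y under the mixture.
    # A response with several valid designs simply raises several
    # component weights, so non-uniqueness does not confuse the gradient.
    comp = torch.distributions.Normal(mu, sigma)
    log_prob = comp.log_prob(y.unsqueeze(1)).sum(-1)        # (B, K)
    return -torch.logsumexp(log_prob + pi.log(), dim=-1).mean()

def sample_every_component(pi, mu, sigma, eps=1e-3):
    # Improved sampling step (single prediction, batch size 1): draw one
    # candidate design from every component whose weight is non-negligible,
    # so minor modes tied to alternative optima are not discarded.
    pi, mu, sigma = pi.squeeze(0), mu.squeeze(0), sigma.squeeze(0)
    keep = pi > eps
    return mu[keep] + sigma[keep] * torch.randn_like(mu[keep])
```

Each candidate returned by `sample_every_component` can then be evaluated with a forward model, and those with low spectral MSE kept as alternative designs.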
