Continuing from the previous week, I worked on implementing MINRES and on adding a few distributions to the Generator.
MINRES solver
MINRES is a non-stationary iterative method for solving a linear system Ax = b. It is a Krylov subspace method, in the same family as conjugate gradient, that can be applied to symmetric systems that are not necessarily positive definite. The method avoids LU factorization and minimizes the residual in the 2-norm.
Outline of a single MINRES iteration (step k) for (A, b):
- Perform a Lanczos step for (A, b).
- Update the QR factorization of Tk, where Tk is the tridiagonal matrix produced by the Lanczos process.
- Solve the least-squares problem for y(k).
- Update the solution: x(k) = Vk y(k), where the columns of Vk are the Lanczos vectors.
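The four steps above can be sketched in plain NumPy. This is only a minimal illustration, not the CuPy implementation: the function name `minres_sketch` and its parameters are hypothetical, the starting guess is assumed to be zero, and the QR update plus least-squares solve are merged into one dense `np.linalg.lstsq` call per iteration, whereas a real implementation would update the QR factorization incrementally and would not store every Lanczos vector.

```python
import numpy as np


def minres_sketch(A, b, maxiter=None, tol=1e-10):
    """Illustrative MINRES for symmetric A with x0 = 0 (hypothetical helper)."""
    n = b.shape[0]
    maxiter = n if maxiter is None else maxiter
    beta1 = np.linalg.norm(b)
    x = np.zeros(n)
    if beta1 == 0.0:
        return x  # b = 0, so x = 0 already solves the system

    V = [b / beta1]                        # Lanczos vectors v_1, v_2, ...
    T = np.zeros((maxiter + 1, maxiter))   # holds the (k+1) x k matrices T_k
    v_prev, beta = np.zeros(n), 0.0

    for k in range(maxiter):
        # Step 1: Lanczos step, producing the next diagonal and off-diagonal.
        w = A @ V[k] - beta * v_prev
        alpha = V[k] @ w
        w = w - alpha * V[k]
        beta_next = np.linalg.norm(w)
        T[k, k], T[k + 1, k] = alpha, beta_next
        if k + 1 < maxiter:
            T[k, k + 1] = beta_next        # T_k is symmetric tridiagonal

        # Steps 2-3: solve min || beta1 * e1 - T_k y ||_2.
        # (Dense solve here; stands in for the incremental QR update.)
        Tk = T[:k + 2, :k + 1]
        rhs = np.zeros(k + 2)
        rhs[0] = beta1
        y = np.linalg.lstsq(Tk, rhs, rcond=None)[0]

        # Step 4: x_k = V_k y_k.
        x = np.column_stack(V[:k + 1]) @ y

        residual = np.linalg.norm(rhs - Tk @ y)
        if residual <= tol * beta1 or beta_next == 0.0:
            break
        v_prev, beta = V[k], beta_next
        V.append(w / beta_next)

    return x


# Small symmetric indefinite example (diagonal has mixed signs).
A = np.array([[4.0, 1.0, 0.0, 0.0],
              [1.0, -3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, -1.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
x = minres_sketch(A, b)
print(np.linalg.norm(A @ x - b))  # residual norm of the computed solution
```

Because the example matrix is 4 x 4, the Krylov subspace spans the whole space after at most four Lanczos steps, so the sketch should reach the solution up to rounding error.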
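My initial implementation of minres gave the following timings (lower is better). CuPy is slower than SciPy for small systems, but it becomes faster from size 10000 upward: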
size | scipy | cupy |
---|---|---|
100 | 30.786543846130375 | 947.5719909667969 |
500 | 41.63404846191406 | 1094.1433105468752 |
1000 | 48.57564735412598 | 1165.3687133789062 |
5000 | 309.1716766357422 | 1239.9229736328125 |
10000 | 1342.04052734375 | 1101.5445556640627 |
15000 | 3401.540649414063 | 1478.1609497070312 |
However, this implementation made unnecessary copies of vectors while solving the system. Following a suggestion from Masayuki Takagi (my mentor), I removed all the unnecessary copies, and after synchronization I got a further improvement over my previous implementation.
size | previous implementation | current implementation |
---|---|---|
100 | 947.5719909667969 | 270.47564697265625 |
500 | 1094.1433105468752 | 324.1162109375 |
1000 | 1165.3687133789062 | 336.3395233154297 |
5000 | 1239.9229736328125 | 311.1663513183594 |
10000 | 1101.5445556640627 | 374.30403137207037 |
15000 | 1478.1609497070312 | 521.3742980957032 |
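As a rough illustration of what removing a copy can look like (this is not the actual change made in the CuPy code), here is a vector update written in two ways with NumPy. The function names and the `work` buffer are hypothetical. The first version allocates new arrays on every call, while the second reuses a preallocated buffer through the `out=` argument of the ufuncs, which CuPy ufuncs also accept.

```python
import numpy as np


def update_copying(x, alpha, p):
    # Allocates a temporary for alpha * p and a second new array for the sum.
    return x + alpha * p


def update_inplace(x, alpha, p, work):
    # Reuses the preallocated `work` buffer and writes the result into x.
    np.multiply(p, alpha, out=work)
    np.add(x, work, out=x)
    return x


x = np.ones(5)
p = np.arange(5, dtype=np.float64)
work = np.empty_like(x)

expected = update_copying(x, 0.5, p)   # new array, x is left unchanged
result = update_inplace(x, 0.5, p, work)
print(result is x, np.allclose(result, expected))
```

Inside an iterative solver such updates run once per iteration, so avoiding the extra allocations can add up.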
Distribution in new random Generator
During these two weeks my improvement to the beta distribution was merged. I also opened PRs for the Log-series, Power and Chi-squared distributions, and they were merged into the repo as well :rocket:. All of these distributions are much faster than their NumPy counterparts. I will add the performance comparison later in the post.
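For reference, this is how the same four distributions are drawn from NumPy's `Generator`. It is only a usage sketch with arbitrarily chosen parameters, and it assumes the CuPy versions mirror the NumPy method names and arguments; no timings are implied.

```python
import numpy as np

rng = np.random.default_rng(0)
size = 1000

# Parameter values below are arbitrary, chosen only for illustration.
samples = {
    "beta": rng.beta(2.0, 5.0, size=size),        # values in [0, 1]
    "logseries": rng.logseries(0.9, size=size),   # positive integers
    "power": rng.power(3.0, size=size),           # values in [0, 1]
    "chisquare": rng.chisquare(4.0, size=size),   # non-negative values
}

for name, values in samples.items():
    print(name, values.dtype, values.min(), values.max())
```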