This post summarizes the work I have done for CuPy so far as part of the GSoC program, and gives a brief account of my whole GSoC journey with CuPy.

CuPy is an open-source matrix library accelerated with NVIDIA CUDA. It provides a NumPy-compatible GPU array library for fast computation in Python, along with a wide range of scientific computing routines similar to those in SciPy. CuPy also supports writing custom CUDA kernels that execute directly on the GPU. It is one of the best libraries for doing computation on the GPU without knowledge of CUDA: its functions run directly on the GPU, which can reduce computation time to a large extent.

About My Project

During my GSoC, I planned to improve CuPy's coverage of SciPy and NumPy functions. I specifically chose to improve the sparse linear algebra coverage and to work on distributions for the Generator. I chose sparse linear algebra because the algorithms used to solve these problems are complex and can need a very large number of iterations to reach a feasible solution, so implementing them on the GPU can significantly decrease the computation time. I chose random distributions because I was fascinated by statistical distributions and had studied them before applying to GSoC.

Impact of my Project

I have added support for a few solvers (I had planned to do more :/), including LSMR, MINRES, and CGS. These solvers are computationally expensive because their solution steps involve matrix-vector and vector-vector multiplications. Implementing the algorithms with CuPy has significantly sped up their performance. You can follow my previous blog posts to see the performance comparisons for LSMR, CGS, and MINRES.
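To make that cost structure concrete, here is a minimal sketch of a CGS iteration in plain Python. It is not the code from the CuPy PR: the helper names (matvec, dot, axpy, cgs_sketch) are made up for illustration, the matrix is a dense list of lists, and there is no preconditioner. The point is only that each iteration is built from two matrix-vector products plus a handful of dot products and vector updates, which are exactly the operations a GPU array library can run in parallel.

```python
def matvec(A, x):
    """Dense matrix-vector product: the dominant cost of each iteration."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Vector-vector (inner) product."""
    return sum(a * b for a, b in zip(u, v))


def axpy(alpha, x, y):
    """Return alpha * x + y as a new list."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def cgs_sketch(A, b, tol=1e-10, maxiter=100):
    """Unpreconditioned Conjugate Gradient Squared, starting from x = 0.

    Returns the approximate solution and the number of iterations run.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x for the zero initial guess
    r_tilde = list(r)    # fixed "shadow" residual
    p = [0.0] * n
    q = [0.0] * n
    rho_prev = 1.0
    b_norm = dot(b, b) ** 0.5 or 1.0
    iters = 0
    while iters < maxiter and dot(r, r) ** 0.5 > tol * b_norm:
        rho = dot(r_tilde, r)
        if rho == 0.0:
            break                              # breakdown, give up
        beta = rho / rho_prev
        u = axpy(beta, q, r)                   # u = r + beta * q
        p = axpy(beta, axpy(beta, p, q), u)    # p = u + beta * (q + beta * p)
        v = matvec(A, p)                       # first matrix-vector product
        sigma = dot(r_tilde, v)
        if sigma == 0.0:
            break                              # breakdown, give up
        alpha = rho / sigma
        q = axpy(-alpha, v, u)                 # q = u - alpha * v
        uq = axpy(1.0, u, q)                   # u + q
        x = axpy(alpha, uq, x)                 # update the solution
        r = axpy(-alpha, matvec(A, uq), r)     # second matrix-vector product
        rho_prev = rho
        iters += 1
    return x, iters


# Small symmetric example system, chosen only for illustration.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iters = cgs_sketch(A, b)
print(x, iters)
```

With CuPy arrays, each helper above would become a single array operation executed on the GPU, so the Python-level loops inside matvec, dot, and axpy disappear.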

Similarly, for random distributions, CuPy now supports several more distributions for the Generator, which can be used to generate random numbers following those distributions.
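As a rough usage sketch, drawing from a few of these distributions might look like the following. It assumes the CuPy Generator mirrors the NumPy Generator method names and arguments for geometric, chisquare, and power; the seed and parameter values are arbitrary. If CuPy or a GPU is not available, the snippet falls back to NumPy so that it still runs.

```python
try:
    import cupy as xp
    xp.zeros(1)  # touch the device so a missing GPU is detected here
except Exception:
    import numpy as xp  # CPU fallback, assumed to expose the same Generator API

# Create a Generator; the seed is arbitrary and only makes runs repeatable.
rng = xp.random.default_rng(42)

# Draw 1000 samples from three of the distributions mentioned in this post.
geometric_samples = rng.geometric(0.3, size=1000)   # trials until first success
chisquare_samples = rng.chisquare(4.0, size=1000)   # 4 degrees of freedom
power_samples = rng.power(2.0, size=1000)           # values in [0, 1]

print(float(geometric_samples.mean()))
print(float(chisquare_samples.mean()))
print(float(power_samples.mean()))
```

The sample means should land near the theoretical means of each distribution, although the exact numbers depend on the backend and the seed.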

(To see the performance comparison for each algorithm, you can follow the links to the PRs given below.)

My work for CuPy

Before the beginning of GSoC:

  • Fixed choice function for 0 samples from 0 candidates #4511 [Merged]
  • Add exponential distribution to random API #4915 [Merged]
  • Add cupyx.scipy.sparse.kronsum() #4968 [Merged]

During GSoC:

  • Add reshape method for COO, CSR and CSC matrices #5301 [Merged]
  • Add cupyx.scipy.sparse.linalg.lsmr #5331 [Merged]
  • Add cupyx.scipy.sparse.linalg.cgs #5524 [Merged]
  • Add cupyx.scipy.sparse.linalg.minres #5585 [Merged]
  • Add support for array input in beta distribution of Generator #5573 [Merged]
  • Add geometric distribution to new Generator #5443 [Merged]
  • Add hypergeometric distribution to new Generator #5560 [Merged]
  • Add Log Series distribution to Generator #5618 [Merged]
  • Add Power distribution to Generator #5624 [Merged]
  • Add Chi-square distribution to Generator #5645 [Merged]
  • Add binomial distribution to new Generator #5429 [Open]
  • Add Dirichlet distribution to Generator #5648 [Merged]
  • Add F distribution to Generator #5655 [Merged]

Future commitments

First, I will focus on the work that remained incomplete during the project and try to complete and merge it as soon as possible. I will help new contributors, keep contributing to the library, and help improve its performance. I will also research more robust techniques and features that can be added to CuPy to make it more powerful.

What I gained from CuPy

In these three months, I learned many things, including:

  • How real-world software development works
  • How to effectively filter the content necessary for completing a project
  • How to deal with difficult situations, i.e. going from having no solution to finding a solution to a problem
  • Many of the tools developers use to ease their development process, debug, and analyse an implementation’s performance
  • My understanding of random distributions improved, and I also became familiar with random number generators.
  • Over the whole GSoC process, from preparation until now, I gained a better understanding of NumPy and SciPy, which are among the most commonly used tools in the scientific community.
  • Through CuPy, I came to know about CUDA, which is used to program the GPU. So far I have only a basic understanding of it, but I hope to learn it better in the future.

Acknowledgements

This GSoC would not have been possible without the support of my mentors, Masayuki Takagi and Emcastillo. They helped me a lot, from fixing my mistakes and helping me solve problems with the sparse linear algebra routines to supporting me when I was stuck. I am also very thankful to the other members of the CuPy community; without them, this project would not have come to this stage. This was my start with open source, and finding such a good community will be very helpful in learning new things and being part of open source.

This project would also not have been possible without the support of my family and friends. They helped me remain focused on the project and overcome my side issues.

Finally, I also want to thank all the developers and researchers out there whose research gave me the idea of the algorithms, and whose tools helped me complete this project.