# Basics of CuPy¶

In this section, you will learn about the following things:

- Basics of
`cupy.ndarray`

- The concept of
*current device* - host-device and device-device array transfer

## Basics of cupy.ndarray¶

CuPy is a GPU array backend that implements a subset of NumPy interface. In the following code, cp is an abbreviation of cupy, as np is numpy as is customarily done:

```
>>> import numpy as np
>>> import cupy as cp
```

The `cupy.ndarray`

class is in its core, which is a compatible GPU alternative of `numpy.ndarray`

.

```
>>> x_gpu = cp.array([1, 2, 3])
```

`x_gpu`

in the above example is an instance of `cupy.ndarray`

.
You can see its creation of identical to `NumPy`

‘s one, except that `numpy`

is replaced with `cupy`

.
The main difference of `cupy.ndarray`

from `numpy.ndarray`

is that the content is allocated on the device memory.
Its data is allocated on the *current device*, which will be explained later.

Most of the array manipulations are also done in the way similar to NumPy. Take the Euclidean norm (a.k.a L2 norm) for example. NumPy has numpy.lina.g.norm to calculate it on CPU.

```
>>> x_cpu = np.array([1, 2, 3])
>>> l2_cpu = np.linalg.norm(x_cpu)
```

We can calculate it on GPU with CuPy in a similar way:

```
>>> x_gpu = cp.array([1, 2, 3])
>>> l2_gpu = cp.linalg.norm(x_gpu)
```

CuPy implements many functions on `cupy.ndarray`

objects.
See the reference for the supported subset of NumPy API.
Understanding NumPy might help utilizing most features of CuPy.
So, we recommend you to read the NumPy documentation.

## Current Device¶

CuPy has a concept of the *current device*, which is the default device on which
the allocation, manipulation, calculation etc. of arrays are taken place.
Suppose the ID of current device is 0.
The following code allocates array contents on GPU 0.

```
>>> x_on_gpu0 = cp.array([1, 2, 3, 4, 5])
```

The current device can be changed by `cupy.cuda.Device.use()`

as follows:

```
>>> x_on_gpu0 = cp.array([1, 2, 3, 4, 5])
>>> cp.cuda.Device(1).use()
>>> x_on_gpu1 = cp.array([1, 2, 3, 4, 5])
```

If you switch the current GPU temporarily, *with* statement comes in handy.

```
>>> with cp.cuda.Device(1):
... x_on_gpu1 = cp.array([1, 2, 3, 4, 5])
>>> x_on_gpu0 = cp.array([1, 2, 3, 4, 5])
```

Most operations of CuPy is done on the current device. Be careful that if processing of an array on a non-current device will cause an error:

```
>>> with cp.cuda.Device(0):
... x_on_gpu0 = cp.array([1, 2, 3, 4, 5])
>>> with cp.cuda.Device(1):
... x_on_gpu0 * 2 # raises error
Traceback (most recent call last):
...
ValueError: Array device must be same as the current device: array device = 0 while current = 1
```

`cupy.ndarray.device`

attribute indicates the device on which the array is allocated.

```
>>> with cp.cuda.Device(1):
... x = cp.array([1, 2, 3, 4, 5])
>>> x.device
<CUDA Device 1>
```

Note

If the environment has only one device, such explicit device switching is not needed.

## Data Transfer¶

# Move arrays to a device¶

`cupy.asarray()`

can be used to move a `numpy.ndarray`

, a list, or any object
that can be passed to `numpy.array()`

to the current device:

```
>>> x_cpu = np.array([1, 2, 3])
>>> x_gpu = cp.asarray(x_cpu) # move the data to the current device.
```

`cupy.asarray()`

can accept `cupy.ndarray`

, which means we can
transfer the array between devices with this function.

```
>>> with cp.cuda.Device(0):
... x_gpu_0 = cp.ndarray([1, 2, 3]) # create an array in GPU 0
>>> with cp.cuda.Device(1):
... x_gpu_1 = cp.asarray(x_gpu_0) # move the array to GPU 1
```

Note

`cupy.asarray()`

does not copy the input array if possible.
So, if you put an array of the current device, it returns the input object itself.

If we do copy the array in this situation, you can use `cupy.array()`

with copy=True.
Actually `cupy.asarray()`

is equivalent to cupy.array(arr, dtype, copy=False).

# Move array from a device to the host¶

Moving a device array to the host can be done by `cupy.asnumpy()`

as follows:

```
>>> x_gpu = cp.array([1, 2, 3]) # create an array in the current device
>>> x_cpu = cp.asnumpy(x_gpu) # move the array to the host.
```

We can also use `cupy.ndarray.get()`

:

```
>>> x_cpu = x_gpu.get()
```

Note

If you work with Chainer, you can also use `to_cpu()`

and
`to_gpu()`

to move arrays back and forth between
a device and a host, or between different devices.
Note that `to_gpu()`

has `device`

option to specify
the device which arrays are transferred.

## How to write CPU/GPU agnostic code¶

The compatibility of CuPy with NumPy enables us to write CPU/GPU generic code.
It can be made easy by the `cupy.get_array_module()`

function.
This function returns the `numpy`

or `cupy`

module based on arguments.
A CPU/GPU generic function is defined using it like follows:

```
>>> # Stable implementation of log(1 + exp(x))
>>> def softplus(x):
... xp = cp.get_array_module(x)
... return xp.maximum(0, x) + xp.log1p(xp.exp(-abs(x)))
```