What Is the NumPy Library and How Do You Use It?

The purpose of NumPy is to work with arrays as well as linear algebra, Fourier transforms, and matrices.

Translated from What Is the NumPy Python Library and How Do You Use It?, by Jack Wallen.

NumPy, which stands for Numerical Python, is an open-source library that has become an invaluable tool in science and engineering. If you need to work with numeric data in Python, NumPy should be your go-to library.

The purpose of NumPy is to deal with arrays as well as linear algebra, Fourier transforms, and matrices. But why use NumPy when Python already has lists that can be used as arrays? Put simply: speed. Lists can be slow, especially when working with large amounts of data (which is very common in scientific use cases).

Hence NumPy.

NumPy can be up to 50 times faster than Python lists (the exact figure depends on the operation and the size of the data) because it stores arrays in contiguous blocks of memory, which means that processes are able to access (and manipulate) this information very quickly. On top of that, NumPy's core operations run in compiled code that is optimized for modern CPUs, so it benefits not only from memory layout but also from efficient use of the hardware.
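
The speed difference can be checked with a rough timing sketch like the one below. The array size and the squaring operation are arbitrary choices made for illustration, and the numbers will vary from machine to machine, so treat this as a quick experiment rather than a benchmark.

```python
import timeit

import numpy as np

n = 200_000  # arbitrary size chosen for this illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Square every element: a Python-level loop versus one vectorized NumPy call.
list_time = timeit.timeit(lambda: [x * x for x in py_list], number=5)
numpy_time = timeit.timeit(lambda: np_arr * np_arr, number=5)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {numpy_time:.4f} s")
```

On most systems the NumPy line should report a noticeably smaller time, though the exact ratio depends on the hardware and the operation being measured.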

Don't think that NumPy is only useful for scientific data; it can also serve as an efficient multidimensional container for general-purpose data. You can even define arbitrary data types, which helps it integrate with a variety of databases.
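
As a small sketch of what a custom data type might look like, the example below defines a structured dtype. The field names (name, age, weight) and the sample records are hypothetical and exist only for this illustration.

```python
import numpy as np

# Hypothetical record layout: a name, an age, and a weight per row.
person_dtype = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f8")])

people = np.array(
    [("Alice", 30, 55.5), ("Bob", 41, 82.0)],
    dtype=person_dtype,
)

# Access one field across all records.
print(people["name"])

# Filter whole records by the value of one field.
print(people[people["age"] > 35])
```

Each element of the array behaves like a row in a table, which is what makes this kind of array a reasonable fit for data pulled from a database.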

Now that you understand the concept of NumPy, let's see how it's used.

What you'll need

The only thing you need is an operating system with Python and Pip installed. If you don't have Pip installed, don't worry, I'll show you how. I'll be demonstrating on Ubuntu Linux, so if you're using a different operating system, you'll need to change the Pip install command. Once you install Pip, everything else should be fairly generic.

Install Pip

Installing Pip is actually quite simple. Log in to your machine and open a terminal window. In the terminal window, issue the following command:

sudo apt-get install python3-pip -y

For operating systems that use the DNF package manager, the command would be:

sudo dnf install python3-pip -y

Now that Pip is installed, it's time to add NumPy.

You can't use NumPy until it's installed. To install it, you need to use Pip, which looks like this:

pip install numpy

If you find that you can't install NumPy with Pip (which is the case on Ubuntu 24.04, where Pip refuses to install into the system-managed Python environment), there is another way. Fall back to the default package manager, like this:

sudo apt-get install python3-numpy -y

Note that installing NumPy on a Fedora-based Linux distribution works fine using the pip install numpy command.

Either way, one of the commands above should get NumPy installed.
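
To confirm that the installation worked, a quick check such as the following can be run with python3. It only assumes that NumPy is importable; the version printed will depend on which install method you used.

```python
import numpy as np

# If the import succeeds, NumPy is installed; the version shows which release.
print(np.__version__)
```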

Use NumPy

Let's see how NumPy is used. We first have to import the NumPy library so that our application can use it. This is done with:

import numpy as np

What we did above was import NumPy and give it an alias (np). Next, let's create an array and assign it to arr as follows:

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])            

As you can see, we're using NumPy's array function here.
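
As an aside, here is a short sketch of what the array function hands back. The attributes inspected (shape and dtype) and the multiplication are picked only to illustrate how the result differs from a plain Python list.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print(type(arr))   # the object type returned by np.array
print(arr.shape)   # dimensions of the array, as a tuple
print(arr.dtype)   # element type NumPy inferred from the list
print(arr * 2)     # arithmetic applies to every element at once
```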

Finally, we print our array with the following command:

print(arr)

Create a new file named nu_array.py in your text editor of choice.

Paste the entire block of code into that file so it looks like this:

import numpy as np
 
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
 
print(arr)           

If you run the above code (using the command python3 nu_array.py), the output will be:

[ 1  2  3  4  5  6  7  8  9 10]

It's very simple. Let's make it a little more complicated with the help of the copy function.

The copy function

Sometimes you may want to copy an array. When you do this, you will use the copy function. Let's say you have an array like the one we created above.

If you were to call copy on it, you would get an exact, independent copy of the array. This may seem overly simplistic, but copy is a very important function because plain assignment does not actually duplicate an array's data.

The copy function takes one required parameter and two optional parameters, which are:

  • a – This is the required parameter: the original array to be copied.
  • order – This is one of the optional parameters. It controls the memory layout of the copy ('C', 'F', 'A', or 'K', with 'K' as the default).
  • subok – This is another optional parameter. If True, subclasses of the array type are passed through to the copy; otherwise (the default) the copy is a plain base-class array.
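
The two optional parameters can be seen in action with a sketch like the one below. The Tagged subclass is a hypothetical, do-nothing subclass invented only to show what subok changes.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical ndarray subclass used only for this illustration."""


original = np.arange(1, 7).reshape(2, 3)

# order="F" asks for a column-major (Fortran-style) memory layout in the copy.
f_copy = np.copy(original, order="F")
print(f_copy.flags.f_contiguous)

# The values are identical regardless of the memory layout.
print(np.array_equal(original, f_copy))

# subok=False (the default) returns a plain ndarray; subok=True keeps the subclass.
tagged = original.view(Tagged)
plain_copy = np.copy(tagged)
sub_copy = np.copy(tagged, subok=True)
print(type(plain_copy).__name__)
print(type(sub_copy).__name__)
```

For everyday use the defaults are usually fine; order and subok matter mostly when interfacing with other libraries or working with array subclasses.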

Let's use copy. I'm going to throw you a small curveball here.

First, we'll import NumPy with the following command:

import numpy as np

Next, we create a NumPy array using arange with the start and stop parameters (start is the first value in the array; stop is where the sequence ends and is not itself included) and arrange the array into 2 rows and 3 columns (using reshape). Our array looks like this:

my_array = np.arange(start = 1, stop = 7).reshape(2,3)           

It is important to note that the number of elements in the array must match the shape passed to reshape. For example, if you use start=1, stop=10, and reshape(2,3), you will get the following error:

ValueError: cannot reshape array of size 9 into shape (2,3)

Why? Because 2 rows and 3 columns hold 6 elements, while start=1, stop=10 produces 9. If you change the shape to (3,3), you can use start=1, stop=10. It's simple arithmetic.
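
The size rule can be tried out with a sketch along these lines. The variable names are arbitrary, and the use of -1 at the end is an extra convenience not covered in the text above: it asks NumPy to work out that dimension on its own.

```python
import numpy as np

nine = np.arange(start=1, stop=10)   # values 1 through 9; stop is excluded
print(nine.size)

try:
    nine.reshape(2, 3)               # 2 * 3 = 6 slots for 9 elements
except ValueError as err:
    print(f"ValueError: {err}")

square = nine.reshape(3, 3)          # 3 * 3 = 9, so this works
print(square.shape)

# Passing -1 lets NumPy infer that dimension from the array's size.
inferred = nine.reshape(3, -1)
print(inferred.shape)
```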

Let's print the array with the following code (so we know what it looks like at the moment):

print(my_array)

So far, our entire application looks like this:

import numpy as np
my_array = np.arange(start = 1, stop = 7).reshape(2,3)
print(my_array)           

The output above would be:

[[1 2 3]
 [4 5 6]]
           

Now, let's create a copy of the array with the following code:

copy_array = np.copy(my_array)           

Our entire code looks like this:

import numpy as np
my_array = np.arange(start = 1, stop = 7).reshape(2,3)
copy_array = np.copy(my_array)
print(my_array)
print(copy_array)           

The output will be:

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
           

The reason we use copy is that plain assignment, such as bad_copy = my_array, does not create a new array; it only gives the same array a second name. If we change a value in the original array after that assignment, the value seen through the second name changes as well.

Consider the following scenario:

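import numpy as np

my_array = np.arange(start=1, stop=7).reshape(2, 3)
bad_copy = my_array              # plain assignment: a second name, not a copy
copy_array = np.copy(my_array)   # a true, independent copy

print(my_array)
print(copy_array)

my_array[-1, -1] = 100           # change the last element of the original
print(my_array)
print(bad_copy)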

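If you run the code above, the first two print statements show the unchanged array, and the last two (my_array and bad_copy) both print as:

[[  1   2   3]
 [  4   5 100]]
[[  1   2   3]
 [  4   5 100]]

The bad copy changed along with the original, because it was never a separate array. The array made with np.copy (copy_array) still holds its original values. That's why we use copy.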
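
To check which names actually share data, a sketch like the following can help. It reuses the variable names from the example above and relies on NumPy's shares_memory function; the specific checks are only one way to make the difference visible.

```python
import numpy as np

my_array = np.arange(start=1, stop=7).reshape(2, 3)
bad_copy = my_array              # just a second name for the same array
copy_array = np.copy(my_array)   # an independent copy with its own data

my_array[-1, -1] = 100

print(bad_copy is my_array)                    # is it the very same object?
print(np.shares_memory(my_array, copy_array))  # do the two share data?
print(copy_array[-1, -1])                      # value held by the real copy
```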

That's how to get started with NumPy. Next time we'll dive deeper, because there are plenty more tricks to NumPy.