Section 1.2 Python Packages and NumPy
Python packages and libraries are collections of functions and modules centered around a common theme. In this section we will learn how to import a Python package and use functions from that package.
Subsection Importing the package NumPy
NumPy is an open source scientific computing package that allows you to use standard mathematical functions and constants like sine or pi. NumPy also allows you to work with arrays of numbers so you can efficiently perform computations.
Note that to call a function from an imported package you must use the syntax package.function. To avoid retyping the long name of a package every time, it is standard practice to rename the package as you import it.
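As a minimal sketch of the package.function syntax, assuming NumPy is installed, the code below imports the package under its full name and calls its sine function. The input value 0 is chosen only for illustration.

```python
import numpy

# Call the sine function from the numpy package using package.function syntax.
y = numpy.sin(0)
print(y)  # prints 0.0
```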
You Try 1.4.
Rename the package numpy in the above code by replacing the import line with import numpy as np. Next use the new name of the package to call the function np.sin() and re-run.
Subsection Standard functions in NumPy
In addition to the standard trigonometric functions, NumPy also contains \(e^x\) as exp() and natural log as log(), as well as the standard mathematical constants like pi and e.
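The short sketch below shows how these functions and constants might be called once NumPy is imported as np. The particular inputs are arbitrary illustrations.

```python
import numpy as np

# exp() computes e**x, and log() is the natural logarithm (base e).
a = np.exp(1)        # e**1, approximately 2.71828
b = np.log(np.e)     # natural log of e, equal to 1 up to rounding
c = np.cos(np.pi)    # cosine of pi, equal to -1 up to rounding
print(a, b, c)
```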
You Try 1.5.
Use the NumPy package to compute the area of a circle using pi.
Hint.
If you import numpy as np, you will use np.pi
Subsection NumPy arrays
NumPy arrays can help us efficiently do computations with a collection of numbers all at once. We can also represent vectors or matrices using NumPy arrays. Run the code below that calls the NumPy function array to see how it works.
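A minimal sketch of the array function, assuming NumPy is imported as np. The names v and M and their entries are made up for illustration.

```python
import numpy as np

# Build a one-dimensional array (a vector) from a Python list.
v = np.array([1, 2, 3, 4])

# Arithmetic is applied to every entry at once.
print(v * 2)      # prints [2 4 6 8]
print(v ** 2)     # prints [ 1  4  9 16]

# A list of lists gives a two-dimensional array (a matrix).
M = np.array([[1, 2], [3, 4]])
print(M.shape)    # prints (2, 2)
```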
NumPy also has built-in functions that automatically create an array satisfying certain conditions. For example, linspace() takes a given interval and sets up an array of evenly spaced numbers on that interval.
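One possible call, assuming the interval from 0 to 5 and 11 points (the name x is arbitrary), looks like this:

```python
import numpy as np

# 11 evenly spaced values from 0 to 5, with both endpoints included.
x = np.linspace(0, 5, 11)
print(x)
print(len(x))   # prints 11
```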
Note that the array includes the right endpoint 5, so it takes 11 evenly spaced numbers to get values 0.5 apart. Note also that although the values are evenly spaced, the printed entries are not: by default NumPy pads each entry with blank spaces, rather than extra zeros, so that all entries are printed with the same width.
You Try 1.6.
Edit the above code to use linspace to create an array of numbers between 0 and 5 that are one-quarter apart. When you evaluate it, your code should print the list 0, 0.25, 0.5, 0.75, etc.
Another way to create an array of values for a given interval is arange(), which uses a step value to set up a sequence of numbers within that interval.
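A minimal sketch of arange(), assuming a start of 0, a stop of 5, and a step of 0.5 (the name t is arbitrary):

```python
import numpy as np

# Values from 0 up to (but not including) 5, in steps of 0.5.
t = np.arange(0, 5, 0.5)
print(t)
print(len(t))   # prints 10
```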
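Note that the right endpoint of the interval in arange is excluded (the values are strictly less than the stop value). With non-integer step sizes, floating-point round-off can occasionally affect the number of entries, so linspace is often the safer choice in that case.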
You Try 1.7.
Edit the above code to use arange to create an array of numbers between 0 and 5 that are one-quarter apart. When you evaluate it, your code should print the list 0, 0.25, 0.5, 0.75, etc.
Subsection Selecting and using part of an array
We can access an entry of a one-dimensional array using its index with array_name[index number], where the index numbers the positions of the elements in the array starting with the number 0.
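The sketch below illustrates zero-based indexing. The array X here is a hypothetical stand-in with 100 entries, chosen only so that the indices are easy to check.

```python
import numpy as np

# Hypothetical array for illustration: the 100 values 0, 5, 10, ..., 495.
X = np.arange(0, 500, 5)

print(X[0])   # first entry (index 0), prints 0
print(X[1])   # second entry (index 1), prints 5
```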
You Try 1.8.
Edit the code above to access and print the 4th entry, the 100th entry, and the last entry in the array X.
Note: The last element of the array can also be accessed using index -1.
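For higher-dimensional arrays, you need more than one index to access a single entry, as in array_name[index 1, index 2, index 3]. Giving fewer indices than the array has dimensions returns the sub-array at that position; for example, a single index on a two-dimensional array returns a whole row.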
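As a small illustration with a two-dimensional array (the name A and its entries are arbitrary):

```python
import numpy as np

# A 2-by-3 array (2 rows, 3 columns); the values are arbitrary.
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A[1, 2])   # row index 1, column index 2: prints 6
print(A[0])      # a single index returns the whole first row: prints [1 2 3]
```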
Slicing selects a range of entries in an array by placing a colon between the desired start and stop index values. Note the entry at the stop index is not included.
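A minimal sketch of slicing, using a hypothetical array Y of the integers 0 through 9:

```python
import numpy as np

Y = np.arange(10)      # the integers 0 through 9

# Start index 2, stop index 5: the entry at index 5 is not included.
part = Y[2:5]
print(part)            # prints [2 3 4]
```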
You Try 1.9.
Edit the above code to create a new array out of the middle 50 values of the given array.
Subsection Importing .csv files using the Pandas package
Sometimes we want to import an array from a file or spreadsheet. The Python package Pandas was created to manipulate spreadsheet data. While the main object in NumPy is a NumPy array, the main object in the Pandas package is a Pandas dataframe, essentially a table with indexed rows and named columns. More information about dataframe (table) manipulation in Pandas will be added to the appendix in future versions of this book.
For now we will just use the Pandas package to import a .csv file as a Pandas dataframe, then convert it into a NumPy array so we can use the computational power of NumPy. Since we don’t have a simple way to upload files into the online textbook, the following code must be used in Google Colab, with the file uploaded temporarily to Colab, or on your local machine with a local version of Python and the appropriate filepath.
Warning: by default, read_csv treats the first row of the file as headers (column names). If your file has no header row, use pd.read_csv('filepath.csv', header = None) instead.
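The import-and-convert steps described above can be sketched as follows. So that the example runs anywhere, a short text string wrapped in io.StringIO stands in for a real file; with an actual file you would pass its path, such as 'filepath.csv', to read_csv instead. The column names x and y and the data values are made up.

```python
import io
import pandas as pd

# Stand-in for a real file: this text plays the role of 'filepath.csv'.
csv_text = "x,y\n1,2\n3,4\n"

df = pd.read_csv(io.StringIO(csv_text))   # first row is read as column names
data = df.to_numpy()                      # convert the dataframe to a NumPy array

print(df.columns.tolist())   # prints ['x', 'y']
print(data.shape)            # prints (2, 2)
```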
Summary.
You can import a Python package and rename it by using import package as name.
You can call a function from a package using packagename.function.
NumPy arrays can help us efficiently do computations.
The NumPy functions linspace() and arange() set up an array of values over an interval.
Indexing and slicing a NumPy array can be used to access a single entry or a range of entries.
You can use the Pandas function read_csv to import a .csv file as a Pandas dataframe, then convert it into a NumPy array using the dataframe method to_numpy().
Exercises
1.
Research NumPy
Look up how to efficiently create a 15 element array of all \(1\)’s.
Use this to create a 15 element array of all \(100\)’s.
Solution.
There is more than one way to do this. One approach is to use the NumPy function ones().
Once you have an array of all 1’s, you can multiply it by any number you want.
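One way this approach might look, assuming NumPy is imported as np (the variable names are arbitrary):

```python
import numpy as np

ones = np.ones(15)        # array of fifteen 1.0 values
hundreds = 100 * ones     # multiplying scales every entry to 100.0
print(ones)
print(hundreds)
```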
2.
The magnitude of an earthquake is measured using the Richter Scale, which is a logarithmic scale. The amount of energy \(E\) in ergs an earthquake releases can be determined from the magnitude \(M\) using the formula
\begin{equation*}
\log_{10}(E)=11.8+1.5M
\end{equation*}
The USGS website lets you download magnitude data for the 30 most recent earthquakes in the world with magnitude \(>2.5\text{.}\)
Use NumPy to determine how much energy has been released in the 30 most recent earthquakes. Note that you can copy and paste data from a .csv file into SageMathCell; you will then just need to format that data as a NumPy array.