iii. Boilerplate code review


Below are a few examples of basic Python code for saving and importing data.

Variable assignment

In Python, data are saved in variables.

Variable names should be simple and descriptive.

Assign a variable by typing its name to the left of the equals sign. Whatever is written to the right of the equals sign will be saved in the variable.

You could read the assignments below as “x is defined as one”, “two is assigned to y”, or “z is three”.

# define one variable
x = 1
# assign multiple variables
x = 1
y = 2
z = 3

Use print() to show a variable's value on the screen.

print(x)
1
# call the variables directly!
x / y * z
1.5

Functions, arguments, and methods

Functions, arguments, and methods are the basic building blocks you will use when programming in Python.

  • Functions: Perform actions on a thing

  • Arguments: The “things” a function acts on (text, values, expressions, datasets, etc.)

Note that “parameters” are the placeholder variables named in a function definition. Arguments are the values we pass into those placeholders when calling the function.

  • Methods: Type-specific functions (i.e., can only use a specific type of data and not other types). Use “dot” notation to utilize methods on a variable or other object.

For example, you could type gap = pd.read_csv('data/gapminder-FiveYearData.csv') to use the read_csv() function from the pandas library (imported under the alias pd) to load the Gapminder data. Dot notation is also how you reach a function that lives inside a library.
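To make the distinction between functions, arguments, and methods concrete, here is a minimal sketch. The function name describe and its parameter thing are hypothetical, made up for illustration; len(), .upper(), and .append() are built into Python.

```python
# define a function; "thing" is the parameter (a placeholder)
def describe(thing):
    return "This is a " + thing

# call the function; "shark" is the argument passed into the parameter
sentence = describe("shark")
print(sentence)

# len() is a function that works on many types of data
print(len(sentence))

# .upper() is a method that belongs to strings; call it with dot notation
print(sentence.upper())

# .append() is a method that belongs to lists
sea_life = ["shark"]
sea_life.append("dolphin")
print(sea_life)
```

Trying sentence.append("dolphin") would fail, because .append() is a list method and sentence is a string.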

Data types

Everything in Python has a type, which determines how we can manipulate that piece of data. Be careful: it is easy to lose track of types when a task uses many different variables!

# float (decimals)
# use a decimal to create floats
pi = 3.14
print(type(pi))
<class 'float'>
# integer (whole numbers)
# do not use a decimal for integers
amount = 4
print(type(amount))
<class 'int'>
# string (text)
# wrap text data in quotations
welcome = "Welcome to Stanford Libraries"
print(type(welcome))
<class 'str'>
# boolean (logical)
# True or False (stored as 1 and 0)
print(type(True))
print(False - True)
<class 'bool'>
-1

Addition examples with strings versus numbers

# character strings
'1' + '1'
'11'
# integers
1 + 1
2
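When a number arrives as text (for example, after reading it from a file), it usually needs to be converted before doing arithmetic. The sketch below uses the built-in int(), float(), and str() functions; the variable names are made up for illustration.

```python
# "1" is a string, so convert it to an integer before doing arithmetic
text_number = "1"
total = int(text_number) + 1
print(total)

# convert a number back to a string to join it with other text
label = "Total: " + str(total)
print(label)

# float() converts to a decimal number
decimal = float(text_number)
print(decimal)
```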

Data structures

Data can be stored in a variety of ways.

Indexing

Python is a zero-indexed programming language, which means that you start counting from zero. Thus, the first element in a collection is referenced by 0 instead of 1.

List

Lists are ordered groups of data that are both created and indexed (positionally referenced) with square brackets [].

animals = ['shark', 'dolphin']
animals[0]
'shark'
animals = ['shark', 'dolphin', ['dog', 'cat'], ['tree', 'cactus']]
print(animals[3][0])
print(animals[2][1])
tree
cat
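Because counting starts at zero, the last element of a list sits at position "length minus one". The following sketch illustrates this with a hypothetical list named sea_animals; negative indexes and slices are standard Python features.

```python
sea_animals = ['shark', 'dolphin', 'whale']

# len() counts the elements in the list
n = len(sea_animals)
print(n)

# the first element is at position 0
first = sea_animals[0]

# the last element is at position n - 1, not n
last = sea_animals[n - 1]

# negative indexes count backwards from the end
also_last = sea_animals[-1]

# slices start at the first index and stop before the second
first_two = sea_animals[0:2]

print(first, last, also_last, first_two)
```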

Dictionary

Dictionaries are collections of “key:value” pairs; since Python 3.7 they keep the order in which the pairs were added. Use the key to access the value.

apple = {'name': 'apple', 'color': ['red', 'green'], 'recipes': ['pie', 'salad', 'sauce']}
orange = {'name': 'orange', 'color': 'orange', 'recipes': ['juice', 'marmalade', 'gratin']}

fruits = {'fruits': [apple, orange]}

fruits
{'fruits': [{'name': 'apple',
   'color': ['red', 'green'],
   'recipes': ['pie', 'salad', 'sauce']},
  {'name': 'orange',
   'color': 'orange',
   'recipes': ['juice', 'marmalade', 'gratin']}]}
fruits['fruits'][1]['recipes'][0]
'juice'
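A few other common dictionary operations are sketched below, reusing the apple dictionary from above. The 'price' and 'season' keys are hypothetical additions for illustration; .keys(), .get(), and .items() are built-in dictionary methods.

```python
apple = {'name': 'apple', 'color': ['red', 'green'], 'recipes': ['pie', 'salad', 'sauce']}

# list the keys that are available
print(list(apple.keys()))

# .get() returns a fallback value instead of an error when a key is missing
price = apple.get('price', 'unknown')
print(price)

# assign to a new key to add a pair
apple['season'] = 'fall'

# loop over every key and its value
for key, value in apple.items():
    print(key, '->', value)
```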

Import text data as a character string

Use the open().read() convention to import the contents of a text file as a single string.

frank = open('data/frankenstein.txt').read()

# print only the first 1000 characters
print(frank[:1000])
The Project Gutenberg eBook of Frankenstein, by Mary Wollstonecraft (Godwin) Shelley

This eBook is for the use of anyone anywhere in the United States and
most other parts of the world at no cost and with almost no restrictions
whatsoever. You may copy it, give it away or re-use it under the terms
of the Project Gutenberg License included with this eBook or online at
www.gutenberg.org. If you are not located in the United States, you
will have to check the laws of the country where you are located before
using this eBook.

Title: Frankenstein
       or, The Modern Prometheus

Author: Mary Wollstonecraft (Godwin) Shelley

Release Date: 31, 1993 [eBook #84]
[Most recently updated: November 13, 2020]

Language: English

Character set encoding: UTF-8

Produced by: Judith Boss, Christy Phillips, Lynn Hanninen, and David Meltzer. HTML version by Al Haines.
Further corrections by Menno de Leeuw.

*** START OF THE PROJECT GUTENBERG EBOOK FRANKENSTEIN ***




Frankenstein;

or, the Modern Prom
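A common alternative to open().read() is a "with" block, which closes the file automatically. The sketch below first writes a small stand-in file so it can run anywhere; the file name sample.txt is hypothetical, and in practice you would point the path at your own file, such as data/frankenstein.txt.

```python
import os
import tempfile

# create a small stand-in text file so this example runs anywhere
# (hypothetical file name; replace the path with your own file)
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'sample.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('Frankenstein;\nor, the Modern Prometheus\n')

# "with" closes the file automatically once the indented block finishes
with open(path, encoding='utf-8') as f:
    sample = f.read()

# the whole file is now one string
print(type(sample))

# print only the first 13 characters
print(sample[:13])
```

Stating the encoding explicitly is optional, but it can help avoid garbled characters when a file was saved on a different operating system.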

Import data frames with the pandas library

Data frames are programming speak for tabular spreadsheets organized into rows and columns and often stored in .csv format.

# Step 1. link the pandas library to our current notebook
import pandas as pd
# Step 2. enter the file path in pandas's read_csv() function  
gap = pd.read_csv("data/gapminder-FiveYearData.csv")
# Step 3. view the data
print(gap)
          country  year         pop continent  lifeExp   gdpPercap
0     Afghanistan  1952   8425333.0      Asia   28.801  779.445314
1     Afghanistan  1957   9240934.0      Asia   30.332  820.853030
2     Afghanistan  1962  10267083.0      Asia   31.997  853.100710
3     Afghanistan  1967  11537966.0      Asia   34.020  836.197138
4     Afghanistan  1972  13079460.0      Asia   36.088  739.981106
...           ...   ...         ...       ...      ...         ...
1699     Zimbabwe  1987   9216418.0    Africa   62.351  706.157306
1700     Zimbabwe  1992  10704340.0    Africa   60.377  693.420786
1701     Zimbabwe  1997  11404948.0    Africa   46.809  792.449960
1702     Zimbabwe  2002  11926563.0    Africa   39.989  672.038623
1703     Zimbabwe  2007  12311143.0    Africa   43.487  469.709298

[1704 rows x 6 columns]
gap
          country  year         pop continent  lifeExp   gdpPercap
0     Afghanistan  1952   8425333.0      Asia   28.801  779.445314
1     Afghanistan  1957   9240934.0      Asia   30.332  820.853030
2     Afghanistan  1962  10267083.0      Asia   31.997  853.100710
3     Afghanistan  1967  11537966.0      Asia   34.020  836.197138
4     Afghanistan  1972  13079460.0      Asia   36.088  739.981106
...           ...   ...         ...       ...      ...         ...
1699     Zimbabwe  1987   9216418.0    Africa   62.351  706.157306
1700     Zimbabwe  1992  10704340.0    Africa   60.377  693.420786
1701     Zimbabwe  1997  11404948.0    Africa   46.809  792.449960
1702     Zimbabwe  2002  11926563.0    Africa   39.989  672.038623
1703     Zimbabwe  2007  12311143.0    Africa   43.487  469.709298

1704 rows × 6 columns

Challenge

Open JupyterLab. Try to import a:

  1. different .txt file

  2. different .csv file

If you encounter error messages, which ones?

Error messages

Learning Python can feel creative and deeply frustrating at the same time. Just remember that everyone encounters errors, and lots of them. When you do, start debugging by investigating the type of error message you receive.

Scroll to the end of the error message and read the last line to find the type of error.
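As a rough illustration of why the last line matters, the sketch below triggers an error on purpose, captures the full message, and pulls out its final line. It uses a ValueError (converting the word "four" to an integer) as the example; the variable names are made up for illustration.

```python
import traceback

try:
    # this fails because "four" is not written with digits
    int("four")
except Exception as error:
    # the full error message, as Python would normally print it
    message = traceback.format_exc()

    # the last line names the type of error and describes the problem
    last_line = message.strip().splitlines()[-1]
    print(last_line)

    # the type of error is also available by name
    error_type = type(error).__name__
    print(error_type)
```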

Challenge

  1. In JupyterLab, uncomment (remove the leading #) the line of code for each error message below

  2. Run each one

  3. Inspect the error messages

Syntax errors

Invalid syntax

You have entered something Python does not understand.

# x 89 5

Indentation

Your indentation does not conform to the rules

### indentation
# def example():
#     test = "this is an example function"
#     print(test)
#      return example

Runtime errors

Name

You try to use a variable you have not yet assigned. (If x is still defined from the earlier examples, try a name you have not used.)

# x

Or, you try to call a function that you have not yet defined or imported

# example()

Type

You write code with incompatible types

# "5" + 5

Index

You try to reference something that is out of range

my_list = ['green', True, 0.5, 4, ['cat', 'dog', 'pig']]
# my_list[5]

File errors

File not found

You try to import something that does not exist

# document = open('fakedocument.txt').read()
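One way to respond to a missing file, sketched here under the assumption that fakedocument.txt does not exist in your working directory, is to check the path first or to catch the error. Path comes from Python's built-in pathlib module.

```python
from pathlib import Path

path = Path('fakedocument.txt')

# a frequent cause is looking in the wrong folder, so check where Python is running
print(Path.cwd())

# check whether the file exists before opening it
print(path.exists())

# or catch the error and respond to it instead of stopping
try:
    document = open(path).read()
except FileNotFoundError:
    document = None
    print('Could not find the file:', path)
```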