/Users/edsu/inst326/slides.pdf
Image Source: Smithsonian American History Museum
File and folder "containment" is represented by filesystem "paths":
Absolute
/Users/edsu/inst326/slides.pdf
Relative
inst326/slides.pdf
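As a minimal sketch, the two paths above can be compared with Python's pathlib module. The working directory /Users/edsu is an assumption made for illustration; a relative path only has meaning once you know which directory it is relative to.

```python
from pathlib import PurePosixPath

# The two paths shown above.
absolute = PurePosixPath('/Users/edsu/inst326/slides.pdf')
relative = PurePosixPath('inst326/slides.pdf')

print(absolute.is_absolute())   # True: starts at the filesystem root
print(relative.is_absolute())   # False: depends on the current directory

# Assumption: the current working directory is /Users/edsu.
working_dir = PurePosixPath('/Users/edsu')
resolved = working_dir / relative
print(resolved)                 # /Users/edsu/inst326/slides.pdf
print(resolved == absolute)     # True
```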
In addition to a path, every file has a particular "format"
The format determines how the data is written to the file
The simplest is plain text
We will also look at CSV and JSON
speech.txt is an example of a text file. Here are a few things to notice about text files:
text files often have a .txt file extension
text files have lines separated by newline characters
text files have an encoding, usually UTF-8 (an encoding of Unicode)
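The points above can be seen in a short sketch. The sample string is made up for illustration and is not the contents of speech.txt.

```python
# Hypothetical sample text; \u00e9 is the character "é".
text = 'Caf\u00e9\nmenu\n'

# repr shows the newline characters that separate the lines.
print(repr(text))

# Splitting on newlines gives the individual lines.
lines = text.splitlines()
print(lines)                    # ['Café', 'menu']

# Encoding turns characters into bytes; in UTF-8 the "é" takes two bytes.
data = text.encode('utf-8')
print(len(text), len(data))     # 10 11
```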
Use the open function to open a file, passing the file’s path as an argument. Use the file object’s read method to read the entire contents of the file into a variable.
fh = open('speech.txt')
text = fh.read()
fh.close()  # close the file once you are done reading
print(text)
You can also use the open function to open a file for writing by passing 'w' as the second argument to open. This allows you to write data to the file. Opening a file with 'w' creates it if it does not exist and replaces its contents if it does.
fh = open('sonnet.txt', 'w')
fh.write('So long as men can breathe, or eyes can see,\n')
fh.write('So long lives this, and this gives life to thee.\n')
fh.close()
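A common alternative to calling close yourself is the with statement, which closes the file when the block ends. This is a sketch, not part of the examples above; the temporary folder is an assumption so that the code runs anywhere without touching your own files.

```python
import os
import tempfile

# Assumption: a temporary folder stands in for your working directory.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'sonnet.txt')

# The file is closed automatically when the with block ends.
with open(path, 'w', encoding='utf-8') as fh:
    fh.write('So long as men can breathe, or eyes can see,\n')
    fh.write('So long lives this, and this gives life to thee.\n')

# Read the same file back.
with open(path, encoding='utf-8') as fh:
    text = fh.read()

print(fh.closed)   # True
print(text)
```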
You can use a for loop to iterate through the lines in a file object.
Why might it be important to be able to read a file line by line instead of all at once?
for line in open('speech.txt'):
    print(line)
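One possible answer to the question above: reading line by line keeps only one line in memory at a time, which matters for very large files. The sketch below uses placeholder contents written to a temporary folder (an assumption, not the real speech.txt) and finds the longest line without ever holding the whole file.

```python
import os
import tempfile

# Assumption: placeholder contents in a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'speech.txt')
with open(path, 'w', encoding='utf-8') as fh:
    fh.write('first line\nsecond line\nthird line\n')

line_count = 0
longest = ''
with open(path, encoding='utf-8') as fh:
    for line in fh:
        line = line.rstrip('\n')    # each line still ends with its newline
        line_count += 1
        if len(line) > len(longest):
            longest = line

print(line_count)   # 3
print(longest)      # second line
```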
While it would be possible to read a CSV file as a text file and split each line yourself, Python’s csv module handles the details for you, such as quoted values that contain commas.
import csv
fh = open('energy.csv')
spreadsheet = csv.reader(fh)
for row in spreadsheet:
    print(row)
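Each row from csv.reader is a list of strings, and the first row is usually the column headers. A small sketch with made-up data (the numbers are hypothetical and do not come from energy.csv) shows one way to separate the header from the data rows.

```python
import csv
import io

# Hypothetical data standing in for a CSV file.
sample = 'State,Solar\nMaryland,10\nVirginia,20\n'
fh = io.StringIO(sample)

spreadsheet = csv.reader(fh)
header = next(spreadsheet)   # take the first row off the top
rows = list(spreadsheet)     # the remaining rows

print(header)   # ['State', 'Solar']
print(rows)     # [['Maryland', '10'], ['Virginia', '20']]
```

Notice that the numbers come back as strings.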
The csv.DictReader class uses the column headers in your CSV file to create a dictionary for each row.
import csv
fh = open('energy.csv')
spreadsheet = csv.DictReader(fh)
for row in spreadsheet:
    print(row['State'], row['Solar'])
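Values in each dictionary are strings, so they need converting before you do arithmetic. The sketch below uses the State and Solar column names from the example above, but the values are invented for illustration.

```python
import csv
import io

# Hypothetical data; only the column names come from the example above.
sample = 'State,Solar\nMaryland,10.5\nVirginia,20.0\n'

states = []
total = 0.0
for row in csv.DictReader(io.StringIO(sample)):
    states.append(row['State'])
    total += float(row['Solar'])   # convert the string to a number

print(states)   # ['Maryland', 'Virginia']
print(total)    # 30.5
```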
You can also use the csv.writer class to write a CSV file row by row.
import csv
fh = open('salaries.csv', 'w', newline='')  # newline='' lets the csv module control line endings
spreadsheet = csv.writer(fh)
spreadsheet.writerow(['Name', 'Age', 'Department'])
spreadsheet.writerow(['Val', 19, 'Physics'])
spreadsheet.writerow(['Rick', 22, 'English'])
spreadsheet.writerow(['Hope', 20, 'Information Studies'])
fh.close()  # close the file object; the csv writer itself has no close method
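One reason to use csv.writer instead of joining values with commas yourself is quoting. In this sketch the department value contains a comma (a made-up value for illustration), and the output is written to an in-memory io.StringIO instead of a real file.

```python
import csv
import io

fh = io.StringIO(newline='')
spreadsheet = csv.writer(fh)
spreadsheet.writerow(['Name', 'Age', 'Department'])
# Hypothetical value containing a comma.
spreadsheet.writerow(['Hope', 20, 'Information Studies, iSchool'])

output = fh.getvalue()
print(output)

# Reading it back gives the original columns; the number is now a string.
rows = list(csv.reader(io.StringIO(output, newline='')))
print(rows[1])   # ['Hope', '20', 'Information Studies, iSchool']
```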
Unfortunately, not all data fits neatly into tables. What makes this example hard to represent as a table?
people = [
    {
        "name": "Val",
        "interests": ["astronomy", "hockey"]
    },
    {
        "name": "Rick",
        "interests": ["karaoke"]
    }
]
Each person can have one or many interests, so the interests do not fit into a fixed number of columns.
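To see the difficulty, here is a sketch of two workarounds for squeezing this data into rows. Both are illustrations and neither is a recommended format: one repeats the name for each interest, the other packs the interests into a single cell using a separator (the semicolon is an arbitrary choice).

```python
people = [
    {"name": "Val", "interests": ["astronomy", "hockey"]},
    {"name": "Rick", "interests": ["karaoke"]},
]

# Option 1: one row per (name, interest) pair, so names repeat.
rows = [(p["name"], interest) for p in people for interest in p["interests"]]
print(rows)     # [('Val', 'astronomy'), ('Val', 'hockey'), ('Rick', 'karaoke')]

# Option 2: one row per person, with interests packed into one cell.
joined = [(p["name"], ";".join(p["interests"])) for p in people]
print(joined)   # [('Val', 'astronomy;hockey'), ('Rick', 'karaoke')]
```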
Python comes with a json module which makes it easy to read JSON using the json.load function. We’ll use it to load this JSON file of tweet data: aoc.json.
import json
fh = open('aoc.json')
tweets = json.load(fh)
for tweet in tweets:
    print(tweet['hashtags'])
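Once loaded, JSON data is ordinary Python lists and dictionaries. The sketch below counts hashtags using invented data. The structure of aoc.json is not shown here, so the assumption is that each tweet's hashtags value is a list of strings; the text key is also hypothetical.

```python
import io
import json

# Hypothetical JSON; only the 'hashtags' key comes from the example above.
sample = '''[
  {"text": "first tweet", "hashtags": ["energy", "climate"]},
  {"text": "second tweet", "hashtags": ["climate"]},
  {"text": "third tweet", "hashtags": []}
]'''

tweets = json.load(io.StringIO(sample))

counts = {}
for tweet in tweets:
    for tag in tweet['hashtags']:
        counts[tag] = counts.get(tag, 0) + 1

print(counts)   # {'energy': 1, 'climate': 2}
```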
You can also use the json.dump function to save a data structure to a file.
import json
people = [
    {"name": "Val", "interests": ["astronomy", "hockey"]},
    {"name": "Rick", "interests": ["karaoke"]}
]
fh = open('data.json', 'w')
json.dump(people, fh)
fh.close()  # closing the file makes sure the data is written to disk
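A quick way to check that saving worked is to load the file back and compare. This sketch writes to a temporary folder (an assumption, so it does not touch your own files) and passes indent=2, an optional argument that makes the saved file easier for people to read.

```python
import json
import os
import tempfile

people = [
    {"name": "Val", "interests": ["astronomy", "hockey"]},
    {"name": "Rick", "interests": ["karaoke"]},
]

# Assumption: a temporary folder stands in for your working directory.
path = os.path.join(tempfile.mkdtemp(), 'data.json')

with open(path, 'w', encoding='utf-8') as fh:
    json.dump(people, fh, indent=2)

with open(path, encoding='utf-8') as fh:
    loaded = json.load(fh)

print(loaded == people)   # True
```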
We covered a lot of territory learning about input and output operations:
Files and Paths
read & write Text files
read & write CSV files
read & write JSON files