Serialization and File I/O

Introduction

File Paths

File and folder "containment" is represented by filesystem "paths":

  • Absolute: starts at the root of the filesystem

/Users/edsu/inst326/slides.pdf

  • Relative: starts from the current working directory

inst326/slides.pdf
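
To make the distinction concrete, here is a small sketch using the standard library's pathlib module. It uses PurePosixPath so it behaves the same on any operating system, and the base directory /Users/edsu is an assumption chosen so the two example paths line up.

```python
from pathlib import PurePosixPath

# The two example paths from the slide; the files do not need to exist.
absolute = PurePosixPath('/Users/edsu/inst326/slides.pdf')
relative = PurePosixPath('inst326/slides.pdf')

print(absolute.is_absolute())
print(relative.is_absolute())

# A relative path only identifies a file once it is joined to a starting
# directory. Here we assume that directory is /Users/edsu.
base = PurePosixPath('/Users/edsu')
joined = base / relative
print(joined)
```

When you pass a relative path to open, Python joins it to the current working directory in much the same way.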

File Formats

  • In addition to a path, every file has a particular "format"

  • The format determines how the data is written to the file

  • The simplest is plain text

  • We will also look at CSV and JSON

Plain Text

What are Text Files?

speech.txt is an example of a text file. Here are a few things to notice about text files:

  • text files often have a .txt file extension

  • text files have lines separated by newline characters

  • text files have an encoding, usually UTF-8 (a way of storing Unicode characters as bytes)
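
The encoding point can be sketched as follows. The file name greeting.txt and its contents are made up for illustration, and the file is written to a temporary directory so the example is self-contained.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, so no real files are touched.
path = os.path.join(tempfile.mkdtemp(), 'greeting.txt')

# Naming the encoding explicitly avoids depending on the platform default.
with open(path, 'w', encoding='utf-8') as fh:
    fh.write('café\n')
    fh.write('naïve\n')

with open(path, encoding='utf-8') as fh:
    text = fh.read()

# The lines are separated by newline characters.
lines = text.splitlines()
print(lines)

# Opening in binary mode shows the raw bytes. Accented letters take more
# than one byte in UTF-8, so there are more bytes than characters.
with open(path, 'rb') as fh:
    raw = fh.read()
print(len(text), len(raw))
```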

Read a Text File

Use the open function to open a file using the file’s path as a parameter. Use the file object’s read method to read the contents of the file into a variable.

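fh = open('speech.txt')  # opens for reading by default
text = fh.read()         # the whole file as one string
fh.close()               # release the file when done
print(text)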

Writing a Text File

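You can also use the open function to open a file for writing by passing 'w' as the second argument. This mode creates the file if it does not exist and overwrites it if it does. Call the file object's write method to add text, and close the file when you are finished so that everything is saved.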

fh = open('sonnet.txt', 'w')

fh.write('So long as men can breathe, or eyes can see,\n')
fh.write('So long lives this, and this gives life to thee.\n')

fh.close()
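
A common alternative to calling close yourself is a with block, which closes the file automatically when the block ends. The sketch below writes the same two lines to a temporary directory (an assumption made so the example does not touch real files) and reads them back.

```python
import os
import tempfile

# Temporary location; the file name mirrors the slide but is only illustrative.
path = os.path.join(tempfile.mkdtemp(), 'sonnet.txt')

# The file is closed as soon as the with block ends, even if an error occurs.
with open(path, 'w') as fh:
    fh.write('So long as men can breathe, or eyes can see,\n')
    fh.write('So long lives this, and this gives life to thee.\n')

print(fh.closed)

# Read the file back to confirm both lines were written.
with open(path) as fh:
    lines = fh.readlines()
print(len(lines))
```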

Iterating

You can use a for loop to iterate through the lines in a file object.

Why might it be important to be able to read a file line by line instead of all at once?

fh = open('speech.txt')
for line in fh:
    print(line, end='')  # each line already ends with a newline
fh.close()
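
One possible answer to the question above: reading line by line keeps only one line in memory at a time, so it works even for files too large to load at once. The sketch below counts lines and words that way. The file contents are a short stand-in written to a temporary directory, not the real speech.txt.

```python
import os
import tempfile

# Stand-in for speech.txt so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'speech.txt')
with open(path, 'w') as fh:
    fh.write('Four score and seven years ago\n')
    fh.write('our fathers brought forth\n')
    fh.write('a new nation\n')

line_count = 0
word_count = 0

# Only the current line is held in memory during each pass of the loop.
with open(path) as fh:
    for line in fh:
        line_count += 1
        word_count += len(line.split())

print(line_count, word_count)
```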

CSV

Read a CSV File

While it would be possible to read a CSV file as a text file and split each line yourself, Python’s csv module handles the commas and quoting for you and gives you each row as a list of strings.

import csv

fh = open('energy.csv', newline='')  # newline='' is recommended by the csv docs
spreadsheet = csv.reader(fh)

for row in spreadsheet:
    print(row)  # each row is a list of strings

fh.close()

Read a CSV File with DictReader

The csv.DictReader class uses the column headers in your CSV file to create a dictionary for each row.

import csv

fh = open('energy.csv', newline='')
spreadsheet = csv.DictReader(fh)

for row in spreadsheet:
    # the keys come from the header row of the file
    print(row['State'], row['Solar'])

fh.close()
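
Here is a self-contained sketch of the same idea. The two data rows are made up for illustration and are not taken from energy.csv; io.StringIO stands in for an open file. Note that values come back as strings, so they need converting before doing arithmetic.

```python
import csv
import io

# Made-up stand-in for energy.csv; the real file's values will differ.
sample = io.StringIO('State,Solar\nMaryland,10\nVirginia,20\n')

reader = csv.DictReader(sample)

# The first row of the file supplies the dictionary keys.
print(reader.fieldnames)

total = 0
for row in reader:
    # Every value is read as a string, so convert before adding.
    total += int(row['Solar'])

print(total)
```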

Write a CSV File

You can also use csv.writer to write a CSV file row by row. Each call to writerow takes a list of values and writes it as one line of the file.

import csv

fh = open('salaries.csv', 'w', newline='')
spreadsheet = csv.writer(fh)

spreadsheet.writerow(['Name', 'Age', 'Department'])
spreadsheet.writerow(['Val', 19, 'Physics'])
spreadsheet.writerow(['Rick', 22, 'English'])
spreadsheet.writerow(['Hope', 20, 'Information Studies'])

# the writer object has no close method; close the file itself
fh.close()
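
As a quick check, the sketch below writes two of those rows and reads them back. It uses a temporary directory (an assumption made to keep the example self-contained). One thing it shows is that CSV does not remember types: the number 19 comes back as the string '19'.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'salaries.csv')

# Write a header row and one data row.
with open(path, 'w', newline='') as fh:
    writer = csv.writer(fh)
    writer.writerow(['Name', 'Age', 'Department'])
    writer.writerow(['Val', 19, 'Physics'])

# Read everything back as a list of rows.
with open(path, newline='') as fh:
    rows = list(csv.reader(fh))

print(rows)

# The age was written as an int but is read back as a string.
age = int(rows[1][1])
print(age + 1)
```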

Limitations of CSV

  • Unfortunately, not all data fits neatly into tables. What makes this example hard to represent as a table?

people = [
  {
    "name": "Val",
    "interests": ["astronomy", "hockey"]
  },
  {
    "name": "Rick",
    "interests": ["karaoke"]
  }
]

  • Each person can have any number of interests, so there is no fixed number of columns that fits every row.
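
One workaround, sketched below, is to squeeze the list into a single cell with a separator. The choice of ';' is an assumption made for this example, not a CSV standard, and whoever reads the file has to know to split on it again. That extra, unwritten convention is the limitation.

```python
import csv
import io

people = [
    {"name": "Val", "interests": ["astronomy", "hockey"]},
    {"name": "Rick", "interests": ["karaoke"]},
]

# io.StringIO stands in for a file so the example is self-contained.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['name', 'interests'])
for person in people:
    # Join the list into one cell using ';' (an arbitrary choice).
    writer.writerow([person['name'], ';'.join(person['interests'])])

print(buffer.getvalue())

# Reading it back requires knowing about the ';' convention.
buffer.seek(0)
restored = []
for row in csv.DictReader(buffer):
    restored.append({"name": row['name'],
                     "interests": row['interests'].split(';')})

print(restored == people)
```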

JSON

Reading a JSON File

Python comes with a json module which makes it easy to read JSON using the json.load function. We’ll use it to load this JSON file of tweet data: aoc.json.

import json
fh = open('aoc.json')
tweets = json.load(fh)  # the JSON data becomes Python lists and dictionaries
fh.close()
for tweet in tweets:
    print(tweet['hashtags'])
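
To see what json.load hands back, here is a sketch using json.loads, which parses a string instead of a file. The record below is hypothetical: only the hashtags key comes from the example above, and the other keys and all values are invented to show how JSON types map to Python types.

```python
import json

# Hypothetical JSON text; not the actual contents of aoc.json.
text = '[{"hashtags": ["example", "demo"], "retweeted": false, "place": null}]'

tweets = json.loads(text)

# A JSON array becomes a list and a JSON object becomes a dict.
print(type(tweets), type(tweets[0]))

# JSON false becomes False and JSON null becomes None.
print(tweets[0]['retweeted'], tweets[0]['place'])

for tweet in tweets:
    print(tweet['hashtags'])
```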

Write a JSON File

You can also use the json.dump function to save a data structure to a file.

import json

people = [
  {"name": "Val", "interests": ["astronomy", "hockey"]},
  {"name": "Rick", "interests": ["karaoke"]}
]

fh = open('data.json', 'w')

json.dump(people, fh)

# close the file so the data is fully written to disk
fh.close()
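
Unlike the CSV case, the nested lists survive a round trip through JSON unchanged. The sketch below saves the same data to a temporary directory (an assumption made to keep it self-contained) and loads it back. The indent argument is optional and only makes the file easier for people to read.

```python
import json
import os
import tempfile

people = [
    {"name": "Val", "interests": ["astronomy", "hockey"]},
    {"name": "Rick", "interests": ["karaoke"]},
]

path = os.path.join(tempfile.mkdtemp(), 'data.json')

# Save the data structure; the with block closes the file for us.
with open(path, 'w') as fh:
    json.dump(people, fh, indent=2)

# Load it back and compare with the original.
with open(path) as fh:
    loaded = json.load(fh)

print(loaded == people)
print(loaded[0]['interests'])
```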

Conclusion

Summary

We covered a lot of territory learning about input and output operations:

  • Files and Paths

  • Reading and writing text files

  • Reading and writing CSV files

  • Reading and writing JSON files