This repository has been deprecated, but is being kept online to preserve course links.
For the latest content please see the repository at:
https://umd-ischool-inst326.github.io/inst326/
Overview
This module introduces the core concepts of web scraping, i.e., extracting data from unstructured or semi-structured sources online. We learn to use the BeautifulSoup module and work through a number of examples.
Lecture Videos
Exercise
Learning Outcomes
After completing this module, students should be able to:
- Analyze HTML pages to identify repeating structural elements that support web scraping
- Use the BeautifulSoup module
- Extract data from web pages with web scraping techniques
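
As a rough illustration of these outcomes, the sketch below parses a small HTML snippet with BeautifulSoup and pulls one record out of each repeating element. The HTML, the class names (`item`, `name`, `price`), and the helper `extract_items` are all hypothetical and are not taken from the course materials; a real page would need its own inspection to find the repeating structure.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page. Each <li class="item">
# is the repeating structural element we want to scrape.
HTML = """
<html><body>
  <ul id="catalog">
    <li class="item"><span class="name">Notebook</span> <span class="price">3.50</span></li>
    <li class="item"><span class="name">Pencil</span> <span class="price">0.75</span></li>
    <li class="item"><span class="name">Backpack</span> <span class="price">24.00</span></li>
  </ul>
</body></html>
"""


def extract_items(html):
    """Return one dict per <li class="item"> element found in the HTML."""
    # "html.parser" is Python's built-in parser, so no extra parser package is needed.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for li in soup.find_all("li", class_="item"):
        name = li.find("span", class_="name").get_text(strip=True)
        price = float(li.find("span", class_="price").get_text(strip=True))
        rows.append({"name": name, "price": price})
    return rows


items = extract_items(HTML)
for row in items:
    print(row["name"], row["price"])
```

The same pattern (find the repeating element, then read the fields inside it) should carry over to live pages, assuming the page's HTML is first fetched as a string by whatever means the exercise specifies.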