Introduction
Web scraping is the process of harvesting, mining, or extracting data from a web page using software. The software makes requests to the website as a human would, then parses (reads) the response based on rules specified by the developer.
The data extracted can be saved into a file, or put into a database.
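To make the request, parse, and save steps concrete, here is a minimal sketch that uses only the Python standard library. It is an illustration rather than how Scrapy works internally: the sample HTML, the rule "collect every h2 with class title", and the names TitleParser, extract_titles, and save_titles are all assumptions made up for this example. The request step is replaced by a hard-coded string so the sketch runs without network access.

```python
import json
from html.parser import HTMLParser

# Hypothetical page content. A real scraper would obtain this
# by making an HTTP request to the target website.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">First post</h2>
  <p>Some text</p>
  <h2 class="title">Second post</h2>
</body></html>
"""


class TitleParser(HTMLParser):
    """Collects the text of every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "h2" and ("class", "title") in attrs:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching <h2>
        if self.in_title and data.strip():
            self.titles.append(data.strip())


def extract_titles(html):
    """Parse step: apply the developer's rule to the response body."""
    parser = TitleParser()
    parser.feed(html)
    return parser.titles


def save_titles(titles, path):
    """Save step: write the extracted data to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(titles, f)


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

Calling save_titles(titles, "titles.json") would write the result to a file; a database insert could take its place in the same spot.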
There are several web scraping tools, but in this article we will explore the Scrapy library and its integration with Django.
Assumptions
I am assuming the following:
- you know the basics of Python development,
- you have, or can get, Python installed on your machine, and
- you know how to install Python libraries using pip.
With that out of the way, let's get building.
Before we get into integrating Scrapy with Django, let us first get a feel for the Scrapy library.