Introduction

Web scraping is the process of harvesting, mining, or extracting data from a web page using software. The software makes requests to the website as a human visitor would, then parses (reads) the response according to rules specified by the developer.

The data extracted can be saved into a file, or put into a database.
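
To make the request, parse, and save steps concrete, here is a minimal sketch that uses only the Python standard library rather than Scrapy. It makes no real request: the SAMPLE_HTML string, the h2 elements with class "title", and the titles.json file name are all made up for illustration and stand in for a real response and a real extraction rule.

```python
import json
from html.parser import HTMLParser

# Hypothetical page content standing in for a real HTTP response body.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">First post</h2>
  <p>Some text that the rule ignores</p>
  <h2 class="title">Second post</h2>
</body></html>
"""


class TitleParser(HTMLParser):
    """Collects the text of every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching <h2>.
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def extract_titles(html):
    """Parse the HTML and return the titles matched by the rule."""
    parser = TitleParser()
    parser.feed(html)
    return parser.titles


def save_titles(titles, path):
    """Save the extracted data to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(titles, f)


if __name__ == "__main__":
    titles = extract_titles(SAMPLE_HTML)
    print(titles)
    save_titles(titles, "titles.json")  # hypothetical output file
```

A scraping library such as Scrapy takes care of the requests, the selection rules, and the export for you, so in practice you rarely write a parser like this by hand.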

There are several web scraping tools, but in this article we will explore the Scrapy library and its integration with Django.

Assumptions

I am assuming the following:

  • you know the basics of Python development,
  • you have, or can get, Python installed on your machine, and
  • you know how to install Python libraries using pip.

With that out of the way, let's get building.

Before we integrate Scrapy with Django, let us first get a feel for the Scrapy library on its own.