Product Cataloging and Intelligence

Product comparison sites scrape data from various e-commerce websites and give users a single page with pricing details from all of them. Our goal is similar: build a product database, extract vendor pricing information, and run analytics over the collected data.

Main system

The main system consists of the following modules:

  1. Master catalogue development (Scrape)
  2. Product dump extraction
  3. Vendor dump extraction
  4. Price dump extraction
  5. Product pricing sensitivity analysis
  6. Product master creation
  7. Intelligence to avoid screen scraping blockages
  8. HTML GUI query interface along with the required backend.
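
As a rough illustration of how the extraction and pricing modules above might fit together, the sketch below pulls a product name and price out of each vendor's page and then compares prices per product. It is not the project's actual code: the vendor names, the sample HTML, the `product-name` and `price` class names, and the functions `extract_record` and `price_summary` are all hypothetical. It uses Python's standard `html.parser` only to stay dependency-free; the project tags mention Beautiful Soup and MongoDB, but how they are used is not shown here.

```python
from collections import defaultdict
from html.parser import HTMLParser


class ProductPageParser(HTMLParser):
    """Collects the text of elements tagged with assumed CSS classes."""

    # Hypothetical class names; real vendor pages will differ.
    FIELDS = {"product-name": "name", "price": "price"}

    def __init__(self):
        super().__init__()
        self.record = {}
        self._field = None

    def handle_starttag(self, tag, attrs):
        # Remember which field (if any) the upcoming text belongs to.
        self._field = self.FIELDS.get(dict(attrs).get("class"))

    def handle_data(self, data):
        if self._field and data.strip():
            self.record[self._field] = data.strip()
            self._field = None


def extract_record(vendor, html):
    """Build one price-dump record (a plain dict) from a vendor page."""
    parser = ProductPageParser()
    parser.feed(html)
    price = float(parser.record["price"].replace(",", "").lstrip("$"))
    return {"vendor": vendor, "product": parser.record["name"], "price": price}


def price_summary(records):
    """Group records by product and report the cheapest vendor and price spread."""
    by_product = defaultdict(list)
    for rec in records:
        by_product[rec["product"]].append(rec)
    summary = {}
    for product, offers in by_product.items():
        cheapest = min(offers, key=lambda rec: rec["price"])
        highest = max(rec["price"] for rec in offers)
        summary[product] = {
            "cheapest_vendor": cheapest["vendor"],
            "min_price": cheapest["price"],
            "spread": highest - cheapest["price"],
        }
    return summary


# Made-up sample pages standing in for scraped vendor HTML.
PAGES = {
    "vendor_a": '<div><h1 class="product-name">Acme Phone X</h1>'
                '<span class="price">$499.00</span></div>',
    "vendor_b": '<div><h1 class="product-name">Acme Phone X</h1>'
                '<span class="price">$474.50</span></div>',
}

records = [extract_record(vendor, html) for vendor, html in PAGES.items()]
summary = price_summary(records)
print(summary)
```

Each record is a plain dictionary, a shape that a document store could accept directly. A real pipeline would also need to match differently named listings of the same product across vendors, which this sketch sidesteps by assuming identical names.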

A more detailed model of our system is shown in the flowchart below:

Applications and Advantages

Tools Used

Future Improvements

Authors and Contributors

This tool was made by Karan Mangla (@manglakaran), Shailendra Joshi (@pshall), and Karan Aggarwal of the International Institute of Information Technology, Hyderabad, as part of the Information Retrieval and Extraction course project.

Support or Contact

Having trouble with Pages? Check out our video or our presentation, or raise an issue, and we’ll help you sort it out.


Information Retrieval and Extraction, IIIT-H, Major Project, Analytics, Crawlers, Beautiful Soup, MongoDB