Scrapy masterclass: Python web scraping and data pipelines
Published 11/2022
Created by Ahmed Elfakharany
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English | Duration: 40 Lectures ( 5h 44m ) | Size: 2.75 GB
Work on 7 real-world web-scraping projects using Scrapy, Splash, and Selenium. Build data pipelines locally and on AWS
What you'll learn
Extract data from the most difficult web sites using Scrapy
Build ETL pipelines and store data in CSV, JSON, MySQL, MongoDB, and S3
Avoid getting banned and evade bot-protection techniques
Use Splash for scraping JavaScript-powered websites
Harness the power of Selenium browser automation to scrape any website
Deploy your Scrapy bots in local and AWS environments
Requirements
Some Python background
All projects are run on Python 3.10 so it needs to be installed
Familiarity with Linux is recommended but not strictly required
Familiarity with the HTTP protocol and HTML
Description
This is the era of data! Everyone is telling you what to do with the data that you already have. But how can you "have" this data?
Most of the Data Engineering / Data Science discussions today focus on how to analyze and process datasets to draw some useful information out of them. However, they all assume that those datasets are already available to you. That they've been collected somehow. They spend little time showing how you can obtain this dataset firsthand! This course fills this gap.
Scrapy for building powerful web scraping pipelines is all about walking you through the process of extracting data of interest from websites. True, there are a lot of datasets already available for you to consume either for free or at some cost. However, what if those datasets are outdated? What if they don't address your specific needs? You'd better know how to build your own dataset from...