How to get all the image links on an external page using Python

Recently I found the need to access a page external to my app and get the links of all the images for subsequent downloading.

I found some code readily enough, but it was for Python 2.7 – here is the code working in Python 3.8:

from urllib.request import urlopen
from bs4 import BeautifulSoup

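def get_images_in_page(url):
    # Download the raw HTML of the page
    html_page = urlopen(url).read()
    # The html5lib parser must be installed separately (pip install html5lib)
    soup = BeautifulSoup(html_page, features="html5lib")
    images = []

    # Collect the src attribute of every <img> tag (None if the tag has no src)
    for img in soup.find_all('img'):
        images.append(img.get('src'))

    return images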

# Let's get some images!
images = get_images_in_page("http://example.com")
print(images)
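
The src values come back exactly as they are written in the page's HTML, so some may be relative paths and some may be None. Here is a minimal sketch of turning them into absolute URLs ready for downloading. The helper name to_absolute_urls and the sample paths are my own illustration, not part of the code above:

```python
from urllib.parse import urljoin

def to_absolute_urls(page_url, srcs):
    """Resolve each src against the page URL, skipping missing or empty values."""
    return [urljoin(page_url, src) for src in srcs if src]

# Hypothetical src values, as they might come back from get_images_in_page()
print(to_absolute_urls("http://example.com/blog/", ["/img/a.png", "b.jpg", None]))
```

Pass the same URL you gave to get_images_in_page as page_url, so relative paths resolve against the right page.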

Obviously use this with caution. It allows you to scrape a webpage – something that website owners usually frown upon.

In this case I was scraping my own page on a different server – so I had already given permission to myself 🙂
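
If the page is not your own, one way to check what the site owner allows is its robots.txt file. The sketch below uses the standard library's urllib.robotparser. The helper name is_allowed and the sample rules are assumptions for illustration only:

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Made-up rules for illustration; a real site's robots.txt will differ
sample_rules = "User-agent: *\nDisallow: /private/"
print(is_allowed(sample_rules, "http://example.com/private/photo.html"))
print(is_allowed(sample_rules, "http://example.com/gallery.html"))
```

To check a live site instead of a string, RobotFileParser can fetch the file itself with set_url() followed by read(). Keep in mind that robots.txt only expresses the owner's wishes for crawlers and is no substitute for actual permission.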
