Recently I found the need to access a page external to my app and get the links of all the images for subsequent downloading.
I found some code readily enough, but it was for Python 2.7 – here is the code working in Python 3.8:
from urllib.request import urlopen
from bs4 import BeautifulSoup

def get_images_in_page(url):
    # Fetch the page and parse the HTML
    html_page = urlopen(url).read()
    soup = BeautifulSoup(html_page, features="html5lib")
    # Collect the src attribute of every <img> tag
    images = []
    for img in soup.find_all('img'):
        images.append(img.get('src'))
    return images

# Let's get some images!
images = get_images_in_page("http://example.com")
print(images)
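Since the whole point of grabbing the links was to download the images afterwards, here is a minimal sketch of that next step. It is not part of the original snippet: the downloads directory name and the download_images function are just examples, and it assumes the src values resolve to fetchable URLs (urljoin takes care of relative ones).

import os
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve

def download_images(page_url, image_links, dest_dir="downloads"):
    # dest_dir is just an example location – change as needed
    os.makedirs(dest_dir, exist_ok=True)
    for link in image_links:
        # Some <img> tags have no src at all – skip those
        if not link:
            continue
        # Resolve relative src values against the page URL
        absolute = urljoin(page_url, link)
        # Use the last part of the URL path as the filename
        filename = os.path.basename(urlparse(absolute).path) or "image"
        urlretrieve(absolute, os.path.join(dest_dir, filename))

download_images("http://example.com", images)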
Obviously, use this with caution. It allows you to scrape a webpage – something that website owners usually frown upon.
In this case I was scraping my own page on a different server – so I had already given permission to myself 🙂