Use Python to get alerted when an Amazon wishlist item drops in price


Scratching my own itch. I want an alert when there's been a price drop on an item on my Amazon wishlist. I couldn't find an easy way to get an email directly from Amazon (customer-focused my shiny metal arse) so I knocked something up in Python. This is heavily inspired by Leigh Dodds' Wishlist Monitor.

Amazon don't offer an API for wishlists (innovative my shiny metal arse). So this uses Beautiful Soup to grab the data from the HTML. To be fair, there's also some microdata on the page, which makes things slightly easier.

The code is available on github.com/edent/Amazon-Wishlist-Pricedrop-Alert - but here's a brief walkthrough of parsing and dealing with pagination:

First up, the standard boilerplate for getting and parsing HTML, plus the imports the later snippets rely on:

from bs4 import BeautifulSoup
from os.path import exists
import requests
import re
import json
import pandas as pd

I start by passing the function below my wishlist URL, https://www.amazon.co.uk/gp/registry/wishlist/1A1NYHTAZ3N6V/. Amazon only returns 10 items per page of HTML, but it includes a "pagination" URL at the end of each page. Later on, we'll fetch each subsequent page by calling this function again with the "next page" URL.

def get_wishlist(url):
    #   Load the wishlist
    response = requests.get(url)
    page_html = response.text
    #   Parse the page
    soup = BeautifulSoup(page_html, 'html.parser')
    return soup

The item name - for example "Fly Fishing by JR Hartley" - needs to be scraped out of the HTML. It looks something like this:

<a id="itemName_IS7IZUKCMTBZG"
  class="a-link-normal"
  title="Fly Fishing by JR Hartley"
  href="...">
   Fly Fishing by JR Hartley
</a>

Use Beautiful Soup to grab the string inside the anchor. Use a regex to match any id starting with itemName_.

def get_items(soup):
    #   Get the item names
    for match in soup.find_all('a', id=re.compile("^itemName_")):
        item = match.string.strip()
        item_list.append(item)

Prices and product IDs are slightly easier to scrape. They have microdata and JSON. For example:

<li data-id="1A1NYHTAZ3N6V"
   data-itemId="I3VQB497DB6HJF"
   data-price="8.07"
   data-reposition-action-params='{"hasComparisonTable":false,"itemExternalId":"ASIN:B0962VD2G8|A1F83G8C2ARO7P","listType":"WishList","sid":"260-3142506-2992056"}'
   class="a-spacing-none g-item-sortable">...

The JSON in data-reposition-action-params parses out to:

{
  "hasComparisonTable": false,
  "itemExternalId": "ASIN:B0962VD2G8|A1F83G8C2ARO7P",
  "listType": "WishList",
  "sid": "260-3142506-2992056"
}

Beautiful Soup can grab the data-price and then the Amazon ID:

def get_prices_and_ids(soup):
    #   Get the price and ID from data attributes
    for match in soup.find_all("li", class_="g-item-sortable"):
        price = match.attrs["data-price"]
        price_list.append(price)
        json_data = json.loads(match.attrs["data-reposition-action-params"])
        # Will be something like "ASIN:B095PV5G87|A1F83G8C2ARO7P"
        id = json_data["itemExternalId"].split(":")[1].split("|")[0]
        id_list.append(id)
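
The string-splitting step is easy to get subtly wrong, so here is a minimal sketch of it as a standalone helper that can be checked without hitting Amazon. `extract_asin` is a hypothetical name, not part of the original script, and it assumes the field always has the "ASIN:<id>|<something>" shape shown above.

```python
import json

def extract_asin(params_json):
    """Return the ASIN from a data-reposition-action-params JSON string.

    Hypothetical helper: assumes itemExternalId looks like "ASIN:<id>|<other>".
    Returns None if the field is missing or has a different prefix.
    """
    data = json.loads(params_json)
    external_id = data.get("itemExternalId", "")
    prefix, _, rest = external_id.partition(":")
    if prefix != "ASIN" or not rest:
        return None
    # Everything before the first "|" is the product ID
    return rest.split("|")[0]

example = '{"hasComparisonTable":false,"itemExternalId":"ASIN:B0962VD2G8|A1F83G8C2ARO7P","listType":"WishList","sid":"260-3142506-2992056"}'
print(extract_asin(example))  # B0962VD2G8
```

Returning None rather than raising means one oddly shaped item doesn't stop the whole run, though the caller then has to skip that item.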

That will get us all the names, IDs, and prices on the first page. What about subsequent pages?

At the bottom of the page is this HTML:

<input
  type="hidden"
  name="showMoreUrl"
  value="/hz/wishlist/slv/items?filter=unpurchased&paginationToken=eyJGcm9tVV...MDl9&itemsLayout=LIST&sort=default&type=wishlist&lid=1A1NYHTAZ3N6V"
  class="showMoreUrl"/>

Retrieving the URL gives another HTML page with the next set of wishlist items on it. The last page of results has <div id="endOfListMarker"> which we can use to detect when to stop getting pages.

def get_paginator(soup):
    paginator = None
    ##  Find the paginator
    if soup.find("div", {"id": "endOfListMarker"}) is None:
        #   If the end tag doesn't exist, continue
        for match in soup.find_all('input', class_="showMoreUrl"):
            paginator = "https://www.amazon.co.uk" + match.attrs["value"]
    else:
        paginator = None
    return paginator
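
Gluing the hostname on by hand works fine for one Amazon site. As a hedged alternative, the standard library's `urljoin` can resolve the relative value against whatever page URL was just fetched, so the domain isn't hard-coded. `absolute_url` is a made-up name for this sketch, and the query string below is shortened for illustration.

```python
from urllib.parse import urljoin

def absolute_url(page_url, show_more_value):
    # urljoin resolves a root-relative path against the scheme and host of page_url
    return urljoin(page_url, show_more_value)

page = "https://www.amazon.co.uk/gp/registry/wishlist/1A1NYHTAZ3N6V/"
next_page = absolute_url(page, "/hz/wishlist/slv/items?filter=unpurchased&lid=1A1NYHTAZ3N6V")
print(next_page)
# https://www.amazon.co.uk/hz/wishlist/slv/items?filter=unpurchased&lid=1A1NYHTAZ3N6V
```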

Putting it all together looks like this:

#   Set up the lists
item_list  = []
price_list = []
id_list    = []

#   Message text
message = "Here are the recent price drops:\n"

def get_all(url):
    global counter
    counter = counter + 1
    print( "Getting page " + str(counter))
    soup = get_wishlist(url)
    get_items(soup)
    get_prices_and_ids(soup)
    paginator = get_paginator(soup)
    if paginator is not None:
        get_all(paginator)

#   Get all the items on the wishlist
#   Which page are we on?
counter = 0
get_all("https://www.amazon.co.uk/gp/registry/wishlist/1A1NYHTAZ3N6V/")
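
Recursion is fine for a wishlist of a few dozen pages. For comparison, here is a rough sketch of the same walk as a plain loop, with no global counter. `walk_pages`, its parameters, and the `max_pages` cap are all assumptions for illustration, and the fake site is stand-in data so the sketch runs without touching Amazon.

```python
def walk_pages(start_url, fetch, find_next, max_pages=100):
    """Visit pages one after another until find_next returns None.

    fetch(url) returns a parsed page; find_next(page) returns the next URL or None.
    max_pages is an arbitrary safety cap (an assumption, not from the original).
    """
    pages = []
    url = start_url
    while url is not None and len(pages) < max_pages:
        page = fetch(url)
        pages.append(page)
        url = find_next(page)
    return pages

# Stand-in data: each "page" knows its items and the key of the next page
fake_site = {
    "page1": {"items": ["a", "b"], "next": "page2"},
    "page2": {"items": ["c"], "next": None},
}
pages = walk_pages("page1", fake_site.get, lambda page: page["next"])
print([page["items"] for page in pages])  # [['a', 'b'], ['c']]
```

With the real functions, that would be something like `walk_pages(url, get_wishlist, get_paginator)`, followed by calling `get_items` and `get_prices_and_ids` on each returned soup.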

That stores all the data in some lists. Next up, put them in a Pandas DataFrame for ease of access:

#   Place into a DataFrame
all_items = zip(id_list, item_list, price_list)
new_prices = pd.DataFrame(list(all_items), columns = ["ID", "Name", "Price"])

For comparing prices, I'm just being lazy. All I care about is if the price has dropped since the last time I looked. I don't care what the highest or the lowest price is.

I save the prices as a CSV. The next time the code runs, it reads that CSV into a DataFrame called old_prices:

#   Read the old file
if exists("old_prices.csv"):
    old_prices = pd.read_csv("old_prices.csv")
else:
    old_prices = new_prices.copy()

Now it's a case of iterating through the new prices and comparing each item with its old price.

If the price has dropped, send me a message. If the price of a book is under a quid, I want to know that as well.

#   Compare prices
for id in new_prices["ID"]:
    new_price = new_prices.loc[new_prices["ID"]==id, "Price"].values[0]
    name      = new_prices.loc[new_prices["ID"]==id, "Name" ].values[0]
    #   If a book has recently been added to the wishlist, it won't have an old price
    if id in old_prices["ID"].values:
        old_price = old_prices.loc[old_prices["ID"]==id, "Price"].values[0]
        #   Anything less than a quid is worth knowing about.
        #   Some prices are "-Infinity", so check the price is more than zero
        if float(new_price) < 1 and float(new_price) > 0:
            message += (name + "\n£" + str(new_price) + " was £" + str(old_price) + " https://www.amazon.co.uk/dp/"+id + "\n")
        elif float(new_price) < float(old_price) and float(new_price) > 0:
            message += (name + "\n£" + str(new_price) + " was £" + str(old_price) + " https://www.amazon.co.uk/dp/"+id + "\n")

print(message)

There's a slight wrinkle: some of the prices are -Infinity. That's usually when the item is unavailable.
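
One way to keep that check in a single place is a small parsing helper. This is only a sketch: `parse_price` is a hypothetical name, and treating zero or negative prices as "no price" is my assumption based on the behaviour described above.

```python
import math

def parse_price(raw):
    """Convert a data-price string to a float, or None if it isn't a usable price."""
    try:
        price = float(raw)
    except (TypeError, ValueError):
        return None
    # float("-Infinity") parses fine, so reject non-finite and non-positive values
    if not math.isfinite(price) or price <= 0:
        return None
    return price

print(parse_price("8.07"))       # 8.07
print(parse_price("-Infinity"))  # None
```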

Next, overwrite old_prices.csv with today's prices - we'll use these tomorrow as the old prices.

#   Save the Data
new_prices.to_csv('old_prices.csv', index=False)

Finally, how does the message get sent? It's OK to print something on the console - but I want to get the alerts via email.

This uses smtplib to send something via your own SMTP server. You'll need to add your own email server details here!

import smtplib
from email.message import EmailMessage

#   Send Email
def send_email(message):
    email_user = 'me@example.com'
    email_password = 'P455w0rd!'
    to = 'you@example.com'
    msg = EmailMessage()
    msg.set_content(message)
    msg['Subject'] = "Today's price drops"
    msg['From'] = email_user
    msg['To'] = to
    server = smtplib.SMTP_SSL('smtp.example.com', 465)
    server.ehlo()
    server.login(email_user, email_password)
    server.send_message(msg)
    server.quit()
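
If you'd rather not leave a password sitting in the script, a hedged variation is to read the credentials from environment variables and let a `with` block close the connection. The `WISHLIST_SMTP_*` variable names and both function names are invented for this sketch; only `build_message` is exercised here, so nothing is actually sent.

```python
import os
import smtplib
from email.message import EmailMessage

def build_message(body, sender, recipient):
    msg = EmailMessage()
    msg.set_content(body)
    msg["Subject"] = "Today's price drops"
    msg["From"] = sender
    msg["To"] = recipient
    return msg

def send_from_env(body):
    # WISHLIST_SMTP_* are made-up variable names for this sketch
    user = os.environ["WISHLIST_SMTP_USER"]
    password = os.environ["WISHLIST_SMTP_PASSWORD"]
    host = os.environ.get("WISHLIST_SMTP_HOST", "smtp.example.com")
    msg = build_message(body, user, os.environ.get("WISHLIST_SMTP_TO", user))
    # The with-block closes the connection even if login or sending fails
    with smtplib.SMTP_SSL(host, 465) as server:
        server.login(user, password)
        server.send_message(msg)

msg = build_message("Fly Fishing by JR Hartley\n£0.99 was £8.07\n", "me@example.com", "you@example.com")
print(msg["Subject"])
```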

And that's pretty much it! Stick it in a crontab and have it run once per day. Amazon can get a bit funny about screen scraping, so best not to run it too often or you'll hit an IP-based CAPTCHA.

There are a couple of minor annoyances.

  • Some prices fluctuate by pennies. Perhaps you only want to know if the price is lower by more than £1, or more than 10%?
  • No tracking the lowest price. Do you want to know if it's the lowest price in the last few weeks?
  • Doesn't automatically buy stuff. It would be nice to just buy a book any time it drops to be lower than a quid.
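
The first of those annoyances could be handled with a small threshold check along these lines. It's a sketch only: `is_significant_drop` and its default thresholds (£1 or 10%) are assumptions taken from the question above, not something the current script does.

```python
def is_significant_drop(old_price, new_price, min_pounds=1.0, min_fraction=0.10):
    """True if the price fell by at least min_pounds or at least min_fraction of the old price."""
    if old_price <= 0 or new_price <= 0:
        # Covers unavailable items that come through as -Infinity
        return False
    drop = old_price - new_price
    if drop <= 0:
        return False
    return drop >= min_pounds or drop / old_price >= min_fraction

print(is_significant_drop(8.07, 8.05))  # False
print(is_significant_drop(8.07, 6.99))  # True
print(is_significant_drop(2.00, 1.75))  # True
```

Using "or" rather than "and" means a cheap book dropping by 25p still triggers an alert, which seems closer to the spirit of wanting to hear about anything under a quid.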

The full code is available on github.com/edent/Amazon-Wishlist-Pricedrop-Alert.

Enjoy!



4 thoughts on “Use Python to get alerted when an Amazon wishlist item drops in price”

  1. says:

    This is great, thank you! If (like me) you prefer to use distro packages for python dependencies rather than pip then something like this may help - in this example I'm using Ubuntu 22.04:

    sudo apt install python3-bs4 python3-certifi python3-charset-normalizer python3-idna python3-numpy python3-pandas python3-dateutil python3-pytzdata python3-requests python3-six python3-soupsieve python3-urllib3

    Works a treat. 🙂
