🤖 Part 3: Building a Price Recommendation Engine with Pandas and Scikit-Learn

In Part 1 and Part 2, we created a pipeline to crawl smartphone prices and load them into Snowflake. Now, we’ll show how your Data Science team can use that data to:

  • Recommend products by price
  • Suggest the best day to buy
  • Predict future price drops

Let’s walk through a prototype using pandas, scikit-learn, and matplotlib.

📦 Data Snapshot

We’ll use a dataframe like the one below, pulled from the Snowflake PRICING.ECOM_PRICE_INDEX table:

import pandas as pd

df = pd.read_sql("SELECT * FROM PRICING.ECOM_PRICE_INDEX", connection)

df.head()
product_name product_price scraped_at ingestion_date
Galaxy S23 Ultra 1199.99 2024-06-01 08:00:00 2024-06-01
Galaxy S23 Ultra 1149.99 2024-06-05 08:00:00 2024-06-05
iPhone 14 Pro Max 999.99 2024-06-01 08:00:00 2024-06-01
iPhone 14 Pro Max 979.99 2024-06-10 08:00:00 2024-06-10

🧠 1. Recommend Lowest Prices by Product

recommendations = df.sort_values('scraped_at')     .groupby('product_name')     .agg(best_price=('product_price', 'min'),
         last_price=('product_price', 'last'),
         price_diff=('product_price', lambda x: x.iloc[-1] - x.min()))

📅 2. Find the Best Day to Buy

df['weekday'] = pd.to_datetime(df['scraped_at']).dt.day_name()

best_days = df.groupby(['product_name', 'weekday'])['product_price'].mean().reset_index()

best_day_to_buy = best_days.loc[best_days.groupby('product_name')['product_price'].idxmin()]

📉 3. Predict Future Price Drops

from sklearn.linear_model import LinearRegression
import numpy as np

df['day'] = (pd.to_datetime(df['scraped_at']) - pd.to_datetime(df['scraped_at']).min()).dt.days

predictions = []

for product in df['product_name'].unique():
    subset = df[df['product_name'] == product].sort_values('day')
    X = subset[['day']]
    y = subset['product_price']
    if len(X) < 3:
        continue

    model = LinearRegression().fit(X, y)
    next_day = np.array([[X['day'].max() + 1]])
    predicted_price = model.predict(next_day)[0]
    predictions.append({
        'product_name': product,
        'predicted_price_next_day': predicted_price,
        'trend': 'dropping' if predicted_price < y.iloc[-1] else 'rising'
    })

predicted_df = pd.DataFrame(predictions)

📊 Example Output

product_name predicted_price_next_day trend
Galaxy S23 Ultra 1129.45 dropping
iPhone 14 Pro Max 982.10 rising

🧠 Final Note

It’s tempting to jump into advanced AI models — but for many pricing problems, basic data analysis is more than enough.

  • 🧮 Simple statistics like averages and min/max reveal buyer trends
  • 📅 Time-based aggregation gives powerful forecasting signals
  • 🤖 ML can add predictive power, but isn’t always necessary

In many cases, using clean pipelines and descriptive analytics leads to faster insights and more trustworthy decisions than a complex AI model.

✅ Start with the data. Understand it. Then decide whether AI is worth the extra complexity.

Similar Posts