Sales-Ready Company Research Automation using n8n & Bright Data
This is a submission for the AI Agents Challenge powered by n8n and Bright Data
What I Built
I built an AI-powered sales research automation that creates a detailed company research document in roughly 12-15 minutes.
Current Problem – Sales reps spend 2+ hours manually researching each prospect before calls, and even then often miss crucial information or show up unprepared.
When someone books a call through any form, this automation kicks in and does the heavy lifting – checking website health, enriching company data through Apollo.io, scraping relevant pages, finding recent news, scraping LinkedIn profiles, and compiling everything into a professional Google Doc so the sales rep walks into the call prepared.
Demo
n8n Workflow

GitHub repo: Himanshi-098/Company-Research-Document-_-n8n-and-BrightData-Challenge – conducts comprehensive research about the company and generates an automated research document.
Overview
The automation creates a clean, detailed, sales-ready research document about your lead and their company in roughly 12-15 minutes. It researches the company and generates a report covering company details, financial information, recent developments, technology focus, and executive biography.
Features
- Automated Lead Qualification: Determines if leads are worth pursuing based on website scrapability and data availability
- Company Data Enrichment: Pulls revenue, employee count, industry, and founding information via Apollo.io
- Website and Link Categorization: Categorizes and scrapes relevant pages (About Us, Services, Financial Information)
- News & Development Tracking: Finds and summarizes recent company news and growth updates
- Executive Research: Locates executive LinkedIn profiles and extracts their professional backgrounds
- Document Generation: Creates structured and professional Google Docs with all research compiled
- Slack Integration: Sends qualification status with a green/red flag analysis of each lead
Technical Implementation
The system uses three interconnected n8n workflows:
Main Automation (Workflow Part 3):
- Trigger: A lead submits a form; the details (name, email, website, call time) are logged to a Google Sheet
- Workflow Control: Kicks off Part 1, processes results, routes to Part 2 if qualified
- Final Assembly: AI composes professional Google Doc and Slack message content
- Output: Updates Google Drive with the research document + sends a Slack notification (Qualified/Disqualified + flags + doc link) about the lead – a sketch of that message follows this list
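Here's a minimal sketch of what that final Slack message assembly could look like. The field names (`status`, `greenFlags`, `docUrl`) are illustrative, not the workflow's actual schema:

```typescript
// Hypothetical shape of the data the final step assembles for Slack.
// Field names are illustrative, not the exact workflow schema.
interface LeadNotification {
  company: string;
  status: "Qualified" | "Disqualified";
  greenFlags: string[];
  redFlags: string[];
  docUrl: string; // link to the generated Google Doc
}

function toSlackMessage(n: LeadNotification): string {
  const flags = [
    ...n.greenFlags.map((f) => `:large_green_circle: ${f}`),
    ...n.redFlags.map((f) => `:red_circle: ${f}`),
  ].join("\n");
  return `*${n.company}* - ${n.status}\n${flags}\n<${n.docUrl}|Research document>`;
}
```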
Website Research Flow (Workflow Part 1):
- Checks website scrapability using health parameters (a minimal check is sketched after this list)
- Enriches company data via the Apollo.io API (revenue, employees, industry, founding date, etc.)
- Scrapes the homepage and categorizes its URLs into About Us, Services, and Financials
- An AI agent selects the most relevant URL from each category for detailed scraping
- Scrapes the selected URLs via Bright Data
- A fallback mechanism tries the email domain if the primary website fails
- Creates disqualification documents for non-scrapable leads
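The health check itself can stay simple. This is a minimal sketch, assuming a timeout plus HTTP status code is enough to decide scrapability – the real workflow's health parameters are richer:

```typescript
// Minimal health check: a site counts as scrapable if it answers within
// a timeout and returns a non-error status. Illustrative only.
async function isScrapable(url: string, timeoutMs = 10_000): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { redirect: "follow", signal: controller.signal });
    return res.status < 400;
  } catch {
    return false; // DNS failure, timeout, or blocked request
  } finally {
    clearTimeout(timer);
  }
}

// Fallback: derive a candidate website from the lead's email domain.
function domainFromEmail(email: string): string | null {
  const domain = email.split("@")[1];
  return domain ? `https://${domain}` : null;
}
```

If `isScrapable` fails on the primary site, the email-domain fallback gives the workflow a second candidate before it disqualifies the lead.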
News & Executive Research Flow (Workflow Part 2):
- Searches recent company news using targeted Google queries (illustrative templates after this list)
- An AI agent picks the most relevant news article for scraping
- Searches for financial information using various keyword combinations
- Scrapes the selected news and financial URLs via Bright Data
- Finds executive LinkedIn profiles via Google search and scrapes the most relevant one
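For a feel of what "targeted queries" means, templates like these work – illustrative examples, not the exact keyword combinations the workflow uses:

```typescript
// Illustrative query templates for the news and financial searches.
function newsQueries(company: string): string[] {
  return [
    `"${company}" funding OR acquisition OR partnership`,
    `"${company}" product launch OR expansion`,
  ];
}

function financialQueries(company: string): string[] {
  return [
    `"${company}" annual revenue`,
    `"${company}" investor relations financial results`,
  ];
}
```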
Key AI Tasks:
- Smart URL Selection: Picks the most relevant pages from dozens of homepage links (a sample prompt follows this list)
- Content Quality Filter: Chooses genuinely relevant news over random company mentions
- Research Synthesis: Combines multiple data sources into coherent executive summaries
- Qualification Assessment: Analyzes all data to determine lead quality with specific flags
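To show how specific the URL-selection instructions have to be, here's the kind of prompt the agent needs – a hypothetical sketch, not the production prompt:

```typescript
// Hypothetical prompt for the URL-selection agent. The production
// prompts went through several refinement rounds; this shows the idea.
const urlSelectionPrompt = (category: string, urls: string[]) => `
You are selecting ONE URL for detailed scraping.
Category: ${category}
Candidate URLs:
${urls.map((u, i) => `${i + 1}. ${u}`).join("\n")}

Rules:
- Prefer pages owned by the company itself over third-party mentions.
- Prefer evergreen pages (/about, /services, /investors) over blog posts.
- Reply with the single best URL and nothing else.
- If no candidate fits the category, reply "NONE".
`;
```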
Bright Data Verified Node
Bright Data is the core engine that makes this automation possible – handling all the complex web scraping that would otherwise break constantly:
- Homepage Intelligence: Scrapes company homepages and extracts all available URLs which get categorized by AI into relevant sections
- News Research: Performs Google searches for recent company news and financial updates, returning titles, links, and snippets
- LinkedIn Extraction: Uses the batch extraction feature to pull comprehensive professional data from executive profiles. The system initiates extraction, gets a snapshot ID, then loops to check status until data is ready for download
- Website Health Checks: Validates if URLs are scrapable before attempting extraction, preventing failures down the line
- Fallback Searches: When primary websites fail, it searches alternative domains through Google to find scrapable alternatives
The Bright Data nodes handle all the complex web scraping while the AI agents focus on selecting the right content and summarizing it into actionable insights.
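Under the hood, a single scrape request through Bright Data looks roughly like this. The sketch assumes the Web Unlocker `/request` API and a zone named `web_unlocker1`; verify against the current Bright Data docs, since the verified n8n node configures all of this for you:

```typescript
// Rough sketch of one scrape request via Bright Data's request API.
// Endpoint, body shape, and zone name are assumptions to verify.
async function scrape(url: string): Promise<string> {
  const res = await fetch("https://api.brightdata.com/request", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.BRIGHTDATA_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      zone: "web_unlocker1", // zone name is account-specific
      url,
      format: "raw", // return the page body as-is
    }),
  });
  if (!res.ok) throw new Error(`Bright Data request failed: ${res.status}`);
  return res.text();
}
```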
Journey
After working in the field of research for about 1.5 years, I realized sales reps spend more than two hours researching each client. Even after spending that much time, we tend to miss important information anyway.
Initially, I thought this would be a simple “scrape website and make document” workflow. But I was wrong! Websites are incredibly unpredictable – some block scrapers, others have weird structures, and many just don’t have the information you need.
Biggest Challenge: Getting the website health checks right. So many websites block scrapers or have weird structures that break automation. Had to build multiple fallback mechanisms and error handling to make sure the system doesn’t just crash when it hits a problematic site.
The LinkedIn Loop: Figuring out Bright Data’s batch extraction for LinkedIn was tricky. You initiate the extraction, get a snapshot ID, then have to keep checking if it’s ready. Took me several iterations to get the status monitoring loop working smoothly.
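For anyone hitting the same wall, here's the shape of that loop. The endpoints follow Bright Data's Datasets v3 API as I understand it; the dataset ID is a placeholder and the timings are arbitrary:

```typescript
// Trigger -> poll -> download cycle for Bright Data batch extraction.
// YOUR_LINKEDIN_DATASET_ID is a placeholder; verify endpoints in the docs.
const API = "https://api.brightdata.com/datasets/v3";
const headers = { Authorization: `Bearer ${process.env.BRIGHTDATA_API_KEY}` };

async function extractLinkedInProfiles(profileUrls: string[]): Promise<unknown[]> {
  // 1. Initiate the extraction and receive a snapshot ID.
  const trigger = await fetch(`${API}/trigger?dataset_id=YOUR_LINKEDIN_DATASET_ID`, {
    method: "POST",
    headers: { ...headers, "Content-Type": "application/json" },
    body: JSON.stringify(profileUrls.map((url) => ({ url }))),
  });
  const { snapshot_id } = await trigger.json();

  // 2. Keep checking the snapshot status until it's ready (max ~10 minutes).
  let status = "running";
  for (let i = 0; i < 60 && status !== "ready"; i++) {
    await new Promise((r) => setTimeout(r, 10_000)); // wait 10s between checks
    const progress = await fetch(`${API}/progress/${snapshot_id}`, { headers });
    ({ status } = await progress.json());
    if (status === "failed") throw new Error("LinkedIn extraction failed");
  }
  if (status !== "ready") throw new Error("Timed out waiting for snapshot");

  // 3. Download the finished snapshot.
  const res = await fetch(`${API}/snapshot/${snapshot_id}?format=json`, { headers });
  return res.json();
}
```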
AI Agent Tuning: Getting the AI to pick the RIGHT URLs from search results was harder than expected. Had to refine the prompts multiple times so it would choose genuinely useful pages instead of random company mentions.
What I Learned:
- Always build fallbacks for web scraping – websites are unpredictable
- AI agents need very specific instructions to make good decisions
- Status monitoring loops are essential for batch operations
- Error handling is just as important as the happy path