
PHP Website Email and WhatsApp Scraper Tutorial
In this tutorial we will build a simple but powerful PHP script that reads a list of websites from a text file and automatically extracts email addresses, WhatsApp numbers and phone numbers from those websites.
This type of script is useful for building contact databases, research projects, or collecting publicly available contact information from company websites.
The script works by reading a file containing URLs, fetching each website with cURL, scanning the HTML for email addresses, WhatsApp links and phone numbers, and appending the results to another file.
Features of the Script
- Reads website URLs from a text file
- Automatically opens each website
- Extracts email addresses
- Extracts WhatsApp numbers
- Extracts phone numbers
- Processes websites in batches to avoid server timeout
- Saves results to a new file
- Works on most shared hosting servers
Required Files
Create a folder on your server and place the following files inside it:
- index.php
- scan.php
- functions.php
- jata.txt
- results.txt
Sample Input File
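The file jata.txt holds your input data, one record per line. Each record can contain any text, such as a company name, as long as it includes the website URL. The script counts lines and reads only the first URL on each line, so keep every record on its own line.
Company ABC | https://example.com
Company XYZ | https://example.org
Travel Agency Japan | https://agency.co.jp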
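As a rough illustration of this input format, the sketch below splits one record into a label and a URL. The helper name parse_record is hypothetical and is not one of the tutorial's files; the script itself only needs the URL, which it finds with a regular expression.

```php
<?php
// Hypothetical helper (not part of the tutorial's files): split one
// "Name | URL" record from jata.txt into its two parts.
function parse_record(string $line): array {
    // Split on the first "|" only, in case the URL itself contains one.
    $parts = explode('|', $line, 2);
    $name = trim($parts[0]);
    $url  = isset($parts[1]) ? trim($parts[1]) : '';
    return ['name' => $name, 'url' => $url];
}

$record = parse_record("Company ABC | https://example.com\n");
echo $record['name'], ' -> ', $record['url'], "\n";
```

A line without a pipe character yields an empty url value, which is one simple way to spot malformed records before scanning.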
File 1: index.php
This file creates a simple control panel to start the scanning process.
<?php
// Count the records in the input file so the panel can show a total.
$total = 0;
if (file_exists("jata.txt")) {
    $lines = file("jata.txt");
    $total = count($lines);
}
?>
<h2>Website Contact Scraper</h2>
<p>Total websites in file: <?php echo $total; ?></p>
<a href="scan.php?start=0">Start Scanning</a>
<br><br>
<a href="results.txt">Download Results</a>
File 2: scan.php
This script processes websites in batches to prevent server timeout.
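<?php
include "functions.php";
set_time_limit(0); // lift PHP's own time limit; the host may still enforce one
// Index of the first record in this batch; negative values fall back to 0.
$start = isset($_GET['start']) ? max(0, intval($_GET['start'])) : 0;
$limit = 20; // websites processed per request
if (!file_exists("jata.txt")) {
    exit("jata.txt not found");
}
$lines = file("jata.txt");
$total = count($lines);
$end = min($start + $limit, $total);
echo "<h3>Scanning $start to $end</h3>";
for ($i = $start; $i < $end; $i++) {
    $line = $lines[$i];
    $url = get_url($line);
    $email = "";
    $wa = "";
    $phone = "";
    if ($url) {
        $html = get_page($url);
        if ($html) {
            list($email, $wa, $phone) = extract_contacts($html);
        }
    }
    // Append the original record plus whatever was found.
    $newline = trim($line) . "|EMAIL:$email|WHATSAPP:$wa|PHONE:$phone\n";
    file_put_contents("results.txt", $newline, FILE_APPEND | LOCK_EX);
    // Escape the URL before echoing it into the page.
    echo "Processed " . htmlspecialchars($url) . " <br>";
}
$next = $end;
if ($next < $total) {
    // Reload this script with the next offset after a 2-second pause.
    echo "<script>
setTimeout(function(){
window.location='scan.php?start=$next';
},2000);
</script>";
    echo "Continuing automatically...";
} else {
    echo "<h2>Scanning Completed</h2>";
}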
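The batching arithmetic in scan.php can be tried on its own. The sketch below is only an illustration: batch_range is a hypothetical helper name, and the total of 45 records is an assumed example value.

```php
<?php
// Hypothetical helper: compute the half-open range [start, end) for one batch.
function batch_range(int $start, int $limit, int $total): array {
    $start = max(0, min($start, $total)); // clamp into 0..$total
    $end   = min($start + $limit, $total);
    return [$start, $end];
}

// Walk through 45 records in batches of 20, the way scan.php reloads itself.
$total = 45;
$start = 0;
while ($start < $total) {
    list($from, $to) = batch_range($start, 20, $total);
    echo "Batch: $from to $to\n";
    $start = $to; // becomes the next ?start= value
}
```

Because the end index is exclusive, the last batch simply stops at the total, so no record is processed twice or skipped.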
File 3: functions.php
This file contains the scraping functions that extract contact information from the website HTML.
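<?php
// Return the first http(s) URL found in a line, or "" if there is none.
function get_url($line){
    if (preg_match('/https?:\/\/[^\s|"]+/i', $line, $m)) {
        return $m[0];
    }
    return "";
}
// Download a page with cURL. Returns the HTML, or false on failure.
function get_page($url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0");
    $html = curl_exec($ch);
    curl_close($ch);
    return $html;
}
// Return [email, whatsapp, phone]; each is the first match or "".
function extract_contacts($html){
    $email = "";
    $wa = "";
    $phone = "";
    // First string that looks like an email address.
    if (preg_match('/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i', $html, $m)) {
        $email = $m[0];
    }
    // WhatsApp click-to-chat link: wa.me/<number>
    if (preg_match('/wa\.me\/([0-9]+)/', $html, $m)) {
        $wa = $m[1];
    }
    // Fallback: api.whatsapp.com/send?phone=<number> (optional slash after "send")
    if (!$wa && preg_match('/api\.whatsapp\.com\/send\/?\?phone=([0-9]+)/', $html, $m)) {
        $wa = $m[1];
    }
    // Loose phone pattern: it can also match other digit runs in the HTML,
    // so treat the value as a candidate to verify.
    if (preg_match('/\+?[0-9][0-9\-\s]{8,15}/', $html, $m)) {
        // Collapse whitespace so a matched line break cannot split a results.txt row.
        $phone = trim(preg_replace('/\s+/', ' ', $m[0]));
    }
    return [$email, $wa, $phone];
}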
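The extract_contacts function keeps only the first email address it finds. As a possible variation, and not part of the original script, the sketch below collects every distinct address with preg_match_all. The function name extract_all_emails and the sample HTML are assumptions made for illustration; the pattern is the same one used above.

```php
<?php
// Hypothetical variation on extract_contacts(): return every distinct
// email address in the HTML rather than only the first match.
function extract_all_emails(string $html): array {
    $pattern = '/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i';
    if (!preg_match_all($pattern, $html, $m)) {
        return [];
    }
    // Lower-case before de-duplicating so Info@ and info@ count once.
    $emails = array_map('strtolower', $m[0]);
    return array_values(array_unique($emails));
}

$html = '<a href="mailto:Info@example.com">info@example.com</a> '
      . '<p>Sales: sales@example.org</p>';
print_r(extract_all_emails($html));
```

If you adopt something like this, the results line would need a separator for multiple addresses, for example a comma, since the pipe character already separates fields.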
Running the Script
Upload all files to your server and open the following URL in your browser:
https://www.yoursite.com/scraper/
Click the Start Scanning link. The script processes 20 websites per request (the $limit value in scan.php), waits two seconds, and then loads the next batch automatically until all websites are scanned.
The extracted email addresses, WhatsApp numbers and phone numbers will be saved in the file results.txt.
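Each line of results.txt is the original record followed by EMAIL, WHATSAPP and PHONE fields separated by pipe characters. If you want to load the results into another program, a small parser along these lines may help. The name parse_result_line is hypothetical, and the sketch assumes the line layout written by scan.php has not been changed.

```php
<?php
// Hypothetical helper: split one line of results.txt back into fields.
// Assumes the "...|EMAIL:...|WHATSAPP:...|PHONE:..." layout that scan.php writes.
function parse_result_line(string $line): ?array {
    $pattern = '/^(.*)\|EMAIL:(.*)\|WHATSAPP:(.*)\|PHONE:(.*)$/';
    if (!preg_match($pattern, trim($line), $m)) {
        return null; // line does not follow the expected layout
    }
    return [
        'source'   => trim($m[1]),
        'email'    => $m[2],
        'whatsapp' => $m[3],
        'phone'    => trim($m[4]),
    ];
}

$line = "Company ABC | https://example.com|EMAIL:info@example.com|WHATSAPP:819012345678|PHONE:\n";
$row = parse_result_line($line);
echo $row['email'], "\n";
```

Fields that the scraper could not fill come back as empty strings, which makes it easy to filter for websites that still need a manual check.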
Tips for Better Results
- Ensure the input file contains valid URLs
- Do not scan too many websites at once; keep the batch size ($limit in scan.php) small
- Always respect website terms of service
- Use the script only for publicly available information
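To act on the first tip, you could check the input lines before starting a scan. The following is a minimal sketch, assuming the same URL pattern as get_url in functions.php plus PHP's built-in filter_var check; the function name has_valid_url and the sample lines are hypothetical.

```php
<?php
// Hypothetical pre-check: report whether a line contains a usable URL.
function has_valid_url(string $line): bool {
    // Same URL pattern that get_url() in functions.php uses.
    if (!preg_match('/https?:\/\/[^\s|"]+/i', $line, $m)) {
        return false;
    }
    return filter_var($m[0], FILTER_VALIDATE_URL) !== false;
}

$lines = [
    "Company ABC | https://example.com",
    "Broken entry | example.org", // no scheme, so no URL is found
    "Travel Agency Japan | https://agency.co.jp",
];
$valid = array_values(array_filter($lines, 'has_valid_url'));
echo count($valid), " of ", count($lines), " lines have a usable URL\n";
```

Running a check like this first avoids spending a 10-second cURL timeout on every malformed entry.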
Conclusion
This simple PHP scraper demonstrates how to extract contact information such as email addresses, WhatsApp numbers and phone numbers from websites automatically. It reads only the page at each listed URL and does not follow links. With further improvements, the script can also extract social media links, company details and location information.