PHP Website Email and WhatsApp Scraper Tutorial


In this tutorial we will build a simple but powerful PHP script that reads a list of websites from a text file and automatically extracts email addresses, WhatsApp numbers and phone numbers from those websites.

This type of script is useful for building contact databases, research projects, or collecting publicly available contact information from company websites.

The script works by reading a file containing URLs, opening each website using cURL, scanning the HTML for email patterns and WhatsApp links, and saving the results to another file.

Features of the Script

  • Reads website URLs from a text file
  • Automatically opens each website
  • Extracts email addresses
  • Extracts WhatsApp numbers
  • Extracts phone numbers
  • Processes websites in batches of 20 to avoid server timeouts
  • Saves results to a new file
  • Works on most shared hosting servers

Required Files

Create a folder on your server and place the following files inside it. The file results.txt can start out empty, but the web server must be able to write to it:

  • index.php
  • scan.php
  • functions.php
  • jata.txt
  • results.txt

Sample Input File

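The file jata.txt holds one website per line. Each line can carry any label you like, such as the company name, followed by a pipe character and the website URL. The script only looks for the URL and copies the rest of the line to the results unchanged.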

Company ABC | https://example.com
Company XYZ | https://example.org
Travel Agency Japan | https://agency.co.jp
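
If you need the label and the URL as separate values, a line in this layout can be split on the pipe character. The following is only a minimal sketch: the helper name parse_line is hypothetical and not one of the tutorial's files, and it assumes the URL is always the last field on the line.

```php
<?php
// Hypothetical helper: split "Label | URL" into its two parts.
// Assumes the URL is the last pipe-separated field on the line.
function parse_line(string $line): array
{
    $parts = array_map('trim', explode('|', $line));
    $url   = array_pop($parts);        // last field is taken as the URL
    $name  = implode(' | ', $parts);   // everything before it is the label
    return ['name' => $name, 'url' => $url];
}

$entry = parse_line("Company ABC | https://example.com\n");
echo $entry['name'], ' => ', $entry['url'], PHP_EOL;
```

The script itself does not split the line this way. It finds the URL with a regular expression, so extra fields do no harm.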

File 1: index.php

This file creates a simple control panel to start the scanning process.

<?php
// Count the lines in the input file so the panel can show a total.
$total = 0;

if (file_exists("jata.txt")) {
    $lines = file("jata.txt");
    $total = count($lines);
}
?>

<h2>Website Contact Scraper</h2>

<p>Total websites in file: <?php echo $total; ?></p>

<a href="scan.php?start=0">Start Scanning</a>

<br><br>

<a href="results.txt">Download Results</a>

File 2: scan.php

This script processes websites in batches to prevent server timeout.

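<?php

include "functions.php";

// Lift PHP's execution time limit; some shared hosts ignore this call.
set_time_limit(0);

// Offset of the first line to process in this batch.
$start = isset($_GET['start']) ? max(0, intval($_GET['start'])) : 0;

// Number of websites handled per request.
$limit = 20;

if (!file_exists("jata.txt")) {
    exit("jata.txt not found");
}

$lines = file("jata.txt");
$total = count($lines);
$end   = min($start + $limit, $total);

echo "<h3>Scanning $start to $end</h3>";

for ($i = $start; $i < $end; $i++) {

    $line = $lines[$i];
    $url  = get_url($line);

    $email = "";
    $wa    = "";
    $phone = "";

    if ($url) {
        $html = get_page($url);

        // get_page() returns false when the request fails.
        if ($html) {
            list($email, $wa, $phone) = extract_contacts($html);
        }
    }

    // Append the findings to the original line and save them.
    $newline = trim($line) . "|EMAIL:$email|WHATSAPP:$wa|PHONE:$phone\n";
    file_put_contents("results.txt", $newline, FILE_APPEND);

    // Escape the URL before printing it into the page.
    echo "Processed " . htmlspecialchars($url) . " <br>";
}

$next = $end;

if ($next < $total) {

    // Load the next batch after a two-second pause.
    echo "<script>
setTimeout(function(){
window.location='scan.php?start=$next';
},2000);
</script>";

    echo "Continuing automatically...";

} else {

    echo "<h2>Scanning Completed</h2>";

}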
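
To see how the start parameter moves through the file, the batch arithmetic can be written out on its own. This is a sketch for illustration only; the function name batch_ranges is hypothetical, and the example total of 45 lines is an assumption.

```php
<?php
// Hypothetical helper: list the [start, end) ranges that a batched
// scan would visit for a given number of lines and batch size.
function batch_ranges(int $total, int $limit): array
{
    $ranges = [];
    for ($start = 0; $start < $total; $start += $limit) {
        // The last batch is cut short when fewer than $limit lines remain.
        $ranges[] = [$start, min($start + $limit, $total)];
    }
    return $ranges;
}

// With 45 lines and a batch size of 20: 0-20, 20-40, 40-45.
print_r(batch_ranges(45, 20));
```

Each range corresponds to one page load of scan.php, so a file with 45 lines needs three requests.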

File 3: functions.php

This file contains the scraping functions that extract contact information from the website HTML.

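<?php

// Return the first http(s) URL found in a line, or "" if there is none.
function get_url($line){

    if (preg_match('/https?:\/\/[^\s|"]+/i', $line, $m)) {
        return $m[0];
    }

    return "";
}

// Download a page with cURL. Returns the HTML, or false on failure.
function get_page($url){

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0");

    $html = curl_exec($ch);

    curl_close($ch);

    return $html;
}

// Return the first email, WhatsApp number and phone number in the HTML.
function extract_contacts($html){

    $email = "";
    $wa    = "";
    $phone = "";

    // Loose email pattern; it can also match strings such as logo@2x.png.
    if (preg_match('/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i', $html, $m)) {
        $email = $m[0];
    }

    // WhatsApp short links: wa.me/<number>
    if (preg_match('/wa\.me\/([0-9]+)/', $html, $m)) {
        $wa = $m[1];
    }

    // WhatsApp API links: api.whatsapp.com/send?phone=<number>
    if (!$wa && preg_match('/api\.whatsapp\.com\/send\?phone=([0-9]+)/', $html, $m)) {
        $wa = $m[1];
    }

    // Loose phone pattern; it can also match other long digit runs.
    // Only plain spaces are allowed, so a match never spans a line break.
    if (preg_match('/\+?[0-9][0-9\- ]{8,15}/', $html, $m)) {
        $phone = trim($m[0]);
    }

    return [$email, $wa, $phone];
}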
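
Note that extract_contacts keeps only the first email address it finds on a page. If you want every address, one possible variation is to use preg_match_all and remove duplicates. The sketch below reuses the tutorial's email pattern; the function name extract_all_emails and the sample HTML are assumptions made for illustration.

```php
<?php
// Hypothetical variant: return every distinct email address in the HTML.
function extract_all_emails(string $html): array
{
    preg_match_all('/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i', $html, $m);

    // Lowercase before de-duplicating so Info@ and info@ count once.
    $emails = array_map('strtolower', $m[0]);

    return array_values(array_unique($emails));
}

$html = '<a href="mailto:Info@example.com">info@example.com</a> <p>sales@example.com</p>';
print_r(extract_all_emails($html));
```

For the sample HTML this yields info@example.com and sales@example.com. Storing several addresses per site would also require a separator in results.txt that cannot appear inside an address, for example a comma.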

Running the Script

Upload all files to your server and open the following URL in your browser:

https://www.yoursite.com/scraper/

Click the Start Scanning link. The script processes 20 websites per request, then reloads itself after two seconds with the next starting position, and continues until all websites are scanned.

The extracted email addresses, WhatsApp numbers and phone numbers will be saved in the file results.txt.
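
Each line in results.txt is the original input line followed by the EMAIL, WHATSAPP and PHONE fields. To work with the results in another program, a line can be split back into named fields. This is a minimal sketch that assumes the line layout written by scan.php; the helper name parse_result_line and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: turn one results.txt line back into named fields.
function parse_result_line(string $line): ?array
{
    $pattern = '/^(.*)\|EMAIL:(.*)\|WHATSAPP:(.*)\|PHONE:(.*)$/';

    if (!preg_match($pattern, trim($line), $m)) {
        return null; // line does not follow the expected layout
    }

    return [
        'source'   => trim($m[1]),  // the original input line
        'email'    => $m[2],
        'whatsapp' => $m[3],
        'phone'    => $m[4],
    ];
}

$row = parse_result_line("Company ABC | https://example.com|EMAIL:info@example.com|WHATSAPP:|PHONE:+81 3-1234-5678\n");
print_r($row);
```

A field is an empty string when nothing was found for it, as with the WhatsApp number in this sample line.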

Tips for Better Results

  • Ensure the input file contains valid URLs
  • Do not scan too many websites at once
  • Always respect website terms of service
  • Use the script only for publicly available information
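
The first tip can be checked in code before a scan starts. The sketch below uses PHP's built-in filter_var and parse_url to accept only well-formed http and https URLs; the function name is_scannable_url is hypothetical and not part of the tutorial's files.

```php
<?php
// Hypothetical pre-check: accept only well-formed http(s) URLs.
function is_scannable_url(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // FILTER_VALIDATE_URL also accepts schemes such as ftp, so check it.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}

var_dump(is_scannable_url('https://example.com'));  // bool(true)
var_dump(is_scannable_url('ftp://example.com'));    // bool(false)
var_dump(is_scannable_url('not a url'));            // bool(false)
```

Lines that fail such a check could be skipped or written to a separate file for manual review.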

Conclusion

This simple PHP scraper demonstrates how to automatically extract contact information such as email addresses, WhatsApp numbers and phone numbers from a list of websites. It reads only the page at each listed URL and keeps the first match of each type. With further improvements, the script could also extract social media links, company details and location information.

