Ever needed to get all the URLs from a string or a web page? No? Well, here is a script that will do it anyway. It works by grabbing anything beginning with http or https, up to the next double quote (") or space. The results are returned in an array, from which you can extract them for your own use. If you find a better regex to do this, let me know and we will use it.
<?php
$string = '<a href="http://www.example.com">Example.com</a> has many links with
examples <a href="http://www.example.net/file.php">links</a> to many sites and
even urls without links like http://www.example.org just to fill the gaps and
not to forget this one http://phpro.org/tutorials/Introduction-to-PHP-Regex.html
which has a space after it. The script has been modified from its original so now
it grabs ssl such as https://www.example.com/file.php also';
/**
 * Get URLs from a string (the string may itself be a URL)
 *
 * @param string $string
 *
 * @return array
 */
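function getUrls($string)
{
    // Match http:// or https://, then everything up to a double quote or whitespace
    // (\s also stops the match at a line break, not only at a plain space)
    $regex = '/https?:\/\/[^\s"]+/i';
    preg_match_all($regex, $string, $matches);
    return $matches[0];
}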
$urls = getUrls($string);
foreach ($urls as $url)
{
    echo $url.'<br />';
}
?>
The above script will output a list of URLs from the string, like this:
http://www.example.com
http://www.example.net/file.php
http://www.example.org
http://phpro.org/tutorials/Introduction-to-PHP-Regex.html
https://www.example.com/file.php
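
In the spirit of finding a better regex, here is one possible variation. This is only a sketch and not a full URL parser. The function name getUrlsTrimmed and the sample text are made up for illustration. It assumes you also want to stop at single quotes and angle brackets, strip punctuation that trails a URL at the end of a sentence, and drop duplicates.

```php
<?php
/**
 * Hypothetical variant of getUrls(): also stops at single quotes and
 * angle brackets, trims trailing punctuation, and removes duplicates.
 *
 * @param string $string
 *
 * @return array
 */
function getUrlsTrimmed($string)
{
    // Stop at whitespace, either kind of quote, or an angle bracket
    $regex = '/https?:\/\/[^\s"\'<>]+/i';
    preg_match_all($regex, $string, $matches);

    $urls = array();
    foreach ($matches[0] as $url) {
        // Assumption: punctuation at the very end belongs to the sentence, not the URL
        $urls[] = rtrim($url, '.,;:!?)');
    }

    // array_unique() keeps the first occurrence; array_values() reindexes from 0
    return array_values(array_unique($urls));
}

// Made-up sample text: one URL appears twice, each time followed by punctuation
$text = "See http://www.example.org, or <a href='https://www.example.com/file.php'>this</a>. Again: http://www.example.org.";
print_r(getUrlsTrimmed($text));
```

For the sample text this returns two entries, http://www.example.org and https://www.example.com/file.php. Be aware that trimming a closing parenthesis will damage URLs that legitimately end in one, so adjust the rtrim() character list to suit your input.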