good evening dear friends here at devshed
well - I have some trouble with a Perl script that turns out to be not 100% optimal. Now I am trying to find a better solution, either in Perl or Ruby - but if you have ideas for reworking the Perl script, I would be glad too.
The question: Is there a way to specify Net::Telnet timeout with WWW::Mechanize::Firefox?
At the moment my internet connection [actually quite a fast DSL one] is very slow, and sometimes I get this error from $mech->get():
[PHP]command timed-out at /usr/local/share/perl/5.12.3/MozRepl/Client.pm line 186[/PHP]
I tried this one:
[PHP]$mech->repl->repl->timeout(100000);
[/PHP]
Unfortunately it does not work: Can't locate object method "timeout" via package "MozRepl"
The documentation says this should work:
[PHP]$mech->repl->repl->setup_client( { extra_client_args => { timeout => 180 } } );
[/PHP]
The problem: I have a list of 2500 websites and need to grab a thumbnail screenshot (!) of each of them. How do I do that?
I could try to parse the sites with Perl - Mechanize would be a good thing for that.
Note: I only need the results as thumbnails with a maximum of 240 pixels in the long dimension.
At the moment I have a solution which is slow and does not give back thumbnails.
How do I make the script run faster with less overhead - spitting out the thumbnails?
My prerequisites:
[PHP]the mozrepl Firefox addon
the module WWW::Mechanize::Firefox
the module Imager
[/PHP]
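Since I already have Imager, the resize itself should be the easy part: Imager's scale() with type => 'min' fits the image inside a bounding box, which pins the long dimension to 240. Below is a little sketch of the size math so I can check it independently - thumb_dims is just my own helper name, not from any module:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compute thumbnail dimensions so that the longer side becomes $max
# while the aspect ratio is preserved.
sub thumb_dims {
    my ( $w, $h, $max ) = @_;
    my $scale = $max / ( $w > $h ? $w : $h );
    return ( int( $w * $scale + 0.5 ), int( $h * $scale + 0.5 ) );
}

# With Imager itself the resize would then be roughly:
#   my $img = Imager->new;
#   $img->read( data => $png, type => 'png' ) or die $img->errstr;
#   my $thumb = $img->scale( xpixels => 240, ypixels => 240, type => 'min' );
#   $thumb->write( file => $name ) or die $thumb->errstr;
```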
This is my source ... see below a snippet [example] of the sites I have in the URL list.
urls.txt [the list of sources in a file]:
www.google.com
www.cnn.com
www.msnbc.com
news.bbc.co.uk
www.bing.com
www.yahoo.com ... and so on and so forth.
What I have tried already - here it is:
[PHP]
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

open( my $input, '<', 'urls.txt' ) or die "Cannot open urls.txt: $!";
while ( my $url = <$input> ) {
    chomp $url;
    print "$url\n";
    $mech->get($url);
    my $png = $mech->content_as_png();    # full-size screenshot as PNG data

    ( my $name = $url ) =~ s/^www\.//;
    $name .= '.png';
    open( my $output, '>', $name ) or die "Cannot write $name: $!";
    binmode $output;                      # PNG is binary data
    print {$output} $png;
    close $output;
    sleep 5;
}
close $input;[/PHP]
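One more weakness I noticed in my loop above: the URL from the list is used almost verbatim as a filename, so any entry with a path or odd characters would produce a broken open(). A sketch of a safer mapping - url_to_filename is my own name for it:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a URL from the list into a safe .png filename:
# drop a leading scheme and "www.", then replace anything
# that is not alphanumeric, dot or dash with an underscore.
sub url_to_filename {
    my ($url) = @_;
    $url =~ s{^https?://}{};
    $url =~ s/^www\.//;
    $url =~ tr/A-Za-z0-9.-/_/c;
    return "$url.png";
}
```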
Well, this does not care about the size at all.
Here is the command-line output:
[PHP]linux-vi17:/home/martin/perl # perl mecha_test_1.pl
www.google.com
www.cnn.com
www.msnbc.com
command timed-out at /usr/lib/perl5/site_perl/5.12.3/MozRepl/Client.pm line 186
linux-vi17:/home/martin/perl #
[/PHP]
Question: how do I extend the solution to make sure that it does not stop on a timeout? Note again: I only need the results as thumbnails with a maximum of 240 pixels in the long dimension.
As a prerequisite, I already have the module Imager installed.
How do I make the script run faster with less overhead - spitting out the thumbnails?
I also tried out this one here:
[PHP]$mech->repl->repl->setup_client( { extra_client_args => { timeout => 5*60 } } );
[/PHP]
Then I put the links into @list and used eval:
[PHP]
while ( scalar @list ) {
    my $link = pop @list;
    print "trying $link\n";
    eval {
        $mech->get($link);
        sleep 5;
        my $png = $mech->content_as_png();

        ( my $name = $link ) =~ s/^www\.//;    # was "$_", which is empty here
        $name .= '.png';
        open( my $output, '>', $name ) or die "Cannot write $name: $!";
        binmode $output;                       # PNG is binary data
        print {$output} $png;
        close $output;
    };
    if ($@) {
        print "link: $link failed\n";
        push @list, $link;                     # put it back at the end of the list
        next;
    }
    print "$link is done!\n";
}
[/PHP]
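One hazard I see in this retry loop: a permanently dead link gets pushed back onto @list forever, so the script never terminates. Here is a sketch of the same idea with a per-link attempt cap - retry_all and the $fetch callback are my own names; in the real script $fetch would just call $mech->get, which dies on failure:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Try each link up to $max_tries times; a failed link goes to the
# back of the queue so one slow site does not block the rest.
# $fetch is a code ref that dies on failure (like $mech->get does).
sub retry_all {
    my ( $links, $fetch, $max_tries ) = @_;
    my %tries;
    my ( @done, @failed );
    my @queue = @$links;
    while ( my $link = shift @queue ) {
        eval { $fetch->($link); 1 } or do {
            if ( ++$tries{$link} < $max_tries ) {
                push @queue, $link;    # re-queue at the end for another try
            }
            else {
                push @failed, $link;   # give up after $max_tries attempts
            }
            next;
        };
        push @done, $link;
    }
    return ( \@done, \@failed );
}
```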
Question: is there a Ruby / Python / PHP solution that runs more efficiently - or can you suggest a Perl solution that is more stable?
I look forward to hearing from you.
Thanks for any and all help in advance.
Have a great day,
greetings,
your unleash