note to self....

http://blogs.earthside.org/note_to_self/

Wednesday, November 02, 2005

getazlyric.pl

#!/usr/bin/perl -w
#
# getazlyric.pl - gets a lyric from azlyrics.com
#

# SYNTAX
#
# getazlyric.pl <artist> <song> <file>
#
# OPERATION
# This is a wrapper for html2text, azlyricfilter, and tee.
# The lyric is retrieved by html2text, which converts the
# HTML markup to text formatting. The azlyricfilter program
# strips the remaining text formatting, including text
# tagging added by azlyrics.com ("[name] LYRICS" header
# and www.azlyrics.com link text at the bottom).
#
# The cleaned text file is saved to <file> and dumped to
# stdout.
#
# The "Done" message that is sent to stdout is not included
# in the saved file.
#
# PARAMETERS
#
# The format of artist names and song file names
# at azlyrics.com is all lower case, no spaces
# or special characters.
#
# The base URL for azlyrics.com lyric files is
#
# http://www.azlyrics.com/lyrics/<artist>/<song>.html
#
# so to get a file, pass the name of the artist and the
# name of the song formatted as per above (no spaces, lower
# case - do not include the .html on the song name).
#
# Note that it might be a good idea to confirm that the
# artist/song file exists before trying to retrieve it,
# since the default behaviour by azlyrics.com when a
# file is not found is to return an index of some kind.
#
# The filename passed as <file> should be a writable filename
# - no warning is given if <file> is being overwritten.
#
# Revision History:
# 2005-11-01: pdwilso@gmail.com - initial revision
#
# TO DO:
#
# Add command line switch processing to use a "quiet" mode
# where no output is sent to stdout.
#
# Add a switch to output only to stdout - for now,
# use /dev/null as localfile to suppress file output.
#
# Add a test to ensure that <artist>/<song>.html exists on
# azlyrics.com - fail with warning if URL doesn't exist.
#
# Add a switch to issue a warning/query if <file> exists.
#
# ##############################################################
#
# "constants" - command variables
$tee = "tee";
$html2text = "html2text -nobs";
$azfilter="/usr/local/bin/azlyricfilter.pl";
#

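# get parameter values - a wrong argument count prints the
# usage message to stderr and exits with a non-zero status
unless (scalar(@ARGV)==3) {
    print STDERR "Usage: ".__FILE__." artist song localfile\n";
    exit 1;
}
$artist = shift;
$song = shift;
$localfile = shift;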
#
# construct the URL of the song from the artist and song names
$url = "http://www.azlyrics.com/lyrics/$artist/$song.html";
#
# the backticks run the pipeline in a shell and return whatever
# it writes to stdout - tee saves a copy of the filtered text to
# $localfile and also passes it through to stdout, where the
# backticks capture it - the print statement then places that
# captured text on stdout
print `$html2text $url | $azfilter | $tee $localfile`;
#
# done - might want to remove this
print "\n *** Done.\n";
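
The PARAMETERS notes above leave it to the caller to type the artist and song in azlyrics.com form. A minimal sketch of doing that cleanup in Perl might look like the following. The helper names az_name and az_url are made up for illustration and are not part of getazlyric.pl, and keeping digits in the cleaned name is an assumption, since the notes only say "all lower case, no spaces or special characters".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): reduce a
# human-readable artist or song name to the lower-case form
# described in the PARAMETERS notes.
# Assumption: digits are kept; everything except a-z and 0-9 is dropped.
sub az_name {
    my ($name) = @_;
    $name = lc $name;           # all lower case
    $name =~ s/[^a-z0-9]//g;    # no spaces or special characters
    return $name;
}

# Hypothetical helper: build the lyric URL from the base URL given
# in the script's comments. Dies if either name is empty once cleaned.
sub az_url {
    my ($artist, $song) = map { az_name($_) } @_;
    die "artist and song must contain letters or digits\n"
        unless length $artist and length $song;
    return "http://www.azlyrics.com/lyrics/$artist/$song.html";
}

# Example: prints
# http://www.azlyrics.com/lyrics/pinkfloyd/wishyouwerehere.html
print az_url("Pink Floyd", "Wish You Were Here"), "\n";
```

A side effect of cleaning the names this way is that nothing but letters and digits reaches the shell command line in the backtick pipeline. The <file> argument would still need its own check.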




