Markdown Cheatsheet

As a new user of Markdown I was looking for a cheatsheet. For some reason the meaning of cheatsheet has become diluted to mean a bunch of scrolling text that would print out to multiple pages with inconvenient breaks; the first 5 suggestions from Google were all like this. Fortunately Jeremy Stretch has produced a sane PDF cheatsheet that can be viewed or printed out as a convenient single-page desktop reference.

Image of Jeremy Stretch's Markdown Cheatsheet

Jeremy’s blog PacketLife.net has a whole stack of well-formatted IT cheat sheets. They are mostly networking related, but worth a browse, and include a single-page MediaWiki cheatsheet.

Maryspeak: a command line wrapper for MaryTTS

This project and the documentation from my blog are now hosted on GitHub; the links in the post have been changed accordingly.

What is Maryspeak

I’ve been inspired by Ken Fallon via Hacker Public Radio to write maryspeak, a small Java program to make the core features of Mary Text To Speech readily accessible as a shell command. The aim of maryspeak is to reduce the friction for Linux shell users to use MaryTTS. It accepts text input via the command line, via a file or via stdin, processes it using MaryTTS, then outputs speech via sound, to a file, or to stdout. It also allows for the selection of a voice and/or a MaryTTS server.

MaryTTS is written in Java and its interface is not readily accessible from the Linux command line. maryspeak is a wrapper around the MaryInterface classes used by the MaryTTS Java and HTTP clients. Because it’s written in Java there is no reason it cannot also be used on a Windows system, but the command switches are closer to the GNU conventions.

Installing Maryspeak

This installation assumes Debian but the same principles will apply to other distros.

  • Install MaryTTS (see my previous post Marytts Voice Synthesizer How-To for Debian)
  • Download the maryspeak.jar from here or download the source to build it for yourself here
  • Copy the maryspeak.jar file to the MaryTTS library
    $ sudo cp maryspeak.jar /opt/marytts-5.1/lib
  • Create an alias to provide a nice clean usable command. In Debian add the following line to the .bash_aliases file; create the file if it does not already exist.
    alias maryspeak='java -cp "/opt/marytts-5.1/lib/*" -Dmary.base=/opt/marytts-5.1 maryspeak.Maryspeak'
  • Log out of your session and log back in to pick up the new alias; alternatively you can source the .bashrc to refresh the session (Debian’s default .bashrc reads .bash_aliases)
    $ source ~/.bashrc

Using Maryspeak

You can use maryspeak standalone or against a MaryTTS server.

  • For the simplest demonstration of maryspeak working, it can speak an internal default phrase when given the --default parameter
    $ maryspeak --default
  • Say what you want by just appending your text to the command. The full stop at the end is required, else for some reason the speech is too slow.
    $ maryspeak This is a short statement from your computer.
  • If you wish to process the speech on a MaryTTS server use the --host=servername parameter. If your server is on the local machine you can just use --host. It is even possible to use the MaryTTS demo server with --host=mary.dfki.de
    $ maryspeak --host You will need to run the MaryTTS Server locally to hear this spoken.
  • Show the full Maryspeak usage instructions with -h or --help
    $ maryspeak --help
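
Since a missing full stop slows the speech down, a small wrapper can add one automatically. The following is only a sketch: the function names ensure_full_stop and say are hypothetical and not part of maryspeak, the java command is the one behind the alias from the installation section, and it assumes maryspeak treats one quoted argument the same as several separate words.

```shell
# Print the given words as one line, adding a full stop if it is missing.
ensure_full_stop() {
    text="$*"
    case "$text" in
        "")  return 1 ;;                  # nothing to say
        *.)  printf '%s\n' "$text" ;;     # already ends with a full stop
        *)   printf '%s.\n' "$text" ;;    # append one
    esac
}

# Hypothetical helper (not part of maryspeak): speak text with a
# guaranteed trailing full stop. Aliases are not expanded in
# non-interactive scripts, so the full java command is spelled out.
# Assumption: a single quoted argument is handled like separate words.
say() {
    sentence=$(ensure_full_stop "$@") || return 1
    java -cp "/opt/marytts-5.1/lib/*" -Dmary.base=/opt/marytts-5.1 \
        maryspeak.Maryspeak "$sentence"
}
```

With these two functions sourced into the shell, say This has no full stop would pass "This has no full stop." to maryspeak.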

Further Exploration/Exploitation of MaryTTS

Maryspeak only offers a subset of the functionality of MaryTTS. Maryspeak is also not using the streaming capabilities of the MaryTTS library, so it processes things in series: gather input, process the input, then output audio.

MaryTTS has a rich depth of functionality beyond that used by Maryspeak. For serious use I would recommend investigating this.

MaryTTS can be used directly from command line if required via:

$ java -cp "classpath" [properties] marytts.client.http.MaryHttpClient [inputfile]

Assuming MaryTTS is installed as per my previous post, the instructions for usage of this class can be obtained by compiling and running the following java class:

public class ShowUsage {
    public static void main(String[] args) {
        // Print the built-in usage text of the MaryTTS HTTP client
        marytts.client.http.MaryHttpClient.usage();
    }
}

For your convenience I have done this; for MaryTTS version 5.1 this is the output:

usage:
 java [properties] marytts.client.http.MaryHttpClient [inputfile]
Properties are: -Dinput.type=INPUTTYPE
                -Doutput.type=OUTPUTTYPE
                -Dlocale=LOCALE
                -Daudio.type=AUDIOTYPE
                -Dvoice.default=male|female|de1|de2|de3|...
                -Dserver.host=HOSTNAME
                -Dserver.port=PORTNUMBER
 where INPUTTYPE is one of TEXT, RAWMARYXML, TOKENS, WORDS, POS,
 PHONEMES, INTONATION, ALLOPHONES, ACOUSTPARAMS or MBROLA,
 OUTPUTTYPE is one of TOKENS, WORDS, POS, PHONEMES,
 INTONATION, ALLOPHONES, ACOUSTPARAMS, MBROLA, or AUDIO,
 LOCALE is the language and/or the country (e.g., de, en_US);
 and AUDIOTYPE is one of AIFF, AU, WAVE, MP3, and Vorbis.
 The default values for input.type and output.type are TEXT and AUDIO,
 respectively; default locale is en_US; the default audio.type is WAVE.
inputfile must be of type input.type.
 If no inputfile is given, the program will read from standard input.
The output is written to standard output, so redirect or pipe as appropriate.

So for a quick demo try putting some text into a file test.txt (use a full stop at the end of your text or Mary speaks slowly for some reason) then run:

$ java -cp "/opt/marytts-5.1/lib/*" marytts.client.http.MaryHttpClient test.txt | aplay

Note that the output is piped into aplay, which can play back a .wav stream. If you don’t have aplay installed you can redirect the output (>) to a file.wav to play later.
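
The properties from the usage text can also be set explicitly. This is a sketch under assumptions: the function names mary_props and synth_to_wav are hypothetical, the property names are taken from the usage output above, and 59125 is assumed to be the port of a locally running MaryTTS server (it is the port used for the browser address).

```shell
MARY_BASE=/opt/marytts-5.1   # install location used in these posts

# Print the -D properties for a text-to-WAVE request, one per line.
# Usage: mary_props [host] [port]
mary_props() {
    host="${1:-localhost}"   # assumed default: server on this machine
    port="${2:-59125}"       # assumed default: port of the local server
    printf '%s\n' \
        "-Dinput.type=TEXT" \
        "-Doutput.type=AUDIO" \
        "-Daudio.type=WAVE" \
        "-Dlocale=en_US" \
        "-Dserver.host=$host" \
        "-Dserver.port=$port"
}

# Synthesise a text file to a WAVE file.
# Usage: synth_to_wav input.txt output.wav [host] [port]
synth_to_wav() {
    # Word splitting of $(mary_props ...) is intended: one property per word.
    java -cp "$MARY_BASE/lib/*" $(mary_props "$3" "$4") \
        marytts.client.http.MaryHttpClient "$1" > "$2"
}
```

For example, synth_to_wav test.txt test.wav should leave the speech in test.wav, provided the server is running.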

Much more can be done using the full MaryTTS libraries within a Java program.

MaryTTS voice synthesizer How to for Debian

MaryTTS is an open-source, multilingual Text-to-Speech Synthesis platform written in Java (homepage http://mary.dfki.de/). I’ve taken an interest in it after it was featured on Hacker Public Radio Episode 1599. As a Java program it should run anywhere; however, here is how to get it to work on a Debian Linux machine.

Download the MaryTTS runtime package from the link on the download page:
http://mary.dfki.de/download/index.html

$ cd Downloads
$ wget https://github.com/marytts/marytts/releases/download/v5.1/marytts-5.1.zip

Unzip the application to the /opt directory

$ sudo unzip marytts-5.1.zip -d /opt

At this point it will not run unless you have already installed Java 1.7. You can determine the current version of Java by executing:

$ java -version

Install the required version of Java (also add openjdk-7-jdk if you intend to do any Java development):

$ sudo apt-get install openjdk-7-jre

After installing the new java runtime (jre) it will still not be the default. To set the new jre to your default use:

$ sudo update-alternatives --config java

  Selection    Path                                            Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      auto mode
  1            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      manual mode
  2            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      manual mode

Press enter to keep the current choice[*], or type selection number: 2

Having selected option 2, java -version should return something similar to:

$ java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (IcedTea 2.5.1) (7u65-2.5.1-5~deb7u1)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
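
The version check can be scripted. The sketch below uses two hypothetical helper names, java_version_string and java_at_least_7. It assumes the first line of the output has the quoted form shown above, and it only tests for 1.7 or later; it says nothing about whether much newer Java releases run MaryTTS 5.1.

```shell
# Extract the quoted version string from `java -version` output on stdin.
java_version_string() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed if a version string such as 1.7.0_65 or 11.0.2 is 1.7 or later.
java_at_least_7() {
    major="${1%%.*}"      # text before the first dot
    rest="${1#*.}"        # text after the first dot
    minor="${rest%%.*}"   # second component
    if [ "$major" -gt 1 ] 2>/dev/null; then
        return 0          # new-style numbering (9, 11, ...)
    fi
    [ "$major" -eq 1 ] 2>/dev/null && [ "$minor" -ge 7 ] 2>/dev/null
}
```

Note that java -version writes to standard error, so the stream has to be redirected before parsing: java_at_least_7 "$(java -version 2>&1 | java_version_string)" && echo "Java is new enough".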

The runtime package delivers the scripts necessary to run the MaryTTS Server, which can be used via a browser or the client to synthesize speech. The server can be launched with:

$ /opt/marytts-5.1/bin/marytts-server.sh

This can then be used either through a browser or via the MaryTTS Client. The browser address will be:

http://localhost:59125

The MaryTTS Client, which is a Java GUI, can be launched with:

$ /opt/marytts-5.1/bin/marytts-client.sh

In addition to the server and client components there is the MaryTTS Component Installer, which can be used to install additional voices and apply any available updates to the voices (the server comes with a single US female voice as a default). To launch the installer:

$ /opt/marytts-5.1/bin/marytts-component-installer.sh

Once the installer is running click [Update] to fetch the latest selection of voices. Buttons are then available to install or remove voices.

Introduction to Linux course now free online

The Linux Foundation ‘Introduction to Linux’ course is now available via the edX online training system.

This course is a version of the classroom-based course that has been converted to run on the edX system. This is currently the only Linux Foundation course on edX; however, edX does have other IT courses available, supplied by other partner institutions.

Introduction to Linux can be found at https://www.edx.org/course/linuxfoundationx/linuxfoundationx-lfs101x-introduction-1621.

This course is ideal for newcomers to Linux and will help fill a few gaps for those who have dabbled a bit but have not been formally trained. It is truly a beginner’s course though, and those with any experience of Linux will find themselves skipping through the earlier parts wondering if it’s worthwhile. However, once past the adverts for Linux in the early part it proves to be useful and covers a breadth of topics, giving the learner enough to take things further.

The course uses example distros from SUSE, CentOS and Ubuntu. To get the most value from the course I would recommend creating 3 virtual machines to play with as you proceed, since it’s useful to appreciate the differences between the distros.

You can take the course for free or you can pay for a validated certificate at the end. You also have the option of signing up for a free ‘honour’ certificate where your identity has not been validated by edX but a certificate is issued anyway.

Do you remember floppy disc head cleaners?

5.25″ floppy drive

Actually floppy disc head cleaners have an interesting story. Back in the late 80’s and early 90’s they were a pretty rubbish product: boring, low tech and low profit. One company evolved this into a top-selling minor triumph of technology.

Back when floppy discs were the main storage game in town and all PCs ran DOS, the heads of floppy disk drives used to get clagged up and then wouldn’t be able to read the data from the disks. In an industrial environment this might be due to oily residue or other grime, but in those less enlightened times the worst culprit was tobacco smoke residue, which the fans in the computers used to suck in, or blow out, through the slot in the front of the drives. Regular cleaning was required to remove the varnish-like layer from the drive’s heads.

The original low tech head cleaners were a disk of stiff cloth in a floppy disk jacket onto which you would decant a few drops of cleaning solvent. On inserting the disk you’d get a very brief period of cleaning whilst the drive tried to find the index track of the disc, then an annoying prompt appeared on the screen announcing a disk error as the PC failed to boot from the disc. They were next to useless; very little real cleaning took place and then the operator (often a secretary, as these machines were generally used as word processors) would get a meaningless and often confusing message about a disk error.

So from a user standpoint the original floppy disc cleaners were almost useless, and from a marketing point of view they weren’t great either: once someone bought one, the only re-sale opportunity was if they lost the thing.

Telematic Micro Limited were a company specialising in floppy disc test and maintenance systems which would enable the head alignment to be adjusted and set up on early floppy drives. At several hundred pounds a pop, these were not something I was likely to sell a lot of. However these guys really understood floppy disc drives. When I pointed out to them the opportunity for a better floppy disc head cleaner they came up with HeadMax, the intelligent floppy cleaner for PCs.

They created a floppy disc with a mildly abrasive cleaning surface printed onto the inner 40 tracks and a disc cleaning programme, derived from their disc test software, on the outer tracks. This solved the user experience problem: the disc would boot and load a cleaning program which measured the head signal from a test track, then applied a cleaning cycle repeatedly until the signal from the head was good and the disk heads were clean. Each cleaning cycle used up one of the 40 cleaning tracks and showed a diagnostic display of the drive. This was the ideal floppy cleaner: it had a good user experience with a nice onscreen display of the cleaning progress, it could be sold for a premium price, and it counted down its usage and prompted a repurchase when used up, which was great for repeat sales. The product was an amazing success and did great business. Later versions, called DriClean, were produced for Windows 95.

Amazingly the Telematic Micro web site is still online, preserved from 1998, at telematic.co.uk, where you can see the original products, their marketing blurb, and a brief history of the floppy disc.

Installing or updating VirtualBox Client Additions in Crunchbang

If you are running Crunchbang, or any other Linux distro, in VirtualBox, it’s really useful to add the VirtualBox Additions. This provides access to shared directories on the host file system, dynamic desktop resizing, clipboard sharing and access to the USB devices on the host.

Initially it was not obvious how to get this to install in Crunchbang. There were a number of things to overcome: the user account does not have sufficient permission to execute the installer on the disk, and by default the software required to compile the Linux kernel modules is not installed.

So this is what I found to be successful:

  • In the VirtualBox window menu, select Devices -> Insert Guest Additions CD image …
  • In CrunchBang open the File Manager; this will force the CD image to automount
  • Close the File Manager and any accompanying notifications
  • Open the terminal and sudo to a root prompt
    $ sudo su - root
  • Update the system to ensure that all the installed packages are up to date
    # apt-get update
    # apt-get upgrade
  • Now add the packages required for the Kernel Module compilation
    # apt-get install build-essential module-assistant
  • Then run the VirtualBox Client Additions installation
    # sh /media/cdrom/VBoxLinuxAdditions.run
  • Having installed the Additions, exit root (Ctrl-D) and close the terminal (Ctrl-D)

To install the inevitable updates, follow the same process but omit the apt-get install command.
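
After a reboot it is worth confirming that the kernel modules actually loaded. This is a minimal sketch: module_listed and check_additions are hypothetical helper names, and vboxguest and vboxsf are the module names I would expect the Additions to provide on a Linux guest.

```shell
# Succeed if the named kernel module appears in lsmod-style output on stdin.
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Report on the main Guest Additions modules (names are an assumption).
check_additions() {
    for m in vboxguest vboxsf; do
        if lsmod | module_listed "$m"; then
            echo "$m: loaded"
        else
            echo "$m: not loaded"
        fi
    done
}
```

Running check_additions inside the guest prints one line per module; a "not loaded" line suggests the module compilation step failed.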

The BBC: good value?

I grew up before the days of the Internet, listening to BBC Radio 4; much of what I learned of society, history and science has been from the radio. The BBC continues to produce high quality, informative, educational and entertaining programmes, many of which I consume as podcasts. Currently I have the following BBC podcasts in my feed.

  • In Our Time – Melvyn Bragg  (BBC Radio 4)
  • Outriders (BBC Radio 5)
  • Comedy of the Week (BBC Radio 4)
  • Mark Kermode and Simon Mayo’s Film Reviews (BBC 5 Live)
  • Friday Night Comedy (BBC Radio 4)
  • Discovery (BBC World Service)
  • More or Less: Behind the Stats (BBC Radio 4)

These are paid for by the TV Licence. For my £145.50 a year I get these podcasts, all the TV and Radio output of the BBC, iPlayer and all the related supporting material on the Internet. Oh yes, and the rest of my household get all that too.

The BBC have had a massive positive influence on the adoption of technology and attitudes towards it in the UK.

In the early 80’s Acorn found its feet making the BBC Micro and went on to spawn ARM, whose microprocessor architecture is now in everybody’s iPhones, iPads and Android devices and masses of other equipment.

In 1986 the BBC Domesday Project gave us a vision of the future of multimedia presentation, with 1 million people contributing to a digital archive of the UK.

In the 90’s the BBC website set the standard for presentation of quality on-line news and web content.

BBC iPlayer pushed the boundaries for streamed media and once again set the standard for other broadcasters to follow.

More recently we’ve seen the digital switch-over, in which the BBC played a major role.

With a few notable exceptions, having the BBC there, providing content and spurring technical innovation, raises the bar for the rest of the media. It’s a shame that the recent issues with senior management, remuneration, golden handshakes etc. give politicians the opportunity to question its continued funding.

Removing unwanted posts from Stikked paste bin

Stikked is a pretty handy open source paste-bin application that I’ve been using as a private paste-bin. It lets you set a time-out on a paste so that it cleans up after you’ve used the paste. Unfortunately I occasionally leave it set at ‘Keep-Forever’. It doesn’t have an admin interface or a delete option for the pastes, so over time these unwanted pastes build up in the database.

I failed to find a guide to deleting posts from Stikked on the web, though there were people who had asked how they could remove unwanted posts. So I put this short guide together.

In the browser display the paste that you wish to delete. The paste ID is shown at the end of the URL; make a note of the paste ID(s) you wish to remove. The paste IDs appear as an eight-character hexadecimal string such as 8ae4ae1a.

Now we have the IDs we can log into the command line on the server and delete the pastes directly from the mysql database.

Log into mysql (you will need to have the root password or an account that has rights over the database).

$ mysql -u root -p

Change to the correct database

List the databases

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| paste              |
| performance_schema |
| phpmyadmin         |
| test               |
+--------------------+
6 rows in set (0.00 sec)

In this case the database was called ‘paste’. Switch to the correct database.

mysql> use paste;

Show the tables in the paste database

mysql> show tables;
+-----------------+
| Tables_in_paste |
+-----------------+
| blocked_ips     |
| ci_sessions     |
| pastes          |
| trending        |
+-----------------+
4 rows in set (0.00 sec)

Stikked has a relatively simple schema with only 4 tables, only two of which relate to pastes.

Describe the pastes table

mysql> describe pastes;
+--------------+---------------------+------+-----+---------+----------------+
| Field        | Type                | Null | Key | Default | Extra          |
+--------------+---------------------+------+-----+---------+----------------+
| id           | int(10)             | NO   | PRI | NULL    | auto_increment |
| pid          | varchar(8)          | NO   | MUL | NULL    |                |
| title        | varchar(50)         | NO   |     | NULL    |                |
| name         | varchar(32)         | NO   |     | NULL    |                |
| lang         | varchar(32)         | NO   |     | NULL    |                |
| private      | tinyint(1)          | NO   | MUL | NULL    |                |
| raw          | longtext            | NO   |     | NULL    |                |
| created      | int(10)             | NO   | MUL | NULL    |                |
| expire       | int(10)             | NO   |     | 0       |                |
| toexpire     | tinyint(1) unsigned | NO   |     | 0       |                |
| snipurl      | varchar(64)         | NO   |     | 0       |                |
| replyto      | varchar(8)          | NO   | MUL | NULL    |                |
| ip_address   | varchar(16)         | YES  | MUL | NULL    |                |
| hits         | int(10)             | NO   | MUL | 0       |                |
| hits_updated | int(10)             | NO   | MUL | 0       |                |
+--------------+---------------------+------+-----+---------+----------------+
15 rows in set (0.00 sec)

The pastes table holds the actual pastes in the ‘raw’ field which is a longtext string. If we have the paste id from the URL we can select for that paste.

If you wish you can check that you have the right paste before deleting the item.

mysql> SELECT pid,title,name FROM pastes WHERE pid='8ae4ae1a';

If you need to see the content to be sure, you can add some of the raw data (100 characters) to your query. This will probably not look very pretty, as any return characters will disrupt the layout.

mysql> SELECT pid,title,name,left(raw,100) FROM pastes WHERE pid='8ae4ae1a';

Once you are sure that this is the post to delete execute the delete statement with the same ‘where’ clause.

mysql> DELETE FROM pastes WHERE pid='8ae4ae1a';

This will have removed your paste. To be tidy the entries for this paste in the trending table can also be removed.

mysql> DELETE FROM trending WHERE paste_id='8ae4ae1a';

If you have a list of items to remove you can use this alternate where clause, using the ‘in’ keyword.

mysql> DELETE FROM pastes WHERE pid IN ('8ae4ae1a','<p-id2>',...,'<p-idn>');
mysql> DELETE FROM trending WHERE paste_id IN ('8ae4ae1a','<p-id2>',...,'<p-idn>');
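
Typing a long IN list by hand invites mistakes, so the two statements can be generated from a list of IDs. This is a sketch only: is_paste_id and delete_sql are hypothetical helper names, the IDs are assumed to be eight lowercase hexadecimal characters as in the example above, and the trending column is taken to be paste_id as in the single-paste statement.

```shell
# Succeed if the argument looks like a paste ID (8 lowercase hex characters).
is_paste_id() {
    case "$1" in
        [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f])
            return 0 ;;
        *)  return 1 ;;
    esac
}

# Print the two DELETE statements for the given paste IDs.
# Refuses anything that is not a well-formed ID, so stray quotes or
# other characters can never reach the SQL.
delete_sql() {
    list=""
    for id in "$@"; do
        is_paste_id "$id" || { echo "bad paste id: $id" >&2; return 1; }
        list="${list:+$list,}'$id'"
    done
    [ -n "$list" ] || return 1
    printf 'DELETE FROM pastes WHERE pid IN (%s);\n' "$list"
    printf 'DELETE FROM trending WHERE paste_id IN (%s);\n' "$list"
}
```

Review the output first, then pipe it into the client from the shell, for example delete_sql 8ae4ae1a 0123abcd | mysql -u root -p paste (0123abcd is a made-up ID).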

Quit mysql

mysql> quit

Other cases for deleting pastes would be where your paste bin has been spammed, in which case you could select pastes to delete based on the spammer’s ip_address. In this case you would need to tidy up the trending table with a more resource hungry where clause.

mysql> DELETE FROM trending WHERE paste_id NOT IN (SELECT pid FROM pastes);

Free on-line services, or do you roll your own?

The first time I was hit by a ‘free’ service being withdrawn was probably the worst time. It was back in the late 90’s when I used a ‘free’ on-line chat forum service as an adjunct to a website I created for my old school’s reunion. This service was withdrawn with 3 days’ notice and the site’s popularity never really recovered from the blow.

For the most part these ‘free’ services are monetised through the user’s identity being sold as a merchandising opportunity. Sometimes these services are used as a taster or sales lead for a paid service with higher functionality, quality or quantity of the service. Some ‘free’ services are closer to being genuinely free, such as those offered by open source projects in support of their activities, although increasingly the commercial realities of running these services lead to advertising support of some kind.

Given my early experience with the chat-room, I’ve always used these services with my eyes half-open knowing that I’m not getting something for nothing, making a calculated decision about the compromise being made and the relative security of the content invested in the service. Over the years I’ve been caught out a number of times with ‘free’ services being withdrawn. More recently it has been Google services being withdrawn, admittedly with adequate notice, but all the same, with some inconvenience.

This year I’ve started to run or manage for myself some of the on-line services I use. The first was this blog. Previously I used Posterous and Google’s Blogger; the Posterous service was withdrawn and, after the withdrawal of Google Reader, I’m not convinced that Blogger will be around for the long term.

There’s a cost to running your own services, either paying for a hosting service or running your own machine from home. Although it might seem like a cost-free option to run your services from a home PC, in reality the cost of the electricity to leave it running 24/7 (approx. £1 per watt per year) may outstrip the cost of a low-cost shared hosting service, which will also usually provide some form of additional backup and resilience. If you can squeeze your requirements into an ARM-based machine like a Raspberry Pi or a low-powered Atom-based machine it might be worth running your own. So unless you have some very particular requirements it’s worth considering a hosted solution.
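
The £1 per watt per year rule of thumb is easy to check: a constant 1 W load uses 24 × 365 = 8,760 Wh, or 8.76 kWh, in a year. The sketch below does the arithmetic; the function name annual_cost is hypothetical and the 11.4p per kWh unit price in the examples is an assumed figure, picked only because it reproduces the £1 rule.

```shell
# Annual electricity cost in pounds for a device drawing a constant load.
# Usage: annual_cost WATTS PENCE_PER_KWH
annual_cost() {
    awk -v w="$1" -v p="$2" 'BEGIN {
        kwh = w * 24 * 365 / 1000        # energy used in a year, in kWh
        printf "%.2f\n", kwh * p / 100   # convert pence to pounds
    }'
}
```

At that assumed price, annual_cost 1 11.4 gives 1.00, and a home PC idling at a (hypothetical) 60 W gives annual_cost 60 11.4, about £60 a year, several times the cost of a cheap shared hosting plan.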

When I started out I just wanted to run this Blog and I managed to buy a year’s hosting from Tsohost for £12.99, although it would be £14.99 without the discount code I used. So far the service has been good.

The other services I’ve started to run for myself are a Google Reader replacement and a Paste Bin, but there’s a long list left where I need to decide on whether to continue to accept the compromise.