Sqoop Error – Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

While importing data from a MySQL table I came across this error:


This was using Cloudera 5.0.2. The Sqoop command being run worked fine on the command line but failed when scheduled through Oozie. As with most Hadoop logs there was little in them to suggest what the actual error was, so in order to give myself a chance of finding it I enabled verbose logging in the Sqoop command.
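For reference, Sqoop's verbose switch is --verbose; a sketch of an import with it enabled (the connection string, credentials, table and target directory below are placeholders, not details from the original job):

```shell
# Hypothetical import with verbose logging turned on; every connection
# detail here is a placeholder -- substitute your own.
sqoop import --verbose \
  --connect jdbc:mysql://dbhost/salesdb \
  --username loader -P \
  --table orders \
  --target-dir /user/flume/orders
```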

This revealed that Oozie was unable to find mysql-connector.jar and derby.jar (used for interfacing with Hive). Once I added these to the workflow directory in HDFS the job completed successfully.
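Staging the jars can be scripted. A minimal sketch (the workflow path and jar locations are examples, not from the original post) that puts the driver jars into the workflow's lib/ directory in HDFS, where Oozie picks them up:

```shell
# Copy extra jars into an Oozie workflow's lib/ directory in HDFS so
# SqoopMain can find them. Paths passed in are examples only.
stage_jars() {
    wf="$1"; shift
    hdfs dfs -mkdir -p "$wf/lib" || return 1
    for jar in "$@"; do
        hdfs dfs -put -f "$jar" "$wf/lib/" || return 1
    done
}

# e.g. stage_jars /user/me/sqoop-wf \
#        /usr/share/java/mysql-connector-java.jar /usr/share/java/derby.jar
```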



Sqoop export failing in CDH5

When doing an export using Sqoop I got this error:

root@c2nn1:/mnt/usb/tweets# sqoop
Warning: /opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/bin/../lib/sqoop/bin/sqoop: line 101: /usr/lib/hadoop/bin/hadoop: No such file or directory

This can be resolved by running

rm -rf /usr/lib/hadoop

as there are no files in that directory. With the empty directory removed, the sqoop wrapper script no longer tries to use the non-existent /usr/lib/hadoop/bin/hadoop.
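Since rm -rf on a system path is unforgiving, a small guard is worth having; this sketch (the helper name is mine, the path comes from the error above) only deletes the directory when it really is empty:

```shell
# Remove a directory only if it exists and contains nothing at all;
# print "skipped" and leave it alone otherwise. rmdir refuses to
# delete a non-empty directory, which adds a second safety net.
remove_if_empty() {
    dir="$1"
    if [ -d "$dir" ] && [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        rmdir "$dir" && echo "removed"
    else
        echo "skipped"
    fi
}

# remove_if_empty /usr/lib/hadoop
```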


Ambari hangs when installing Hadoop

While installing Hadoop using the automated install method offered by Ambari, the process would repeatedly reach 4% and then fail. Investigation of the logs showed that this was down to Puppet timing out, and further digging revealed the cause: the server was running on a DHCP address. Once this was changed to a static IP the install continued as normal.
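For reference, a static address on a RHEL/CentOS host (Ambari's usual target; all addresses below are placeholders) looks something like this in /etc/sysconfig/network-scripts/ifcfg-eth0:

```
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static   # was "dhcp"
IPADDR=192.168.1.10
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
```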


EE won't replace faulty iPhone 5

After many pointless emails to EE and their Executive Office they have finally issued their final word on the matter of replacing a faulty iPhone 5 battery:

As EE like to ignore the law I sent this to their CEO, Olaf Swantee:

Having searched online it is clear that we are not the only people fighting with EE to replace or repair a faulty iPhone 5. To date we have received nothing from EE apart from the email above and continue to have a faulty iPhone, with no reply from Olaf or the “Executive Office”.


MapReduce and Flume's .tmp files

While using one of the many online guides for streaming tweets into Hadoop (particular thanks to ) it became apparent that as the volume of tweets increased MapReduce would fail to run any queries against the data. Typical errors would be:

Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /user/flume/tweets/FlumeData.1395159404667.tmp

This is because Flume renames the .tmp files to their final file names while MapReduce is still holding references to the .tmp paths for processing. The way to combat this is to get MapReduce to ignore .tmp files, which requires the creation of a jar file and some amendments to hive-site.xml. Not being a Java programmer it took me a while to understand the process, and most guides relied on using Eclipse; being an Ubuntu user I prefer NetBeans. Here is the precompiled jar file that I created, based on the Jira issue for this.

The amendments to hive-site.xml are:

hive.aux.jars.path = file:///usr/lib/hadoop/twitterutil-1.0-SNAPSHOT.jar
mapred.input.pathFilter.class = com.twitter.util.FileFilterExcludeTmpFiles

These either need to be entered via Ambari or added, with the proper XML formatting, directly to hive-site.xml.
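With the XML formatting, the two properties from above look like this in hive-site.xml:

```xml
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/lib/hadoop/twitterutil-1.0-SNAPSHOT.jar</value>
</property>
<property>
  <name>mapred.input.pathFilter.class</name>
  <value>com.twitter.util.FileFilterExcludeTmpFiles</value>
</property>
```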

One of the caveats I noticed was that the jar file needed to be under /usr/lib/hadoop rather than under /usr/lib/hadoop/lib like most of the other jar files.


Load CSV in Hue

When creating a table in Hive based on a large CSV file the following error occurs:

IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[,], original=[,]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.

This is caused by the replication factor in hdfs-site.xml being set higher than the number of available datanodes. The default value of dfs.replication is 3; dropping it to 2 (even with a 1-node cluster) resolved the issue.
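In hdfs-site.xml the property behind this is dfs.replication (default 3); the change described above would be (assuming nothing else overrides it):

```xml
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```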


Observium on FreeBSD

This is my guide to installing Observium on FreeBSD. I am using 9.1 patched with all the latest updates.

For some reason Observium would only run from /opt, so these are the commands I used to get a base install:

mkdir /opt
cd /usr/ports/devel/subversion
make clean install
cd /opt
svn co http://www.observium.org/svn/observer/trunk/ observium
cd /opt/observium/
mv config.php.default config.php
mkdir rrd

At this point you need to log in to the MySQL server and create a database and user for Observium; once you have this you need to edit config.php with these details.
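The database and user can be created with something along these lines (the database name, user name and password are placeholders; pick your own and put the same values into config.php):

```sql
-- Placeholder names and password; match these in config.php
CREATE DATABASE observium DEFAULT CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON observium.* TO 'observium'@'localhost'
    IDENTIFIED BY 'changeme';
FLUSH PRIVILEGES;
```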

php includes/update/update.php

appears to install the database. At this point I got a few errors about paths to things such as RRD, fping etc., so you need to edit includes/defaults.inc.php to set them correctly. Once that's all done you're ready to add the first user; the command is:

php ./adduser.php username password 10

I'm running this on a server with other sites, so I created a virtual host config and browsed to the URL, and all appeared OK. Upon trying to use it I was informed that I needed mcrypt from php5-extensions and some Python pieces. The extensions are difficult to install, but to install the Python part the command was:

cd /usr/ports/*/py-MySQLdb && make install clean


harfbuzz failing on FreeBSD

While trying to install cacti on a production FreeBSD 9.1 server, the build failed with:

/usr/local/lib/libXrender.so: undefined reference to `_XEatDataWords'
gmake[2]: *** [hb-view] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[2]: Leaving directory `/usr/ports/print/harfbuzz/work/harfbuzz-0.9.19/util'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/usr/ports/print/harfbuzz/work/harfbuzz-0.9.19'
gmake: *** [all] Error 2
===> Compilation failed unexpectedly.
Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
the maintainer.
*** [do-build] Error code 1

Stop in /usr/ports/print/harfbuzz

It turns out that you need to reinstall libXrender from ports.
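Assuming the stock ports tree layout (the port should live under x11/libXrender; verify the path on your system), the reinstall looks like:

```shell
# Rebuild and reinstall libXrender from ports, then retry harfbuzz.
cd /usr/ports/x11/libXrender
make deinstall reinstall clean
cd /usr/ports/print/harfbuzz
make clean install
```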


Xubuntu VNC Session

I recently had the need for a remote VNC server. I found a really good guide here, but that was for Gnome; being a fan of Xubuntu I wanted to use that instead. Once I had the VPS details I logged in, installed the xubuntu-desktop package and followed the guide. I had real trouble trying to get VNC to start with Xubuntu; after much googling I found that the xstartup needed to be:


#!/bin/sh
# Uncomment the following two lines for normal desktop:
#exec /etc/X11/xinit/xinitrc

[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
vncconfig -iconic &
x-terminal-emulator -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
x-window-manager &
startxfce4 &
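With that saved as the xstartup (typically ~/.vnc/xstartup) and made executable, a session can be started with something like this (display number and geometry are just examples):

```shell
chmod +x ~/.vnc/xstartup
vncserver :1 -geometry 1280x800 -depth 24
# connect a viewer to host:5901; stop the session with: vncserver -kill :1
```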


Ubuntu 13.04

So 13.04 is here, but what's the big deal?

Well, visually none, but the biggest feature I notice is the “Friends” app, a replacement for Gwibber. Yes, it's nice to have all the new features, but the one thing that Ubuntu truly needs is an Exchange-compatible email client. Yes, I know about the workarounds like Davmail, but that's a hack in my book. While I still use Ubuntu across all my work and personal devices, including my tablet, this is the one feature that I could really use.

Looking forward to Saucy Salamander though!