Heroku pg:psql to CSV export (locally)

Heroku has a nice UI for this called Dataclips, but IMHO the command line still beats it. I needed to run an SQL query and export the results as CSV on my local machine, and here is a quick little snippet to do just that:

heroku pg:psql -c "\COPY (SELECT id, name, email FROM users WHERE created_at BETWEEN '2016-01-01' AND '2016-12-31' ORDER BY created_at ASC) TO STDOUT CSV DELIMITER ',' HEADER" > users-2016.csv

Hive’s LOAD DATA fails to import many files with exception in org.apache.hadoop.hive.ql.exec.CopyTask

Interesting issue I came across recently while loading a large set of files coming from Localytics into Hive using Hive’s command line interface. The script that loads the data basically contained something like 60k LOAD DATA statements that Hive was supposed to execute, loading data from each file into a table. This was all running smoothly on ElasticMapReduce until a seemingly random exception caused it to fail:

Failed with exception null
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.CopyTask

After some investigation I noticed that the number of open files during the process was growing like crazy until it hit the limit, which apparently was the root cause of the exception.

Some more googling around brought me to this unresolved bug report: https://issues.apache.org/jira/browse/HIVE-2485

I guess it’s not the most critical issue for the Hive folks, but still not a pleasant one. My workaround was to split the .q files into chunks no larger than 28000 lines and iterate over them.
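For illustration, the splitting step can be sketched like this (file names, paths and the table name are placeholders, not the actual ones from my setup):

```shell
# Fabricate a stand-in for the real 60k-statement script
# ("load_all.q", the paths and the table name are placeholders):
for i in $(seq 1 60000); do
  echo "LOAD DATA INPATH '/data/part-$i' INTO TABLE events;"
done > load_all.q

# Split into chunks of at most 28000 lines each
split -l 28000 -d load_all.q load_chunk_

# Each chunk then runs as its own Hive invocation, so the
# open-file count resets between runs:
#   for f in load_chunk_*; do hive -f "$f"; done
ls load_chunk_*
```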

Simple chrooted FTP setup on EC2 micro instance

Source environment: Ubuntu

1. Install vsftpd
apt-get install vsftpd

2. Edit default config at /etc/vsftpd.conf

Make sure you enable these:

local_enable=YES
write_enable=YES
chroot_local_user=YES
chroot_list_enable=YES
chroot_list_file=/etc/vsftpd.chroot_list

Ensure this is disabled:

anonymous_enable=NO

and add the following to the end (example ports):

pasv_enable=YES
pasv_min_port=12000
pasv_max_port=12100

max, min ports could be anything high enough not to overlap with other services. Those ports will also need to be open in your security group if you’re using EC2.
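After saving the config, restart vsftpd so the changes take effect (Ubuntu’s init service name for the daemon is vsftpd):

```shell
service vsftpd restart
```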

3. Create/edit /etc/vsftpd.chroot_list
Add usernames that you don’t want to chroot.

4. Create users for FTP access:

adduser USERNAME

5. Ensure the home folder of a user is not writable(!) This is new since vsftpd 2.3.5, I believe.

chmod a-w /home/USERNAME

6. Create folders under /home/USERNAME for a user to upload stuff to, since a user won’t be able to upload to the root of /home/USERNAME
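A minimal sketch of steps 5 and 6 together (a scratch directory stands in for the real home here; in practice you would run this against /home/USERNAME as root and chown the upload folder to the FTP user):

```shell
home=$(mktemp -d)         # stands in for /home/USERNAME
mkdir "$home/uploads"     # writable landing area for FTP uploads
chmod a-w "$home"         # the chroot root itself must stay non-writable
ls -ld "$home" "$home/uploads"
```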

Ubuntu 11.10 or 12.04 fail to boot after upgrade due to software raid degrade/failure.

Since I first started using Ubuntu back in ’09 with 9.04, I have had issues with my software RAID array roughly every other time I try to upgrade to a newer version. Almost every time the issue lies in GRUB not being able to install/update itself properly, so I end up just doing that manually from the rescue disk – a process I have unintentionally learned by heart.

This time around, when I upgraded from 11.04 to 11.10, it was a different issue. The system failed to boot and dropped into initramfs/BusyBox with a failure to assemble one of the software RAID arrays. Apparently an update introduced in 11.10 (I believe) prevents the system from booting if there is any software RAID array that it could not assemble fully. This could be an issue if, for example, your drives got mixed up or, as in my case, there was one older RAID array defined that was never properly removed but was always deactivated.

There is a pretty long, yet interesting conversation here on this matter: https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/872220

The way to solve this for me was to hit Ctrl-D when it dropped into initramfs/BusyBox, select ‘root shell’ and fix the issue – properly deactivate the array I didn’t need and fix my working RAID array, which had degraded and needed to resync.
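For reference, the commands from the root shell looked roughly like this – a sketch, not a recipe, since the device names (/dev/md0, /dev/md1, /dev/sdb1, /dev/sdc1) are placeholders for my layout; check cat /proc/mdstat for yours first:

```shell
cat /proc/mdstat                    # see which arrays exist and their state

# Deactivate the stale array so it no longer blocks assembly at boot
mdadm --stop /dev/md1
mdadm --zero-superblock /dev/sdc1   # wipe the old member's superblock

# Re-add the dropped member to the degraded-but-working array; it resyncs
mdadm /dev/md0 --add /dev/sdb1

# Record the current layout and rebuild the initramfs so boot matches it
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u
```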

Oh, well… The Ubuntu upgrade process is still not there.

WordPress MU Upgrade and Permalink 404 Errors

The WordPress MU upgrade (from 2.x to 3.2.1) was a surprisingly simple process! Having completed it in a matter of a couple of hours for a fairly large blogging network, I was a happy camper right up to the moment when permalinks started giving 404s.

What followed was a painstaking process where I verified every single aspect of the configuration, from Apache’s mod_rewrite setup to .htaccess rules to WordPress’s network site configs. Everything looked correct.

After googling around for a good hour I came across this site, which pointed to the incompatibility of some plugins with WP3. In my case the problem lay in a plugin called Top Level Categories, which I had to disable to get the permalinks working.