Stefan Talpalaru, CTO

PEBKAC

It's official: prolonged sitting is a risk factor for all-cause mortality, independent of physical activity. Time to switch to a stand-up desk if you haven't done so already.


Category: 42


the worst die young

My Samsung SpinPoint F3 (HD103SJ) hard disk started exposing bad blocks to the operating system after 5210 hours of use (10 months from installation). For some reason this killed the kernel and compromised some filesystem structures, preventing booting. At this point, the recommended procedure is to initiate the recovery immediately, copying whatever blocks you can copy to a new disk, but I was reluctant to buy a new hard disk now that prices have doubled, so I went ahead and ran fsck from SystemRescueCD. It worked, with the loss of a few replaceable files, but new bad blocks kept appearing so I caved in and bought a Seagate Barracuda 7200.12 (ST31000524AS) with the same 1TB capacity.


Using SystemRescueCD again I decided to clone individual partitions with GParted. It’s easier to clone the whole disk, but since the original GPT partition table was created on OS X, it wasted about 1.5GB on empty space between partitions (for future super-secret use or whatever reason the appletards have for doing it) and I wanted that space back. So I did the GPT creation and partition copy/paste dance with GParted, only to see it crash when told to retry the bad blocks. Sure enough, that wasn’t the tool I was looking for. Some more digging and I found ddrescue (not to be confused with dd_rescue). With some hints from the Forensics Wiki I was able to come up with the following strategy (sda is the bad disk, sdb is the good one):


    ddrescue -f -n /dev/sda1 /dev/sdb1 logfile1
    ddrescue -f -d -r3 /dev/sda1 /dev/sdb1 logfile1

The first command skips the bad blocks (-n) for a quick copy while the second one tries hard to retrieve them with direct access (-d) retrying for a maximum of 3 times (-r3). I ran the first command in a shell loop for all the partition pairs, and the second one for the 3 that reported bad blocks. One of them was recovered completely (after more than 3 retries, though), one still had 512 bytes left after a few runs and the last one got stuck on the ‘splitting’ phase for more than 15 minutes (weird behavior for approximately 16KB of unreadable data, but it seems that letting ddrescue run for hours/days is not uncommon).
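The per-partition loop described above could be sketched in Python roughly as follows. This is only an illustration: the function name, the three-partition layout and the logfileN naming are assumptions, and nothing is executed against a disk. The block only builds the argument lists, which you could hand to subprocess.check_call() once you are sure about the device names.

```python
def ddrescue_commands(pairs, retries=3):
    """Build the two ddrescue passes for each (source, target) partition pair.

    Returns a list of (quick_pass, retry_pass) argument lists. Nothing is
    executed here.
    """
    commands = []
    for index, (source, target) in enumerate(pairs, start=1):
        logfile = 'logfile%d' % index  # one log file per partition pair
        # Pass 1: quick copy that skips the bad blocks (-n).
        quick = ['ddrescue', '-f', '-n', source, target, logfile]
        # Pass 2: direct access (-d), retrying the bad blocks (-r<N>).
        retry = ['ddrescue', '-f', '-d', '-r%d' % retries, source, target, logfile]
        commands.append((quick, retry))
    return commands


# Hypothetical layout: three partitions on each disk.
pairs = [('/dev/sda%d' % n, '/dev/sdb%d' % n) for n in range(1, 4)]
for quick, retry in ddrescue_commands(pairs):
    print(' '.join(quick))
    print(' '.join(retry))
```

The second pass is only worth running for the partitions whose first pass reported bad blocks, and both passes have to share the same log file so that ddrescue can pick up where it left off.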


After the recovery, I repaired the filesystems on sdb, eliminating the bad block lists copied from sda:


    fsck.ext4 -f -y -L /dev/null /dev/sdb1


Category: Linux


checking the availability of a zeromq endpoint

Zeromq sockets are the cat's meow but they do not provide connection status information. Of course, this is by design, so messages can be sent before the endpoint starts listening. There's some transparent buffering going on and in most cases this behavior is actually desired. Sometimes, though, we need to take different actions based on the availability of a server. A background task server, for example. If the server is not alive we might want to run the task in foreground instead of ditching it or letting it wait in the send buffer (think of a short lived client process).

The solution lies in what the zeromq guide calls the Lazy Pirate Pattern - make a pair of request-reply sockets, have the client make a request and wait for a reply within a given timeout. The example there is a bit too complicated, so here's a simplified version:

    #!/usr/bin/env python

    import argparse

    import zmq
    from zmq.eventloop import ioloop

    context = zmq.Context()
    ALIVE_URL = 'tcp://127.0.0.1:5556'
    ALIVE_TIMEOUT = 1000  # in ms

    def alive_handler(alive_socket, *args, **kwargs):
        # the content of the request doesn't matter, any message gets a reply
        request = alive_socket.recv()
        reply = b'pong'
        alive_socket.send(reply)

    def server():
        print('starting the server')
        alive_socket = context.socket(zmq.REP)
        alive_socket.bind(ALIVE_URL)

        io_loop = ioloop.IOLoop.instance()
        io_loop.add_handler(alive_socket, alive_handler, io_loop.READ)
        io_loop.start()

    def client():
        print('starting the client')
        alive_socket = context.socket(zmq.REQ)
        alive_socket.connect(ALIVE_URL)
        poll = zmq.Poller()
        poll.register(alive_socket, zmq.POLLIN)
        request = b'ping'
        alive_socket.send(request)
        # wait for the reply, but no longer than ALIVE_TIMEOUT
        socks = dict(poll.poll(ALIVE_TIMEOUT))
        if socks.get(alive_socket) == zmq.POLLIN:
            reply = alive_socket.recv()
            print('the server is alive')
        else:
            print('consider the server dead')
            # we can't use this socket any more
            poll.unregister(alive_socket)
            # don't let the unsent request keep the process alive
            alive_socket.setsockopt(zmq.LINGER, 0)
            alive_socket.close()

    def main():
        parser = argparse.ArgumentParser(description='zeromq connection status checking demo')
        parser.add_argument('--server', action='store_true')
        parser.add_argument('--client', action='store_true')
        args = parser.parse_args()
        if args.server:
            server()
        elif args.client:
            client()
        else:
            parser.print_help()

    if __name__ == '__main__':
        main()


Please note that while closing the client socket on a reply timeout is not required for this example, it's necessary when you make multiple status checks: a REQ socket that is still waiting for a reply can't send another request, so close it and create a new one for the next check.
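For repeated checks, a small retry wrapper along these lines might help. It's a sketch under assumptions: is_available and its parameters are names made up here, and probe stands for any function that performs one check and returns True or False, for instance a variant of client() above that returns a boolean instead of printing and opens a fresh socket on every call.

```python
import time


def is_available(probe, attempts=3, delay=0.0):
    """Return True as soon as one probe succeeds, False after `attempts` failures.

    `probe` is any zero-argument callable returning True or False. It is
    expected to open a fresh socket on every call and close it on timeout.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if delay and attempt < attempts - 1:
            time.sleep(delay)  # brief pause before trying again
    return False


# Stand-in probe for the demo: fails twice, then succeeds.
answers = iter([False, False, True])
print(is_available(lambda: next(answers)))
```

Keeping the socket handling inside the probe means the wrapper never sees a socket stuck waiting for a reply.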


Category: Python


resizing TIFF files efficiently

Working with GeoTIFF leads sooner or later to files that have an uncompressed size bigger than the available disk space. This means that it's not possible to resize such a TIFF file with imagemagick (or graphicsmagick) because convert insists on using a temporary file for the raw image data. Sure enough, "converting" from TIFF to TIFF while resizing can be done on the fly, with a small memory buffer and that's exactly what gdal_translate (from the GDAL project) does. Here it is downsizing a huge file in a matter of seconds:

    gdal_translate -outsize 10% 10% -co COMPRESS=DEFLATE big.tif small.tif


Unfortunately the target size repetition is required, but other than that it's the perfect solution. It even preserves all the tags, unlike imagemagick.

later edit: if you intend to use the resulting file in ArcGIS 9 change the compression to 'LZW'.
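To get a feeling for how quickly the uncompressed size mentioned above outgrows a disk, here's a back-of-the-envelope calculation. The image dimensions are made up for illustration, and the helper ignores TIFF headers and tags and assumes sample sizes that are whole bytes.

```python
def uncompressed_size(width, height, bands=3, bits_per_sample=8):
    """Raw pixel data size in bytes, ignoring TIFF headers and tags."""
    return width * height * bands * (bits_per_sample // 8)


# Hypothetical 100000 x 100000 pixel RGB mosaic with 8 bits per sample.
full = uncompressed_size(100000, 100000)
# -outsize 10% 10% scales both axes, so the pixel count drops by a factor of 100.
small = uncompressed_size(10000, 10000)
print('%.1f GB -> %.1f GB' % (full / 1e9, small / 1e9))
```

The raw size grows with the pixel count regardless of how well the file compresses on disk, which is why a tool that insists on a temporary raw copy runs out of space.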


Category: Linux


CHDK support for IXUS 220HS 1.01G

If you have a Canon IXUS 220HS (ELPH 300HS) camera with the 1.01G firmware version, this is your lucky day: now you can run CHDK on it.


Category: CHDK


stabilizing skype with gdb

We all know that Skype is a terrible piece of software, with the Linux version buggy and lagging behind the main code base, but lately it's been crashing like the Top Gear funny men in an overpowered car. Each time it starts and there are chat windows with new messages to be shown, the damn thing segfaults (at least on 64 bit, statically linked against Qt). While trying to find more details about the problem, I stumbled upon an unlikely fix: running skype under gdb seems to stabilize it. Is it the delay introduced by the debugger interfering with some concurrency bugs or some other mechanism at work? No idea. All I know is that this is how I run the rubbish right now:

LD_LIBRARY_PATH="/opt/skype" gdb /opt/skype/skype

and then type 'r' in the debugger. It hasn't crashed once under gdb so there's no stack trace available.


Category: Linux


learned it the hard way

If you enable the SandyBridge New Acceleration (SNA) in the xf86-video-intel driver on a GM965 chipset with a dual-monitor setup, your GPU will hang, with the kernel DRM driver crying like a little girl and both screens going blank. While the rest of xorg is still working and your current session is accessible through x11vnc, you will need to reboot to restore the video output.


Category: Linux


redis is the new memcached

One of the first things you learn about optimizing web applications is how to cache processed data in memory. Just install memcached - the easy to use key-value store - and you're set. Until you reach its limits, that is...

Memcached does not allow keys larger than 250 bytes or values bigger than 1MB. What happens when you really need bigger keys/values? You move on to redis.
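If you want to know in advance which cache entries would be rejected, a check like the following could be run over a sample of your keys and values. The function and constant names are invented for this sketch, both arguments are expected to be already encoded as bytes, and the real server also counts some per-item overhead, so treat the value limit as approximate.

```python
MAX_KEY_BYTES = 250            # key limit quoted above
MAX_VALUE_BYTES = 1024 * 1024  # 1MB value limit quoted above


def fits_in_memcached(key, value):
    """Return True if the encoded key and value both fit the limits."""
    return len(key) <= MAX_KEY_BYTES and len(value) <= MAX_VALUE_BYTES


# A small entry fits without trouble.
print(fits_in_memcached(b'user:42:profile', b'x' * 1024))
# A 2MB rendered page does not.
print(fits_in_memcached(b'page:/reports/2011', b'x' * (2 * 1024 * 1024)))
```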

If you are using a proper framework (and you should) there's probably already a redis plugin for you (here's the Django one). Now all you have to do is install redis and configure it to suit your caching needs. As a rule of thumb, speed and low resource usage are more important than data persistence. Here's my configuration (only the relevant differences from the default redis.conf):

    daemonize yes
    bind 127.0.0.1
    loglevel notice
    logfile /var/log/redis/redis.log
    dir /var/lib/redis/
    maxmemory 67108864


With these settings redis runs as a daemon, listens only on localhost, keeps the logging to a minimum and uses sensible locations for the logs and database. I also set the maximum amount of memory it can use depending on the available RAM on the server. Adjust as needed.
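The maxmemory value in the configuration is a plain byte count: 67108864 is 64MB. A tiny helper (the name is made up for this sketch) produces the line for whatever size fits your server:

```python
def maxmemory_bytes(megabytes):
    """Convert a size in MB (1024 * 1024 bytes each) to a plain byte count."""
    return megabytes * 1024 * 1024


# Reproduces the setting used in the configuration above.
print('maxmemory %d' % maxmemory_bytes(64))
```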

The 2.4.1 version I'm using is quite fast, probably due to the jemalloc memory allocator, but if you need even more performance you can switch from TCP sockets to unix sockets. Pay attention to the socket file's default permission (755) and either run the daemon under the same user as the client(s), or change the init script to do a chmod after redis is started. Sometime in the near future you'll be able to set the permission from redis.conf with the unixsocketperm variable, and the unix socket will be the preferred way to set up local caching instead of an optimization.

Enjoy your new cache and remember that if you ever need a powerful NoSQL database, redis is there waiting for you.


Category: caching


Django redirects in reverse proxy setups

Starting with django-1.3.1 you'll get redirect failures for reverse proxy setups (cherokee and cherrypy in my case). The reason is that relative paths that you feed to HttpResponseRedirect are converted into absolute URLs using META['HTTP_HOST'] which is actually the IP of the reverse proxy. To fix this add the following line to settings.py:

    USE_X_FORWARDED_HOST = True

More details in the release notes.
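To see why the setting matters, here's a simplified model of how the host for an absolute redirect URL gets picked. This is not Django's actual code, just an illustration of the behavior described above; the function names and the sample addresses are made up, and it assumes the proxy sends the X-Forwarded-Host header.

```python
def get_host(meta, use_x_forwarded_host=False):
    """Pick the host name used to build absolute redirect URLs."""
    if use_x_forwarded_host and 'HTTP_X_FORWARDED_HOST' in meta:
        # Host originally requested by the browser, as reported by the proxy.
        return meta['HTTP_X_FORWARDED_HOST']
    return meta['HTTP_HOST']


def absolute_redirect(meta, path, use_x_forwarded_host=False):
    """Turn a relative path into the absolute URL sent to the browser."""
    return 'http://%s%s' % (get_host(meta, use_x_forwarded_host), path)


# Hypothetical request environment behind a reverse proxy.
meta = {'HTTP_HOST': '127.0.0.1:8000', 'HTTP_X_FORWARDED_HOST': 'www.example.com'}
print(absolute_redirect(meta, '/login/'))
print(absolute_redirect(meta, '/login/', use_x_forwarded_host=True))
```

Without the setting the browser gets redirected to the backend's address, which it usually can't reach. With it, the redirect points at the public host name.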


Category: Django


automating remote commands over SSH with paramiko

I don't know how high ranking python is for automating system administration tasks but when I had to script remote ssh commands, I thought I'd give python a try. The main SSH library is paramiko and, while lacking in the documentation department, it's rather rich in features.

Besides the usual root login, I had a more difficult use case: using a regular user with full sudo rights instead of the superuser. These are the functions I came up with:

    import getpass
    import sys

    import paramiko

    def ssh_connection(user, host, port=22, password=None, key_filename=None):
        """
        with password='' you will be prompted for a password when the script runs
        """
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        if password == '':
            # ask for one on stdin
            password = getpass.getpass('Password for %s@%s: ' % (user, host))
        ssh.connect(host, port=port, username=user, password=password, key_filename=key_filename)
        # custom attributes
        ssh.user = user
        if user == 'root':
            ssh.homedir = '/root'
        else:
            ssh.homedir = '/home/%s' % user
        ssh.password = password
        ssh.use_sudo = False
        return ssh

    def run_remote(ssh, cmd, check_exit_status=True, verbose=True):
        chan = ssh.get_transport().open_session()
        stdin = chan.makefile('wb')
        stdout = chan.makefile('r')
        stderr = chan.makefile_stderr('r')
        processed_cmd = cmd
        if ssh.use_sudo:
            processed_cmd = 'sudo -S bash -c "%s"' % cmd.replace('"', '\\"')
        chan.exec_command(processed_cmd)
        # If stdout is still open then sudo is asking us for a password
        if ssh.use_sudo and not stdout.channel.closed:
            stdin.write('%s\n' % ssh.password)
            stdin.flush()
        # note: waiting for the exit status before reading the output
        # can stall on commands that produce a very large output
        exit_status = chan.recv_exit_status()
        # always collect the output so it's available even with verbose=False
        result = {
            'stdout': stdout.readlines(),
            'stderr': stderr.readlines(),
            'exit_status': exit_status,
        }
        def print_output():
            for line in result['stdout'] + result['stderr']:
                sys.stdout.write(line)
        if check_exit_status and exit_status != 0:
            print_output()
            print('non-zero exit status (%d) when running "%s"' % (exit_status, cmd))
            sys.exit(exit_status)
        if verbose:
            print(processed_cmd)
            print_output()
        return result

Notice the cool multiline command string that works even with sudo. We need to pass multiple commands because paramiko opens a new session for each exec_command() invocation so the current directory is reset on subsequent calls. Let's see some examples:
    # set up an ssh connection for root and ask for the password interactively
    myconn = ssh_connection('root', 'example.com', password='')
    # run some commands with the default settings
    run_remote(myconn,
    """
    pwd
    cd /var/log
    pwd
    ls -la
    """)

    # set up a connection for a regular user with full sudo rights and use it for running commands with root privileges
    sudoconn = ssh_connection('jim', 'example.com', password='imanewbieandileavepasswordsintextfiles')
    sudoconn.use_sudo = True
    run_remote(sudoconn,
    """
    whoami
    pwd
    cd /var/log
    pwd
    ls -la
    """)

You can suppress the exit code checking and the verbosity if you want and handle that info from the returned dictionary any way it suits you.
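One thing to watch out for: run_remote() only escapes double quotes before handing the script to sudo, so a command containing $variables or backticks gets expanded by the outer shell first. If that bites you, a possible alternative is to let the standard library do the quoting. This is a sketch, not a drop-in replacement: sudo_wrap is a name invented here and shlex.quote needs Python 3.3 or newer.

```python
import shlex


def sudo_wrap(cmd):
    """Wrap a (possibly multiline) shell script so it runs under sudo."""
    # shlex.quote() single-quotes the script, so double quotes, $variables,
    # backticks and newlines all reach the inner bash unchanged.
    return 'sudo -S bash -c %s' % shlex.quote(cmd)


script = 'cd /var/log\necho "$PWD" `whoami`'
wrapped = sudo_wrap(script)
print(wrapped)
# A POSIX shell splits the line back into the original pieces:
assert shlex.split(wrapped) == ['sudo', '-S', 'bash', '-c', script]
```

The round trip through shlex.split() mimics what the remote shell does with the command line, so the inner bash receives the script exactly as it was written.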

Bottom line, it might not be easy to use paramiko's API directly but it's trivial to use my functions so go ahead and script those repetitive administration tasks. As always when doing stuff as root, try not to hose the server ;-)


Category: Python
