Justifying Text

In this post I briefly describe how good text justification is done using dynamic programming. In the future, I may post code for this problem.

Start by writing a ‘badness’ function that produces a value corresponding to how bad it is to have a certain set of words forming a line.

LaTeX uses the following formula for badness:

badness(i, j) = (page_width - total_width)³    if [i, j] fits
                ∞                              otherwise

where
  i is the index of the first word of the line
  j is the index of the last word of the line
  page_width is the width of the document page
  total_width is the width occupied by the letters of the words [i, j]

Then use dynamic programming to find the combination of line breaks that gives the minimum total badness, summed over all lines.
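To make the idea concrete, here is a minimal sketch of that dynamic program in Python. It is not LaTeX's actual implementation. It assumes widths are measured in characters and that words on a line are separated by a single space, and the names split_lines, cost, and last are made up for this illustration. It only chooses where the lines break; stretching the spaces to fill each line is left out.

```python
import math


def badness(words, i, j, page_width):
    """Badness of a line holding words[i..j] (inclusive), as defined above."""
    # Assumption: width is counted in characters, one space between words.
    total_width = sum(len(w) for w in words[i:j + 1]) + (j - i)
    if total_width > page_width:
        return math.inf
    return (page_width - total_width) ** 3


def split_lines(words, page_width):
    """Choose line breaks that minimise the sum of badness over all lines."""
    if any(len(w) > page_width for w in words):
        raise ValueError("a word is wider than the page")
    n = len(words)
    # cost[i] is the minimum total badness for laying out words[i:].
    cost = [math.inf] * n + [0]
    # last[i] is the index of the last word on the line starting at word i.
    last = list(range(n))
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            line_cost = badness(words, i, j, page_width)
            if line_cost == math.inf:
                break  # adding more words only makes the line wider
            total = line_cost + cost[j + 1]
            if total < cost[i]:
                cost[i], last[i] = total, j
    # Walk the recorded choices to rebuild the lines.
    lines, i = [], 0
    while i < n:
        j = last[i]
        lines.append(" ".join(words[i:j + 1]))
        i = j + 1
    return lines


print(split_lines("aaa bb cc ddddd".split(), 6))
# ['aaa', 'bb cc', 'ddddd']
```

A greedy approach would put "aaa bb" on the first line and leave "cc" alone on the second, for a total badness of 65. The dynamic program accepts a looser first line and ends up with a total of 29.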

Recovering Deleted Files

This post is specific to ext3 and ext4 partitions.

Yesterday I accidentally invoked bunzip2 without the --keep flag, which deleted the original compressed file. Getting that file again would have been complicated, so I used extundelete to recover it. This post gives an example of how to use extundelete.

First of all, I needed the partition to be mounted read-only, but it was the one running my operating system, so I had to reboot and start from a live image.

Once I had it running, getting extundelete was easy:

sudo dnf install extundelete

After that, I simply invoked:

sudo extundelete --recover-file <path-to-file> <device>

In my case,

sudo extundelete --recover-file mg/wiki/dump.bz2 /dev/sda7

Finally, I copied the recovered file (found in a RECOVERED_FILES directory) to its final destination with cp and was done with it.

Hopefully this will be helpful to you someday.

Infection

Infection is a sandbox contagion simulator I recently started working on. It is an entertaining game that anyone with a basic understanding of JavaScript should be able to hack.

A simple set of rules and values governs how an infection spreads on a square board of 400 tiles. In the future, different challenges should be made available.

It is hosted at GitHub and released under the MIT license.

You can find an online version of the project here.

Python: Use Generators to Save Memory

Even though this feature is not new at all, very few people seem to use generators as much as they should. Generators can help you write programs that need less memory than their list comprehension counterparts.

Let me use a trivial example to demonstrate the difference. Real-world examples may involve more complicated number crunching, but they will have a very similar structure and show the same kind of improvement.

Problem: determine the sum of n ** n for the first ten thousand non-negative integers n

Solution using list comprehension

sum([n ** n for n in range(10 ** 4)])

Solution using a generator

sum(n ** n for n in range(10 ** 4))

In this case, refactoring is very simple, but usually you would have the list comprehension result assigned to a variable and used later on. It’s up to the developer to spot scenarios where a list comprehension may be replaced by a generator.

The list, in its entirety, takes 84,134,440 bytes (about 80.24 MiB) on my machine, whilst the largest single value produced by the generator takes only 17.33 KiB. As you would expect, the results are identical.
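If you want to see the difference for yourself, the following sketch compares peak memory use of the two approaches with the standard library's tracemalloc module. The helper peak_memory and the constant COUNT are names I made up for this illustration, and COUNT is deliberately smaller than 10 ** 4 so that it runs quickly. The exact figures will depend on your Python version and machine.

```python
import tracemalloc


def peak_memory(compute):
    """Run compute() and return (result, peak bytes allocated meanwhile)."""
    tracemalloc.start()
    try:
        result = compute()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


COUNT = 2_000  # assumption: a smaller range than the post's 10 ** 4

# The list comprehension keeps every value alive until sum() finishes.
list_sum, list_peak = peak_memory(lambda: sum([n ** n for n in range(COUNT)]))
# The generator hands values to sum() one at a time.
gen_sum, gen_peak = peak_memory(lambda: sum(n ** n for n in range(COUNT)))

print(list_sum == gen_sum)  # both approaches compute the same total
print(list_peak, gen_peak)  # the list version should peak much higher
```

The generator version only ever holds the running total and the value currently being added, which is why its peak stays close to the size of the largest value.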

Determining the size of a list in memory

If you don’t know about it yet, there is a nice tool called Pympler that lets you measure how much memory a collection and its contents occupy. See the snippet below.

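from pympler.asizeof import asizeof

# Build the same list as in the list comprehension solution.
list_of_values = [n ** n for n in range(10 ** 4)]

print(asizeof(list_of_values))  # 84 134 440 bytes -> about 80.24 MiB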
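If you would rather not install anything, a rough estimate for a flat list can be put together with sys.getsizeof from the standard library. This is only a sketch under a few assumptions: the helper name flat_list_size is made up, it does not follow nested containers the way asizeof does, and its result will not match Pympler's figure exactly.

```python
import sys


def flat_list_size(values):
    """Approximate bytes used by a flat list plus the objects it holds."""
    # getsizeof(values) covers the list object and its pointer array only.
    container = sys.getsizeof(values)
    # Count each distinct object once, since a list may reference the
    # same object several times.
    unique = {id(v): v for v in values}
    return container + sum(sys.getsizeof(v) for v in unique.values())


values = [n ** n for n in range(1_000)]  # smaller than the post's 10 ** 4
print(flat_list_size(values))
```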

Photo: an Image Manipulation Library

This is an announcement of a Java image manipulation library I created.

It is released under the BSD 2-Clause license and can be found here. The main reason for creating it was that there was no open source Java implementation of Directional Cubic Convolution Interpolation (DCCI) out there.

According to Wikipedia,

An article from 2013 compared the four algorithms above, and found that DCCI had the best scores in PSNR and SSIM on a series of test images.

There is also a dedicated page for this library here.

A Maven artifact is on its way. If you have any suggestions or requests, send me an email or create a new issue.