Data Should Be Preserved

I believe that data should never be lost. Especially data that is costly to make and has sentimental value. Texts, drawings, photographs, songs, programs, and videos should always be preserved, even if you are not interested in them anymore. They may live in a compressed file inside that old 512 GB HDD you hardly ever mount, but you have no reason to erase them. In fact, I think you should store them more safely than that, as there are several reasons to preserve them. Time flies, and as you see yourself aging you may want to remember how life was when you were younger. Not just by one yellowed picture. But by what you created.

What you produce in your limited lifetime is your legacy. All of it. It tells about you. It describes you in different stages of your life and should be safely stored for remembering moments that - no matter how sad or how happy - made up your existence. This post is to highlight the importance of writing, drawing, photographing, composing, creating, and, ultimately, preserving.

Not content just for the sake of it, but your life, for the importance of it. Even if you don’t care, maybe your family and close relatives one day will. Maybe historians years from now will. Maybe another species someday will.

Lastly, I would ask you to keep as much as you find convenient of this data public. Share it, present it, distribute it, publish it, index it, so you can live for much more than your organic survival and let what you created during your lifetime contribute to the lives of others.

Boolean Expression from Truth Table

Producing a truth table from a boolean expression is a trivial task, as you just need to evaluate the expression for each combination of inputs.

However, the opposite is not so simple.

A way to convert a truth table into a boolean expression is to write, for each row whose output is true, an expression that is true only for that row's inputs, and then chain these expressions together with the OR operator. The result may later be simplified by using logical identities.

Note that this does not give you the simplest possible boolean expression: finding a minimal one is a computationally hard problem (minimizing a sum-of-products form is known to be NP-hard).

Example

Let us obtain a boolean expression from the following truth table:

A B C =
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1

Step 1 - Write an expression for each row that evaluates to true

¬A  AND  B  AND  C
 A  AND ¬B  AND  C
 A  AND  B  AND ¬C
 A  AND  B  AND  C

Step 2 - Join them all with the OR operator

¬A  AND  B  AND  C  OR
 A  AND ¬B  AND  C  OR
 A  AND  B  AND ¬C  OR
 A  AND  B  AND  C
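Steps 1 and 2 are mechanical, so they can be automated. The following is a minimal Java sketch, not a definitive implementation. The class and method names are made up for illustration, and it assumes the rows are given in counting order, as in the table above, with the first variable as the most significant bit.

```java
import java.util.ArrayList;
import java.util.List;

class TruthTableToExpression {

    // Builds one AND term per row whose output is true and joins the terms with OR.
    // Assumption: outputs[row] follows counting order, names[0] being the most significant bit.
    static String toExpression(String[] names, boolean[] outputs) {
        int n = names.length;
        if (outputs.length != 1 << n) {
            throw new IllegalArgumentException("expected " + (1 << n) + " rows");
        }
        List<String> terms = new ArrayList<>();
        for (int row = 0; row < outputs.length; row++) {
            if (!outputs[row]) {
                continue; // rows that evaluate to false contribute nothing
            }
            List<String> literals = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                boolean bit = ((row >> (n - 1 - i)) & 1) == 1;
                // "\u00AC" is the negation sign used in this post
                literals.add(bit ? names[i] : "\u00AC" + names[i]);
            }
            terms.add("(" + String.join(" AND ", literals) + ")");
        }
        // A table with no true rows is the constant FALSE
        return terms.isEmpty() ? "FALSE" : String.join(" OR ", terms);
    }

    public static void main(String[] args) {
        boolean[] outputs = {false, false, false, true, false, true, true, true};
        System.out.println(toExpression(new String[] {"A", "B", "C"}, outputs));
    }
}
```

For the table of this example it prints (¬A AND B AND C) OR (A AND ¬B AND C) OR (A AND B AND ¬C) OR (A AND B AND C). The parentheses are redundant under the usual precedence of AND over OR, but they make the terms easier to read.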

Step 3 - Simplify to the best of your ability

¬A  AND  B  AND  C  OR
 A  AND ¬B  AND  C  OR
 A  AND  B  AND ¬C  OR
 A  AND  B  AND  C

-----------------------

¬A  AND  B  AND  C  OR
¬B  AND  A  AND  C  OR
¬C  AND  A  AND  B  OR
 A  AND  B  AND  C

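-----------------------

Since X OR X = X, the last term can be repeated and paired with each of the other three:

(¬A OR A)  AND  B  AND  C  OR
(¬B OR B)  AND  A  AND  C  OR
(¬C OR C)  AND  A  AND  B

-----------------------

(A AND B) OR (A AND C) OR (B AND C)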

Your efficiency on step 3 will depend on several factors such as your persistence, previous practice, and pattern recognition skills.
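Whatever you end up with in step 3, it is cheap to check it against the unsimplified expression by trying every input. A small Java sketch of that check for this example (the class and method names are only illustrative):

```java
class SimplificationCheck {

    // The expression obtained in step 2, before any simplification.
    static boolean original(boolean a, boolean b, boolean c) {
        return (!a && b && c) || (a && !b && c) || (a && b && !c) || (a && b && c);
    }

    // The expression obtained at the end of step 3.
    static boolean simplified(boolean a, boolean b, boolean c) {
        return (a && b) || (a && c) || (b && c);
    }

    // Returns true when both expressions agree on all eight rows of the table.
    static boolean equivalent() {
        for (int row = 0; row < 8; row++) {
            boolean a = (row & 4) != 0;
            boolean b = (row & 2) != 0;
            boolean c = (row & 1) != 0;
            if (original(a, b, c) != simplified(a, b, c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("equivalent: " + equivalent());
    }
}
```

This prints equivalent: true. Exhaustive checking only scales to a small number of variables, since the number of rows doubles with each one.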

Optimizing Java Regular Expressions

I have been working with regular expressions a lot lately and thought I should share some ideas on how to make regular expressions faster in Java. Most of these ideas are quite simple yet may give you very good improvements in practice.

Low-hanging fruit

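Compile patterns you use frequently once with Pattern.compile and reuse the resulting Pattern object. Convenience methods such as String.matches compile the expression again on every call.

Use small capture groups to prevent long backtracking.

Whenever possible, use non-capturing groups: (?:P) instead of (P).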
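As a rough illustration of the first and last tips, here is a sketch with a pattern that is compiled once and uses a non-capturing group for the alternation. The class name, the pattern, and the log line format are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompiledPatternExample {

    // Compiled once and reused. Pattern objects are immutable and safe to share.
    // (?:WARN|ERROR) groups the alternation without storing a capture,
    // so the only capturing group is the word that follows.
    private static final Pattern LOG_LEVEL = Pattern.compile("(?:WARN|ERROR): (\\w+)");

    // Returns the first word after "WARN: " or "ERROR: ", or null if the line has neither.
    static String firstWord(String line) {
        Matcher matcher = LOG_LEVEL.matcher(line);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstWord("ERROR: disk full"));
        System.out.println(firstWord("INFO: all good"));
    }
}
```

The two calls print disk and null. A Matcher, unlike a Pattern, holds state, so a new one is created for each input here.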

Reluctant, greedy, and possessive quantifiers

Reluctant quantifiers often run faster than greedy quantifiers. I am talking about +? and *? being faster than + and *. When the engine finds a greedy quantifier, it first consumes as much input as it can and then backtracks, giving characters back one at a time until the rest of the pattern matches. In big strings this is usually very slow if you are trying to match a small substring (which is often the case), so going for reluctant quantifiers - that “grow” the match - instead of greedy ones - that “shrink” the substring they are trying to match - is very likely to improve performance. Keep in mind that the two kinds may match different text, so a reluctant quantifier is not always a drop-in replacement.
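The difference in how the two kinds search is easy to see on a tiny input. The sketch below (illustrative names, and a made-up input string) returns the first match of a pattern, so the greedy and reluctant versions can be compared side by side. It says nothing about timing, which you should measure on your own data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVersusReluctant {

    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        // Greedy: .+ runs to the end of the input, then backs off until a '>' fits.
        System.out.println(firstMatch("<.+>", input));
        // Reluctant: .+? takes one character at a time and stops at the first '>'.
        System.out.println(firstMatch("<.+?>", input));
    }
}
```

The greedy pattern prints <a><b> and the reluctant one prints <a>. When both would find the same match, the reluctant version can get there with less backtracking if the match is short compared to the input.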

Possessive quantifiers improve regular expression performance a lot. Therefore, you should use them whenever you can. P?+, P*+, and P++ will match the pattern as any greedy quantifier would, but will never backtrack once the pattern has been matched, even if refusing to do so means that the whole expression will not match.
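The "will never backtrack" part deserves a small demonstration, because it changes what matches and not only how fast. A minimal sketch, with an illustrative class name and deliberately tiny patterns:

```java
import java.util.regex.Pattern;

class PossessiveExample {

    // Returns whether the whole input matches the regular expression.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy: a* takes "aaa", then gives one 'a' back so the final 'a' can match.
        System.out.println(matchesWhole("a*a", "aaa"));
        // Possessive: a*+ takes "aaa" and keeps it, so the final 'a' has nothing left.
        System.out.println(matchesWhole("a*+a", "aaa"));
        // Possessive is safe when what follows cannot be consumed by the quantifier.
        System.out.println(matchesWhole("a*+b", "aaab"));
    }
}
```

This prints true, false, and true. The third case is where possessive quantifiers fit best: what follows the quantifier cannot overlap with what it consumes, so giving up backtracking costs nothing.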

NetHogs

This is a very short post to show one of my favorite system monitoring tools: NetHogs. If you are on a Linux box, it should be fairly easy to get it no matter which Linux distribution you use.

Once you have it, just run it as root and you will have a minimalistic, readable, and very useful list of which processes are consuming your bandwidth and how much of it each process is using.

You can also run it with the -v1 flag to get the sum of KB instead of the KB/s rate of each process.

License Notes Should Have a Date

While working on the license notes of dungeon, I noticed that it is now 2016, which means that every file I change from now on requires its license note to be updated.

I wondered if I actually needed that, so I went to Programmers Stack Exchange and asked about it. I got quite helpful answers and comments within a few hours. Essentially, it boils down to

  1. Copyright expires, therefore you must provide a license date.
  2. The format Copyright (C) [years] [name] should not be changed.

Therefore, I must set up Git hooks that will update the license note of each file with the list of years in which that file has been modified.
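The core of such a hook is a small text transformation. The sketch below is only an assumption about how it could look, not the solution mentioned in this post. It assumes the years are written as a comma-separated list, as in Copyright (C) 2014, 2015 Name, and the class name, method name, and author name are hypothetical.

```java
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LicenseYearUpdater {

    // Assumed format: "Copyright (C) 2014, 2015 Name". Group 1 is the list of years.
    private static final Pattern NOTICE =
            Pattern.compile("Copyright \\(C\\) (\\d{4}(?:, \\d{4})*) (.+)");

    // Returns the line with the given year added to its list of years.
    // Lines without a notice, or that already list the year, come back unchanged.
    static String addYear(String line, int year) {
        Matcher matcher = NOTICE.matcher(line);
        if (!matcher.find()) {
            return line;
        }
        TreeSet<Integer> years = new TreeSet<>(); // sorted, no duplicates
        for (String existing : matcher.group(1).split(", ")) {
            years.add(Integer.parseInt(existing));
        }
        years.add(year);
        StringBuilder list = new StringBuilder();
        for (int y : years) {
            if (list.length() > 0) {
                list.append(", ");
            }
            list.append(y);
        }
        // Replace only the year list, keeping the rest of the line as it was.
        return line.substring(0, matcher.start(1)) + list + line.substring(matcher.end(1));
    }

    public static void main(String[] args) {
        System.out.println(addYear(" * Copyright (C) 2014, 2015 Jane Doe", 2016));
    }
}
```

This prints " * Copyright (C) 2014, 2015, 2016 Jane Doe" (without the quotes). A hook would still need to find the changed files, apply this to the notice line of each one, and stage the result.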

The solution will likely be open-sourced so it benefits a bigger group of people.