Saturday, November 27, 2010

Testing raidpycovery through mdadm


I'm still working on polishing raidpycovery. I started doing real experiments today to recover data from broken RAID5s. To do it, I used Linux's md, which also let me study md a little while testing raidpycovery.

So, let's get our hands dirty.

First, let's create a directory where we will work, for example:

$ mkdir raid5
$ cd raid5

Now let's create the separate images where we will build our RAID5. I will use 4 disks of 10 MB each:

$ for i in 0 1 2 3; do dd if=/dev/zero of=disk$i bs=1M count=10; done
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0584047 s, 180 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.100258 s, 105 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0691083 s, 152 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0508324 s, 206 MB/s

Now we have four empty files that we will feed into md to create the RAID for our tests.

First, we will "loop" them so that we can use them with md (I don't know if this is really needed, but I'll do it just in case). As I'm working on a live USB, I'll have to start from /dev/loop2 onward. Check the loop devices in use with losetup -a (as root... or with sudo):

$ sudo losetup -a
/dev/loop0: [0811]:31 (/cdrom/casper/filesystem.squashfs)
/dev/loop1: [0811]:38 (/casper-rw-backing/casper-rw)

Now, I loop the files:
$ sudo losetup /dev/loop2 disk0
$ sudo losetup /dev/loop3 disk1
$ sudo losetup /dev/loop4 disk2
$ sudo losetup /dev/loop5 disk3
$ sudo losetup -a
/dev/loop0: [0811]:31 (/cdrom/casper/filesystem.squashfs)
/dev/loop1: [0811]:38 (/casper-rw-backing/casper-rw)
/dev/loop2: [000f]:27898 (/home/ubuntu/raid5/raidpycovery/bin/raid5/disk0)
/dev/loop3: [000f]:27899 (/home/ubuntu/raid5/raidpycovery/bin/raid5/disk1)
/dev/loop4: [000f]:27900 (/home/ubuntu/raid5/raidpycovery/bin/raid5/disk2)
/dev/loop5: [000f]:27901 (/home/ubuntu/raid5/raidpycovery/bin/raid5/disk3)

Great. Now we can use the loop devices to create our RAID device:

$ sudo mdadm --create /dev/md0 -l 5 -p ls -n 4 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5
mdadm: array /dev/md0 started.

If you read the man page of mdadm you will see that by default the layout of the RAID will be left-symmetric ("left sync", in raidpycovery's terms) with a default stripe/chunk size of 64 KB. We will need that data later on.
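To make the layout concrete, here is a small Python sketch (my own illustration, not code from raidpycovery) of how chunks map to disks under the left-symmetric layout:

```python
def left_symmetric(stripe, ndisks):
    """For one stripe of an n-disk RAID5 in the left-symmetric layout
    (mdadm's default), return the parity disk and the disks holding
    the data chunks, in logical order."""
    parity = (ndisks - 1) - (stripe % ndisks)  # parity rotates leftward
    # data chunks start on the disk right after the parity and wrap around
    data = [(parity + 1 + d) % ndisks for d in range(ndisks - 1)]
    return parity, data

# With 4 disks, parity lands on disk 3, 2, 1, 0 for stripes 0..3:
for s in range(4):
    print(s, left_symmetric(s, 4))
```

The "symmetric" part is that data chunks continue wrapping around from the parity disk, which is what distinguishes it from left-asymmetric; that distinction is exactly what bites me further down in this post.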

Now, let's format it so we can use it as a normal partition:

$ sudo mkfs.ext3 /dev/md0
mke2fs 1.41.11 (14-Mar-2010)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=64 blocks, Stripe width=192 blocks
7648 inodes, 30528 blocks
1526 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=31457280
4 block groups
8192 blocks per group, 8192 fragments per group
1912 inodes per group
Superblock backups stored on blocks:
8193, 24577

Writing inode tables: done
Creating journal (1024 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

Now, we can mount our just-formatted RAID device:
$ sudo mount /dev/md0 /mnt/tmp
$ sudo mount
/dev/md0 on /mnt/tmp type ext3 (rw)
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/md0 29557 1400 26631 5% /mnt/tmp

There you go. A new partition with around 25 MB available for mortal users. Now, let's copy some files into that partition:

$ sudo cp blah blah blah /mnt/tmp/

I'm using sudo to do the copy because right now that directory is owned by root.

After copying the files I wanted, let's check their MD5s:

$ md5sum /mnt/tmp/*
a27ebcacc64644dba00936abc758486e /mnt/tmp/IMSLP32718-PMLP01458-Beethoven_Sonaten_Piano_Band1_Peters_9452_14_Op27_No2_1200dpi.pdf
e68fabdcda296ef4a76d834a11a6f1df /mnt/tmp/IMSLP44764-PMLP48640-Mahler-Sym9.TimpPerc.pdf
md5sum: /mnt/tmp/lost+found: Permission denied
d7bfe06473430aad5ca0025598111556 /mnt/tmp/putty.log
670536c55ae9c77b04c85f98459c0cd8 /mnt/tmp/Resume Edmundo Carmona.pdf
8727e8ff88739feca15eb82b4d9cb09b /mnt/tmp/Titulo Ingenieria.png

Now, let's unmount our RAID and stop it:

$ sudo umount /mnt/tmp
$ sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0

Great... now let's try to rebuild the RAID with the raidpycovery tools. I don't have the tools in the same directory, so I'll have to move and use relative names for the disks; keep that in mind:

$ ./Raid5Recovery.py 4 left async 65536 raid5/disk0 raid5/disk1 raid5/disk2 raid5/disk3 > wholedisk
Number of disks: 4
Algorithm: Left Asynchronous
Chunk size: 65536 bytes
Skip 0 bytes from the begining of the files
Finished! Output size: 31457280 bytes

Now, let's mount it and see if we can get any data from the recovered RAID:

$ sudo mount -o loop,ro wholedisk /mnt/tmp
mount: unknown filesystem type 'linux_raid_member'

Not good. From my tests, that's because mount sees md metadata left over at the end of the disks. I'll force the filesystem type then:

$ sudo mount -t ext3 -o loop,ro wholedisk /mnt/tmp
mount: wrong fs type, bad option, bad superblock on /dev/loop6,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

Gotcha! Did you notice that I reassembled the RAID using left async? It has to be left sync, remember? Let's try again:

$ ./Raid5Recovery.py 4 left sync 65536 raid5/disk0 raid5/disk1 raid5/disk2 raid5/disk3 > wholedisk
Number of disks: 4
Algorithm: Left Synchronous
Chunk size: 65536 bytes
Skip 0 bytes from the begining of the files
Finished! Output size: 31457280 bytes
$ sudo mount -t ext3 -o loop,ro wholedisk /mnt/tmp

No complaints. Great! Now let's see if we can read the data in the recovered RAID:

$ md5sum /mnt/tmp/*
a27ebcacc64644dba00936abc758486e /mnt/tmp/IMSLP32718-PMLP01458-Beethoven_Sonaten_Piano_Band1_Peters_9452_14_Op27_No2_1200dpi.pdf
e68fabdcda296ef4a76d834a11a6f1df /mnt/tmp/IMSLP44764-PMLP48640-Mahler-Sym9.TimpPerc.pdf
md5sum: /mnt/tmp/lost+found: Permission denied
d7bfe06473430aad5ca0025598111556 /mnt/tmp/putty.log
670536c55ae9c77b04c85f98459c0cd8 /mnt/tmp/Resume Edmundo Carmona.pdf
8727e8ff88739feca15eb82b4d9cb09b /mnt/tmp/Titulo Ingenieria.png

And there you are. Everything is right there. Using this method today I discovered that I was not rebuilding chunks from missing images correctly and that one of the right-handed algorithms had to be corrected. Let's see what I find.

As homework, try running the recovery script with a missing disk (write "none" for it) and see if it works.
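For the homework, it helps to know why a missing disk is recoverable at all: RAID5 parity is a plain XOR across each stripe, so any single missing chunk can be rebuilt by XORing the surviving ones. A minimal sketch of the idea (my own illustration, not raidpycovery's actual code):

```python
def rebuild_missing_chunk(chunks):
    """Given all chunks of one stripe (data chunks plus the parity chunk)
    with exactly one of them None, XOR the rest to rebuild it."""
    present = [c for c in chunks if c is not None]
    assert len(present) == len(chunks) - 1, "exactly one chunk may be missing"
    out = bytearray(len(present[0]))
    for chunk in present:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)

# tiny demo: parity = d0 ^ d1 ^ d2, then "lose" d2 and rebuild it
d0, d1, d2 = b'\x01\x0f', b'\x02\xf0', b'\x04\x55'
parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))
assert rebuild_missing_chunk([d0, d1, None, parity]) == d2
```

The same XOR works whether the missing chunk was data or parity, which is why a RAID5 survives exactly one dead disk.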

Hope you liked the read or found it useful.

Update: I have corrected the right-algorithm issue. Make sure you get the latest stable release if you are gonna use it:

bzr branch -r tag:2.00.02 lp:~eantoranz/+junk/raidpycovery

Sunday, November 21, 2010

Broken RAID5, you said? Don't use Java anymore. Go with Python instead!

So, you have a broken RAID5, found the article I wrote about it for Free Software Magazine 5 years ago, and would like to give it a shot.

First off: Good luck! You are gonna need it.

Second: When I solved that problem I didn't have any experience in Python whatsoever, plus Oracle hadn't bought Sun, sued Google, etc.

During the course of these 5 years I have received some requests to help people out with their situation, and it's been a very nice experience to hack on those devices here and there to get the data back. I have received a little money by PayPal and even once got a box of wine bottles from Italy (thanks, Marco!) as payment.

Every single time I've had to help someone with it, I always get the question of how to run the .java classes I made. I even took on the task of compiling the .java files into .class files and then sending them to the person in need.

Given the gripes over Oracle/Java recently, I have decided to translate the library to Python so people who want to use it don't have to go around learning how to compile/run things in Java (at least in GNU/Linux).

The project is here:


You can get it very easily by using bazaar:

bzr branch -r tag:2.00.00 lp:~eantoranz/+junk/raidpycovery

Or you can just download the files from that same version and use them:
Let's not forget the readme where you get to see how to use it:

There's more stuff in the project (some chunks from the RAID I recovered at the hospital, mostly... but you never know!) so I'd suggest using bazaar, but....

Warning/disclaimer/whatever: I haven't tested it in a real environment so there might be some glitches (but I mostly did a straight translation, so it should be correct overall).

In case you want to get the original java code:

bzr branch -r tag:1.00.00 lp:~eantoranz/+junk/raidpycovery

Or start browsing here to get the files.

PS Sorry, Larry.... it just had to be done. A little weight off my jumping-off-Java shoulders. :-)

PS2: Donations, donations! You found the project useful? How about giving a little contribution? US$ 0.10 per GB recovered? Sounds fair to me. Especially considering how fast HD sizes are growing compared to inflation.

Monday, November 15, 2010

My thoughts on the switch to wayland

Like anybody cares what I think, right? Anyway... I made a comment on one of those Wayland-related news items at LinuxToday and Carla Schroder took the time to ask me:

"...why all that extra complexity to go back to where we were in the first place?"

That's a fair question. By the way, my love to you, Carla. You haven't sent me the comment about Pythogoras that I asked you for, but I still love you and care for you. :-D

Anyways... the thing is this:

Wayland is going to replace X in Ubuntu and Fedora. That's quite a remarkable statement to make. We are talking about X, the same X that has been there since I started using GNU/Linux about 9 years ago, and for many years before that cosmic event, if you will.

But almost all the GUI applications for UNIX today use X, so that means it's going to be a really troublesome change, isn't it? Also, there are really cool and extremely useful features in X like network transparency (think of export DISPLAY=blablah:0.0 or ssh -X, people) that would disappear from the face of the earth (I'm wondering what I would show my Windows-loving friends when demonstrating the wonders of GNU/Linux if I don't have network transparency. I'll have to think about that).

As I was saying, I'll (try to) tackle those two questions.

First off, this means a major reworking to get all applications to render on Wayland, doesn't it? I think it can be solved by hacking the lower-layer APIs like Qt or GTK, or even at a lower level by wrapping the Wayland client API around the X API (update: I think it's vice versa... but you get the concept, don't you?). Then not much work would have to be done on the higher layers (or so the fairy-tale theories of software development say). That means applications won't be hurt that much. I'm not implying it's going to be easy, but not much work would be required from us mere mortals in that case. It also means the switch to Wayland could be done overnight.

Then there are things built inside X that would be gone once Wayland makes its debut, right (to be read as "network transparency")? What I've read around (won't provide any links, but it's very logical) is that X could still use Wayland as just another graphics interface; then you would have X running on top of Wayland and (tadaaaaaaaaaaaaaaaaa) you would get network transparency back faster than you can say "mi moto alpina derrapante" (that's a nice joke in Spanish... people who are learning Spanish should give me a call so I can tell you about it and we can have a laugh).

But then Carla makes the final killing point: Is all this hassle really worth it?

In all honesty, I don't know. It's still too early in the game to make up my mind whether it will be worth it or not. But if X's complexity/overhead can be cut down to get a faster/lighter/snappier/sexier/whatever-you-consider-important-er environment to spend your time in, then it could be worth it in the end... especially if (as I said) it won't require much work outside the lower-layer APIs.

Now, I don't know if switching to Wayland will really be worth it, but was reading this article worth it after all? I truly hope so.

Wednesday, November 10, 2010

Why don't manufacturers get together and get Microsoft out of the loop?

Here comes a crazy thought:

Why don't computer manufacturers get in cahoots and knock Microsoft out of the demi-god loop? I mean, computer manufacturers are struggling because their profit on every sale is very small... in the meantime, Microsoft is laughing its ass off while making millions upon millions on their sales of Windows 7 (or so they say).

If any of the manufacturers dared to step out of the Microsoft way (you know, let's push Linux a bit harder, let's not say we recommend Windows 7, etc., etc.), Microsoft would just need to raise the price of Windows 7 licenses a little to force the manufacturer into red numbers (if it isn't there already, of course) and teach the bully a lesson, right? We have all seen the Halloween Documents and know of Microsoft's business practices, so no surprises there. Microsoft would still have all the other manufacturers lined up with lower prices to pay for Windows 7; the rebel manufacturer would have to raise its prices a little because of the increased price it now pays for Windows 7, making it harder to compete against the other manufacturers; and Microsoft gets to keep the upper hand in the end. Sounds like a subtle kind of trust, doesn't it? By the way, am I the only one who has the phrase "divide and conquer" pounding in his head?

But what would happen if the biggest manufacturers got together (say, in a secret deal) and fought Microsoft all at the same time? Is that possible? Is that legal? Would it be enough to get Microsoft out of its ways and make it really compete instead of being shoved down everybody's throat? If it is legal (I honestly don't know), then why don't they do it, instead of letting Microsoft laugh (very loudly, by the way) at them (and the rest of us)?

Now... where did that smart-ass laugh I'm hearing in surround sound come from? Seattle?

Saturday, October 2, 2010

Introducing Pythogoras.... or what is musical tuning?

Music is everywhere.

A major part of culture is driven by it. Dancing without music? Movies without music? Going to a restaurant and having no background music? You get the idea, right? It's, just like John Milton said in "The Devil's Advocate", EVERYWHERE!

In terms of western music (and that's what I intend to talk about in this article, keep that in mind), a loooot of music theory has been created and published over many centuries. Yet much of it is based on experience.

And there's a major part of it where I believe a lot of research is yet to be done: tuning. Have you ever heard someone play an instrument that's "out of tune"? But then, this begs the question: what is "tuning"? Well... the answer is anything but simple. Go take a look at Wikipedia to get an idea.

The fact is that for us humans, or rather, our human ears, being in tune means using the simplest ratios possible for notes/chords. But that is in direct conflict with how instruments are tuned... especially keyboards..... let me explain myself a little better:

I play the flute, and it's very easy to assume that whenever you play, say, an A, you will always get the same frequency that you used to tune the flute (be it 440, 441, whatever)... but it's not like that. The tuning of the flute can be adjusted a little "on the fly" by readjusting the angle of the air stream coming out of your mouth towards the flute, or by _turning_ the flute either forward or backward (backward makes the pitch a little lower). Even more, the flute will play forte sounds sharper than piano ones, and, just in case that is not enough, higher notes will also run sharper... in other words, a flute player is _always_ tuning his instrument according to what's being played. Most instruments, like the flute, can be tuned on the fly: violins, violas, cellos, basses, oboes, clarinets, etc., etc. Most of the instruments in the orchestra can do that... the most notable exception: the piano. It's tuned by a "tuner" and will remain like that for years to come, with no on-the-fly adjustments.

The thing is that there are many ways to tune notes; in other words, many tuning systems are available, and each has advantages and disadvantages. Normally there are trade-offs between the "tuning experience" and the practicality of tuning the instrument to different tonalities.

I believe a lot of research has to be done on the way we perceive being in tune, but where to start? Well, I'm a programmer (among other things), so I developed an application to _try_ to play music with different tuning systems (I'm sorry for the long introduction).

It's still very rough but I hope I can make it better in the following months (sponsoring is welcome, by the way).

So, instead of describing what it can or can't do, I want you to listen to three files I produced from the application. Here you have the Air on the G String by J. S. Bach played in three different tuning systems:
- Just
- Pythagorean
- Equal Temperament
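To give a feel for how small (yet audible) the differences between these systems are, take the E a perfect fifth above A = 440 Hz. In just and Pythagorean tuning a fifth is the simple 3:2 ratio; in equal temperament it is 7 semitones of 2^(1/12) each. This is my own worked example, not output from Pythogoras:

```python
A4 = 440.0  # reference pitch in Hz

# E a perfect fifth above A4, in two tuning systems:
fifth_just = A4 * 3 / 2            # pure 3:2 ratio -> 660.0 Hz
fifth_equal = A4 * 2 ** (7 / 12)   # 7 equal-tempered semitones

print(fifth_just, round(fifth_equal, 2))  # the two differ by less than 1 Hz
```

Less than 1 Hz apart, yet when both notes sound together that difference produces audible beating, which is a big part of what "out of tune" means to the ear.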

Files are available both in MP3 format and OGG (which I prefer). If you'd like to get more info, don't hesitate to email me.

Oh, and by the way, the application is called "Pythogoras" and can be found here. It's released under the terms of the Affero GPLv3... and then again, keep in mind it's still veeeery rough.

Tuesday, August 24, 2010

Who dares using Windows in a critical environment? (updated)

I just can't believe what my eyes just saw. You are telling me that a commercial plane (a real plane, the kind that takes off at hundreds of kilometers per hour carrying hundreds of people) had Windows running on one of its central processing nodes? You gotta be kidding me! Who dares do something like that? Come on, people! We're talking about a life-or-death situation here, not the normal pop-up that at its worst will nag me to hell asking me to date someone on the other side of the planet.

This raises some questions: Will there be consequences for the people involved in the decision to put Windows on board? Maybe consequences for Microsoft? Will the guys who developed the malware be prosecuted for (unintended?) murder? Nice things to talk about in the following months, I guess.

Absolutely terrible.

Update: From what I've read about it, Windows is not used on the plane itself but on computers used to get information from the plane. Unfortunately, it is still among the causes of the accident, which cost several lives. Why was Windows used in the first place?

Saturday, August 21, 2010

So the patent mess started from Java and not .NET... what an irony!

You know how many times I complained about .NET for being backed by Microsoft, and how I distrusted it even though they had made pledges (under certain conditions) not to sue and had made portions of it into an ECMA standard?

Not that I had put all my development faith into Java anyway; I've worked with PHP and Python during the last couple of years. But it's certainly an irony to see the patent mess come from Java instead.

Will this mean there will be less development in Java? Will this push development to other languages like Python? Perhaps Go? I'm just wondering.

I guess Oracle will lose a lot of FLOSS support after this. Will we start seeing people from Oracle jump ship the way some (like Samba's Jeremy Allison) did when Novell signed their agreement with Microsoft back in 2006?

Saturday, August 14, 2010

I finally know when ARM netbooks will come out


Before I tell you when they will come out, let me tell you how I figured it out.

I'm used to having to face disappointments.... disappointment after disappointment. I want to have something? I just can't go out there seeking it. No, no, no... that'd be the wrong approach. It'd never happen. I have to move slowly towards my goal. As one of my teachers in computer engineering told me when talking about the career: it's an endurance race, not a sprint. Well, that's the approach I have to take with most things in life. And even then, when things happen the way I want, I worry that Murphy will somehow learn how well things are going for me and mess them up.

So, after waiting for over a year since I first heard that ARM-based netbooks would come out, I'm still waiting... and I'm fed up with waiting already, so I'm ready to give up on them and buy an Atom-based one instead. Disappointment, see?

And so, when will they come out? A couple of days after I buy my netbook.... as it was supposed to be since day one. I'd better hurry and buy it so that they come out, because otherwise they never will.

Saturday, July 31, 2010

RMS is liberal, not a libertarian


I was reading the latest responses by RMS to 25 very important questions he was asked recently. It's been linked from the two IT news sites I read most often (OSNews, LinuxToday), so it was all but unavoidable.

While reading, I found a rather interesting statement by RMS: "That seems to describe the viewpoint called 'laissez-faire' or 'Libertarian'. Where business is concerned, I disagree with it very thoroughly, because I'm a Liberal, not a Libertarian."

I'm Venezuelan, you see? And the things that are going on in Venezuela are despicable. I just can't understand how so many people can still follow such a bad government after 11 years, just because 20 years ago a guy showed up on TV in a 30-second clip to say he was responsible for a failed coup d'état (whereas all the other cells of the movement succeeded in getting their tasks done... in other words, he was the only one who actually failed!). I don't support coups d'état, let's make that clear.

Venezuela's government has become a theocracy lately. Who's the god? Chávez? Not at all. The god in Venezuela is called 'Socialism' (old-fashioned, very far-left socialism... the kind where the leader has to be protected from criticism of all sorts, and so on). The government is certain that socialism is going to solve each and every problem faced by Venezuelans. But when will that happen? There's no such certainty on that point. It's such a shame to see these high-level Venezuelan politicians say that socialism is the only way to solve this or that problem... it's offensive, really.

Sorry for the digression; going back to RMS: in Venezuela the government has pushed for FLOSS (and I'm thankful for that). And there are people (government followers) who believe that FLOSS is somehow related to 'Socialism'. I bet they'd go as far as to present RMS as one of today's socialist heroes and the Free Software Movement as the IT equivalent of the Khmer Rouge. I'd also bet more than one guy out there is more than willing to paint RMS together with Fidel, Che and Marx (well, they have the guts to put Simón Bolívar and Jesus Christ in that crowd having the last supper, so no surprises), or RMS holding a rifle in his hands (to defend the Revolution), just like it's been done with Jesus Christ on walls in Caracas. I just can't see how something with no central control (as FLOSS is) can be related to Socialism (far-left, as I said). Perhaps the ways of the community associated with FLOSS could be related to the ideal socialist community's behaviour, but Free Software per se? So it's always reassuring when RMS defines himself as a liberal. No more doubts about that.

Perhaps the Free Software development model could serve as an example of an almost purely liberal environment and be used in scientific/social research. I think some studies have already been done, right?

PS I don't have anything against socialism per se. It's a political philosophy and there's nothing wrong (if you ask me) with applying some of its principles in public policy (the ones that have been tried and have had good results in different places). But when it's used as the excuse to apply bad public policies that have been tried in the past (and failed miserably... both then and in their current instance), that won't leave a good trail for it in the future.

Tuesday, July 20, 2010

How to (easily) fool your host into thinking a name is mapped to a certain IP address


Very recently I've been involved in setting up a site that uses the Google Maps API. When you want to use the API you have to create a key that is generated according to the name of the site, but, as of now, the DNS records of the name we want to use are not pointing to our hosting. Couldn't I use the IP address of the host where the site is? Well, no. It's a shared hosting, so requests have to be made by name.

In this case I have to fool my host into thinking that the name we want to use is mapped to an IP without going through public DNS resolution. I could set up a DNS service just to serve this name and forward everything else, but that sounds like overkill, doesn't it?

In GNU/Linux (and I'd dare say in any POSIX-compliant OS) there's a simpler way to do this: the /etc/hosts file.

In this file we can map names to IP addresses at will. So the thing I want to do can be done by adding a single line to my /etc/hosts (not a real example... the IP address and name are bogus): mydomain.com

Then I could go to http://mydomain.com and I'd be able to see the site that uses Google Maps, even if the name is not "officially" public.

Notice this trick (surprise, surprise) also works in Windows (which is POSIX compliant, right?); the only difference is that the file to edit is System32\drivers\etc\hosts.

Additional information:
man hosts
man nsswitch.conf

Friday, July 16, 2010

Is moving to IPv6 all that important, really?

I'm reading this article where we are told how important it is to move to IPv6 (and I'm not saying it isn't), but I can't help but wonder... will enterprises (for example) really have to move to IPv6? Suppose you work in a not-so-large enterprise where basically the only internet need is reaching the internet as a client (which I guess covers the needs of a large portion of internet users). Does it make sense to move the whole infrastructure to IPv6? I don't think so. With the proxy/NAT server set up with IPv6 to gain internet access, the inner networking of said enterprise could remain on IPv4, couldn't it?

Thursday, July 15, 2010

Well over half of the most reliable hosting companies run on Linux


I like to follow statistics on the market share of browsers, OSes, web servers and so on. They have to be taken with a big grain of salt, for sure, but they do give us a more or less accurate look at trends (or so I think).

Some of the most interesting statistics I've been following are the ones presented by Netcraft regarding web server market share and the OS used by hosting companies. The web server market share (Apache vs. IIS and so on) is the one you'll see talked about more often on the web. But the one about reliability is rarely (if ever) talked about... so I'll take a couple of lines to discuss the latest statistics, which correspond to June 2010 and put it like this:

- Over two thirds (29 out of 42) of the most reliable hosting companies use Linux (would they use GNU along with it?)
- 14.2% use BSD (FreeBSD to be more precise)
- A little less than 10% use Windows
- 3 out of 42 are a big question mark

How about that? You think the numbers are accurate... or are they skewed?

PS Did you see the uptime chart for microsoft.com? It's kind of shameful, you know? No wonder it's used by less than 10% of the most reliable hosting providers. :-)

Sunday, June 27, 2010

Definition of XOR for multiple variables.... or my little contribution to theory

I was remembering one of my computer engineering classes from when I was just starting at the university: the definition of the XOR operator.

One of the ways to define XOR is:

A ⊕ B = ~A . B + A . ~B

In other words, the result is true when one and only one of the variables is true. Great... couldn't be simpler.

When you consider operators like AND and OR, you find that you can break these operations into smaller pieces when considering more than two variables... like this:

A + B + C = ( A + B ) + C = A + ( B + C )

A . B . C = ( A . B ) . C = A . ( B . C )

All is fine and dandy.... but then let's say we find a case of exclusive or where all the variables (more than two) have to be considered at the same time.

Say you have three variables, and one and only one of them must be true for the result to be true. If we try the same logic that was applied to AND or OR, you would end up with this:

A ⊕ B ⊕ C = A ⊕ ( B ⊕ C )

If none of them is true, the result will be false. If only one of the variables is true, it will be true. If two of them are true, it will be false. But (and here's the problem) if all three are true, it will be true.... and according to the problem definition, that's wrong, because one and only one of them had to be true for the result to be true. In other words, all the variables have to be considered at the same time to say whether the result is true or not.

For this case I think there should be a different operator, called "MultiXOR" or something like that. The problem with this operator is that there's no easy way to break it into smaller pieces. Here's a shot using a recursive definition:

x1 ⊕m x2 ⊕m x3 ... ⊕m xn = x1 . ~( x2 + x3 + ... + xn ) + ~x1 . ( x2 ⊕m x3 ... ⊕m xn )

⊕m being the MultiXOR operator.
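Both the direct "exactly one is true" reading and the recursive definition above are easy to check against each other in code (a quick Python sketch of my own):

```python
from itertools import product

def multixor(values):
    """Direct definition: true iff exactly one value is true."""
    return sum(bool(v) for v in values) == 1

def multixor_rec(values):
    """The recursive definition: x1.~(x2+...+xn) + ~x1.(x2 ⊕m ... ⊕m xn)."""
    if len(values) == 1:
        return bool(values[0])
    head, rest = values[0], values[1:]
    return (bool(head) and not any(rest)) or (not head and multixor_rec(rest))

# the two definitions agree on every combination of three booleans
for combo in product([False, True], repeat=3):
    assert multixor(combo) == multixor_rec(combo)
```

The exhaustive check over all 2^3 combinations is a cheap way to convince yourself the recursive break-down really is equivalent to "one and only one".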

There you have it.

Sunday, May 9, 2010

GNU/Linux: Flexibility is the name of the game

Some things are meant to happen:

- We will all eventually die
- Windows gets filled with viruses over time (if we don't consider Windows a virus in itself, of course)
- Chávez would rather talk crap on TV than take care of running a real government that would make people happier... or, at least, safer (in the sense of personal safety).

You get the idea... those are facts of nature. But there are other things where the planets align in such a way that you are able to sense they will happen a couple of seconds before they do. I had one of those last Wednesday.

I was working on a project that required me to leave Bogotá for another city for a week... I was working at a client's installation and was completely isolated from the internet. I remember a thought came to my mind: I need to back up all the things I have for the project and send them to my coworkers. Ten seconds later, my laptop was headed for a crash against the floor. Oops! Not good.

The working day was already closing, so I had sent the computer into hibernation, but the fall happened before the computer shut down. I tried to restart it (Windows.... but the GNU/Linux part is coming, don't worry... read on). It would fail to start because of a bad checksum in one of its DLLs.

See, I'm forced to use Windows on my computer at work, but I do most of my work in a virtual machine that I cheerfully run on VirtualBox. The virtual machine is where I keep most of the information on the projects.

Thursday was my last day outside of Bogotá (I was traveling back on Friday), so I had to finish all the tasks I had been assigned in 24 hours... with my computer brain-dead, it didn't look like I was going to get congratulated by my boss at the end.

I always keep a LiveCD with me just in case things like this happen (on other people's computers, normally). I fire it up and the computer responds normally... so I'm able to work on it, even if I don't have all the information about the project. I check and see that the D partition (where the data is) is usable... at least I get to see many of the files I'm working with and can md5sum them.

I head back to the hotel because I was leaving already and try to connect to the wireless... but I just can't (it's Kubuntu Jaunty, not Lucid... hope it's better with Lucid now). I ask for permission to connect to the router through UTP, which I'm granted, so I head to the lobby and mails start going back and forth about the support questions I have regarding the project. One of my coworkers tells me through mail to back up the files... damn, why didn't I think of that? I had brought with me a pen drive that came with the hardware of the project, so I copy many of the files, but not all of them, as they are inside a VirtualBox virtual hard drive (switch configurations over time... thanks to version control, plus many other things). I stay there until about 3 or 4 AM and head to bed to try to get some sleep.

8 AM... Here's when the real hacking begins. I start wpa_supplicant by hand to try to see what's going on with the wireless connection. I see a message about non-WPA networks not being allowed to connect through wpa_supplicant. I think we can try something different in this case: /etc/network/interfaces (on Debian-based distros). I edit it to include the wireless network I'm trying to connect to and its key... something like:

iface eth1 inet dhcp
    wireless-essid this is the wireless name
    wireless-key this is the key
Save it and try to connect: sudo ifup eth1

I get connected in a couple of seconds... oh, well... :-S One less problem. Now the data of the project. I have the home of the virtual machine on partition D of the HD (as a matter of fact, I use the whole virtual HD as the home... there's no partition table... the joys of using GNU/Linux) so I need to be able to start VirtualBox to get the information of the project out of the virtual machine. I install openssh-server and VirtualBox from the repos. I try to start VirtualBox with the home virtual HD and the LiveCD I'm using... the boot process begins but VirtualBox dies because of memory issues (I have 1 GB of RAM and I'm running Kubuntu plus VirtualBox... I knew it wasn't going to hold water). I download DSL and try to run the virtual machine with just 64 MBs of RAM in "single" mode (dsl single on the boot menu)... and I'm up in a couple of minutes. I mount the home partition on /mnt inside DSL:

mount /dev/sda /mnt

See? I didn't use a device name that includes a partition number. That's because, as I said, I'm using the whole HD as a partition instead of using a real partition of the HD. I tar the project and, as you should remember, I had installed openssh-server in the LiveCD session, right? I sftp it out of the virtual machine into the LiveCD session... and now I'm able to forget about the HD because I won't need it anymore. I put the project on the pen drive, the LiveCD + pen drive becomes my new work environment, and I move on to do my final work day on site (I did have time to finish all my tasks thanks to GNU/Linux... as usual). I get the congratulations from my boss and go to have a good night's sleep, which I'm already missing.
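For the record, the copy-out step can be sketched roughly like this (the directory layout, the archive name and the LiveCD session's IP are all made up for illustration; sftp or scp both work):

```shell
# Stand-in for the virtual machine's mounted home partition
mkdir -p /tmp/demo/mnt/home/user/project
echo "switch config v1" > /tmp/demo/mnt/home/user/project/sw1.cfg

# Pack the project from the mounted home partition...
tar czf /tmp/demo/project.tar.gz -C /tmp/demo/mnt home/user/project

# ...and push it to the sshd running in the LiveCD session (hypothetical IP):
# scp /tmp/demo/project.tar.gz ubuntu@

# Sanity-check the archive before trusting it
tar tzf /tmp/demo/project.tar.gz
```

Nothing fancy, but when the hard disk under you is dying, boring and reliable is exactly what you want.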

Conclusion: Flexibility is the name of the game... that's why (among other things) I enjoy using GNU/Linux.

PS Those of you wondering how I was able to install software in the LiveCD session once I was on-site with no internet connection should take a look at this article.

sábado, 1 de mayo de 2010

Panama is on the road to MS hell

If South Korea is an example of what Panama is attempting to do by following the Microsoft guide on how a country can jump into technological advancement, then it sounds like it's going to be a bumpy road to get there.

Apparently Panama's President has signed a treaty with Microsoft to push technological development in the Central American country.

And what's the recipe?
- A computer for each and every one of the students of Panama (I wonder if they will get to use things that are not Windows and other Microsoft stuff)
- You will be able to do all things government related on-line (wanna bet that it will be IE-only?)
- Plus other things.

Feels so nice to see that governments still see Microsoft as the cornerstone of technology (or even worse, sell out).... what a shame!

Too depressing, really, to keep writing about.

sábado, 10 de abril de 2010

How does GPL licensing affect projects that don't involve linking/compiling?

This is a question that I've been trying to figure out for a while already.

I'm working on a PHP project that I will distribute (or whatever word you want to use) under the terms of the Affero GPLv3.

Now I included a piece of code (PHP Simple HTML DOM Parser) that's not under that license (an MIT kind of license, apparently) and I started wondering how the GPL code gets affected by the code I included... I mean, it's already settled matter that you don't become a GPL sinner (so to speak) just by choosing to use GPL code mixed with other (proprietary, for example) code... the sin happens at the moment of redistributing (or propagating or conveying or whatever it's called now) the binaries resulting from linking GPL code with code licensed/released under other terms.

But how is it handled when there's no linking? At least no linking to redistribute... think of projects made in PHP or Python, where no binaries are released.

jueves, 1 de abril de 2010

SCO sends patches to the linux kernel

In a move that will surely leave many in the FLOSS community shocked, SCO (the company that just three days ago was claiming that they, and not Novell, owned UNIX's copyright, and that the Linux community were a bunch of freeloaders), after getting their claims rejected in a court of law, has decided, in classical "if you can't beat them, join them" fashion, to start sending patches to the linux kernel.

Here's a sample taken from their first submission (from init/main.c):

< * Copyright (C) 1991, 1992 Linus Torvalds
> * Copyright (C) 2010 SCO
> *
> * This file is released by SCO under terms that forbid it to be part of any
> * project released under the GPL (as the GPL is famous for being unconstitutional)

Who would have believed it?

sábado, 27 de marzo de 2010

Networking is a little more than IPs and netmasks


Case one

Very recently I was asking this question (which is still open) at www.linuxquestions.org (the first place I go when I have a question regarding Linux or GNU, by the way) and took a brief look at the open questions on the networking forum, where I hit this beauty.

It's a guy who has set up DNAT on netfilter to forward packets that are sent to one host to another server that does the real work. Think of it as a proxy. In his example, he wanted to forward packets that arrive at his host on port 3306 to port 3197 on another host (let's use IP a.a.a.a). So, he set up a simple rule on (nat) PREROUTING:

$ iptables -t nat -A PREROUTING -p tcp --dport 3306 -j DNAT --to a.a.a.a:3197

What this rule does is tell the kernel to change the destination IP address of any packet that arrives at his host through any network interface to IP address a.a.a.a (reachable from his server, maybe not from the host that originated said packets) and the destination port to 3197 (the port where the real service is running on the real server). When the routing decision is made on those packets a while later, the destination IP address will be a.a.a.a and so the packets are sent to the real server. The source address/port of those packets remains the same (unless a little more natting is done, of course). Nice and dandy.

Then, when the packets arrive at server a.a.a.a on port 3197, the response will be sent to the originating host/port and the "networking cycle" is complete. A word of caution: this works if the packets that are sent back from the real server go through the same host that is doing the natting. If the real server sends the packets to the originating host through another host, the trick is broken, as packets arriving from a.a.a.a:3197 at the originating host don't match the IP:port it sent traffic to, so the connection is not established. This can be solved by SNATting this same traffic on the server that does the DNAT before the traffic is sent to the real server (making sure traffic will come back through it on the way back).
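As a sketch (addresses hypothetical: s.s.s.s being the natting box's own address facing the real server), that extra SNAT rule could look something like:

```shell
# Rewrite the source of the forwarded packets too, so the real server's
# replies are forced to come back through this box (requires root):
iptables -t nat -A POSTROUTING -p tcp -d a.a.a.a --dport 3197 -j SNAT --to-source s.s.s.s
```

The price you pay is that the real server no longer sees the clients' real source addresses, so its logs will show the natting box instead.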

He tests it and it's working. Traffic is reaching the real server and going back to clients.

He then tried to replicate that same behavior but using localhost instead. So he added a rule that looks very much the same on OUTPUT, like this:

$ iptables -t nat -A OUTPUT -p tcp --dport 3306 -o lo -j DNAT --to a.a.a.a:3197

It should make it, shouldn't it? Try telnetting to localhost port 3306 and nothing happens. No connection is established. It doesn't work. But why? Using a sniffer, it can be seen that when the -t nat OUTPUT rule is not set, traffic to localhost port 3306 moves through interface lo, nothing wrong with that, but when the rule is set up again, traffic gets lost. It doesn't go through lo or any other network interface... so the IP stack is discarding it. Weird. The counter for the -t nat OUTPUT rule is increasing, so it's doing its job as required... still, no traffic.

So what's going on? Let's think about what's happening to the traffic. When it reaches -t nat OUTPUT, these packets have source address, port whatever; destination address, port 3306. Then, after the rule is applied, the source address is still, port whatever, and the destination is host a.a.a.a, port 3197. As the packet was changed in nat, a second routing decision is made on it. As the destination host is a.a.a.a, traffic should be sent to the real server, but the IP source address is still If it were only a matter of the source address being wrong, an SNAT rule on POSTROUTING should take care of the problem (there was a MASQUERADE rule in place, so no problem there). The problem (which is a little buried in the networking stack of Linux) is that by the time the second routing decision is made, the source IP address,, is not usable for routing outside the host. Let me show you my routing table (x.x.x.x being my LAN address):

$ ip route show
x.x.x.0/24 dev eth0 proto kernel scope link src x.x.x.x metric 1 dev eth0 scope link metric 1000
default via x.x.x.1 dev eth0 proto static

Nothing about Why is that? It's because this is set up in another routing table (Linux supports multiple routing tables, in case you didn't know). You can see the routing tables available by taking a look at the file /etc/iproute2/rt_tables. I have default, main and local. Let's take a look at them:

$ ip route show table default
$ ip route show table main
x.x.x.0/24 dev eth0 proto kernel scope link src x.x.x.x metric 1 dev eth0 scope link metric 1000
default via x.x.x.1 dev eth0 proto static
$ ip route show table local
broadcast dev lo proto kernel scope link src
broadcast x.x.x.0 dev eth0 proto kernel scope link src x.x.x.x
local x.x.x.x dev eth0 proto kernel scope host src x.x.x.x
broadcast x.x.x.255 dev eth0 proto kernel scope link src x.x.x.x
broadcast dev lo proto kernel scope link src
local dev lo proto kernel scope host src
local dev lo proto kernel scope host src

And this is where things start to make sense. If you look carefully, all the routes with src address have a local scope, which means they are not to be used outside of the scope of the actual host. In our case the dest address is a.a.a.a and, with src address, it's impossible to route this traffic... so it gets dumped.

So it fails because we attempted it on address, but if you try to telnet to the IP address of your intranet interface instead, the test is successful (the traffic will still go through interface lo, the kernel can figure that out, and so the rule will still apply). The src address will be that same intranet address and the DNAT will change the dest address to a.a.a.a and the trick will work.

Hope you find this trick useful.

Case two

Think of a situation where you have two internet connections through two different ISPs. You get two ethernet cables from them, and they provide you with two static addresses/netmasks/default gateways/DNS, etc.

You connect each cable to a different box, set up networking and everything works like a charm.

Now, you want to get a little wacky and connect those two cables to a single switch (layer two) and connect those two boxes to the switch as well.

Connections should work fine, right? And they do... but then, what happens if you try to send traffic between those two boxes? Say, from box A you ping box B. In this case box A checks its routing table and realizes there's no network defined for such a host, so the traffic goes through its gateway. An ARP request goes out to get the MAC of its gateway, the gateway responds with its MAC address, packets go out with the MAC address of the gateway, src address A, dest address B, and the traffic heads to the internet through one ISP. Then the traffic comes in through the other ISP to box B, and box B gets it. It's going to respond to host A; there's no route for it, so it sends the response through its gateway, which goes through the same ISP that sent the request to host B, comes back through the first ISP to host A, and we see a reply on host A. Great.

But wasn't that too long a trip to reach a host that is two ethernet connections away from host A? There should be a way to make the trip shorter, right? And sure there is. You can set up routes to be reached through gateways (layer 3 routing) but also through devices (layer two routing). How does it work?

Let's add a layer two route for host B on host A:

ip route add b.b.b.b dev ethx

ethx being the interface we use to connect to the switch. And that's it.

Now what happens when host A tries to ping B? Now there's a route to reach B through interface ethx, so an ARP request for IP b.b.b.b is sent through said interface. Traffic is sent to the switch. The switch broadcasts this ARP request and it reaches B; B responds to the ARP request with its MAC address. A learns B's MAC address and sends traffic to it. The source IP address is A's, the dest address is B's, the dest MAC address is B's. B is able to see this traffic (it's got its MAC as destination) and sees A's ping request. Now, to respond to A, it checks its routing table. Remember you didn't change anything on B? Well, there's no route to A, so it has to go through its gateway. Traffic is sent to the gateway, the ISPs, and then it reaches A.

To get the trick working to avoid using ISPs at all, you have to do the same thing on B:

ip route add a.a.a.a dev ethx

ethx being the interface B uses to connect to the switch.
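By the way, once the routes are in place you can double-check which path the kernel will pick without sending a single packet, using ip route get (same hypothetical addresses and interface name as above):

```shell
# On host A (the "add" requires root):
ip route add b.b.b.b dev ethx   # layer-2 route towards B
ip route get b.b.b.b            # should now report "b.b.b.b dev ethx ..."
```

If ip route get still shows your default gateway, the ARP trick won't kick in and your pings will keep taking the scenic route through the ISPs.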

After not writing for so long, I had to have something interesting to write about, right?

Have fun!

domingo, 14 de febrero de 2010

Space is being consumed too fast? Find where!

You've found that some process (you don't know which) is eating up too much space too fast in one of your partitions and you don't know where? Just yesterday I found a simple way to figure it out.

Suppose you want to check the partition where you have /home (which is on a separate partition from /, right?), where space is being eaten up too fast.

Run this simple command:

find /home -mount -type f -exec ls -s {} ';' > list_of_files1.txt

After it finishes running, wait for a little while (30 seconds, a minute, your call) and run the same thing outputting to a different file:

find /home -mount -type f -exec ls -s {} ';' > list_of_files2.txt

To find out what's going on, run a simple diff (a tool present in, I guess, every GNU/Linux system... if not every Unix) between both files and you should be able to see what's going on.

diff list_of_files1.txt list_of_files2.txt
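A small variation I like (my own twist, not part of the original tip): filter the diff down to the entries that changed and sort them by size, so the fastest-growing files come out on top. Here's a self-contained rerun using a throwaway directory instead of /home:

```shell
# /tmp/growth stands in for the partition being watched
mkdir -p /tmp/growth && echo "small" > /tmp/growth/log

find /tmp/growth -mount -type f -exec ls -s {} ';' > list_of_files1.txt

# ...meanwhile something keeps eating space...
head -c 100000 /dev/zero >> /tmp/growth/log

find /tmp/growth -mount -type f -exec ls -s {} ';' > list_of_files2.txt

# Changed entries only (diff lines starting with ">"), biggest first
diff list_of_files1.txt list_of_files2.txt | grep '^>' | sort -k2 -rn
```

On a busy partition the plain diff can be noisy; sorting the changed entries by the size column gets you straight to the culprit.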

You're welcome!

jueves, 4 de febrero de 2010

ARM will fly without windows? Then bring it on!

I was reading yesterday this article, an interview with Warren East, one of the top guys at ARM.

He goes on about how ARM will succeed with or without Windows (not ME) supporting it once it starts being pumped into markets in the shape of a new architecture for netbooks.

I've been begging for these beauties to come out for months already. Last year I saw forecasts saying how they would start selling for roughly 200 US$ and come out in the 3rd quarter of 2009... then the 4th... we are already past the 1st month of the 1st quarter of 2010 and prototypes are all I've seen. I'm already fed up with it.

So, Warren, please... instead of forecasting doom for Windows if they don't support ARM (and I hope both things do happen), tell me when the machines will finally be out, who will put them out and at what prices. I've already had my fair share of predictions about ARM netbooks. I want to actually see them and buy them (just 5 of them for me... I want to see the faces of my brother and sister when they get theirs and find they're unable to run Windows on them).

Thanks in advance!

sábado, 30 de enero de 2010

There's nothing wrong with being thankful (or why I say GNU/Linux)

When I was a little kid (not that I'm too big, anyway... 5'10"/~150 lbs or 1.75 m/74 kg... whichever you get better) I was taught to be thankful for the things I get. There's nothing wrong with saying "thank you" when someone has fulfilled one's need/wish for something... even more so if the person who fulfilled it was not in any way forced to do it for us.

In the FLOSS community there's this old argument about whether we should call the OS that people usually call "linux" GNU/Linux or plain Linux.

The ones who defend "plain linux" say it's out of simplicity, it being more catchy, undoubtedly easier to pronounce than GNU (at least in English... in Spanish we convert GN into Ñ... or at least, I do), and a long etc.

But there are many sides to this story that, at least to me, don't add up.

For example, simplicity for newcomers: what about when you have a distro that doesn't have Linux inside of it? Say, Debian's Hurd, kFreeBSD or NetBSD ports? Those are distros, but Linux is not to be found inside because it (the kernel) has been replaced by another kernel. What are we gonna call them? Debian Non-Linux? Go figure how you will explain that to newcomers ("sure... it's Debian Linux... but it has no Linux... yet it is Linux". Priceless).

Some people have said that it's out of Stallman's big ego that he wants everybody to call it GNU. Well, I think Stallman hits the nail on the head (at least on the funny part) when he says "sure... and that's why I ask people to call it Stallmanix". So I think it's not out of ego... but maybe if he had named the OS Stallmanix in the first place, we wouldn't be having this argument, as it (too) is more catchy than GNU. :-)

Then we have the people who say that then we should call distros Kubuntu KDE/X/GNU/Linux, for example... but I think there's a line where we can say that KDE/X/GNU/Linux is just too much and GNU/Linux is OK: the minimum usable machine for a user would require GNU/Linux (I can work perfectly well on a GNU/Linux computer with bash and no KDE/X, so that makes the basic machine for me) while a machine with just linux (the kernel) would be pretty pointless, as it's lacking a system for me to interact with it (the shell of the OS, at least).

Also I don't like to call things something they are not. For example, I wouldn't say "I like driving my wife's V4 16-valve 1.4 engine to work" as just the engine doesn't make up the whole car (yeah right, like she actually allows me to drive her car :-)). Of course, you can hear people bragging about their 5.0s, but sure as hell these people actually want to talk about _the engine_, not the whole car.

Linux (the kernel) is quite a nice thing. I'm still overwhelmed by its capability to run on the tiniest machines and the biggest supercomputers as well. How it's capable of running on all these different architectures, how you can basically hack it whatever way you feel like to fit whatever need you have. To all of that (and more), I take my hat off and feel humble (and, trust me, that's not something I can say about many things or people :-)). But I think (that's me, personally, I'm not asking anyone to do something against their will) GNU deserves being named alongside Linux.

I'm thankful to all the people who have helped develop GNU (and Linux) into what it is, and so I call it GNU/Linux, just like, though I use Kubuntu, I proudly wear a Debian cap (because I know where Etcbuntu gets a lot of what makes it what it is)... and you will pry it off my cold dead hands.

And finally, to bring this chapter to an end from where it started: Thank you!

PS The cat in the picture is Tomás (after Tom from Tom & Jerry), my wife's pet.

viernes, 22 de enero de 2010

FF3.6 on ubuntu is not a reason why GNU/Linux is not ready for the mass-market


I have become quite a replicator lately, right?

Well, yet another article from an IT journalist/commentator I have to disagree with.

In the article the writer states that it's too difficult to get FF3.6 installed on Ubuntu and that that's reason enough to call GNU/Linux dead in its tracks on its way to the mass market. That sounds compelling at first sight... but

First: I bet the users of software that isn't quite up to date, the ones who make up those huge botnets, differ with the writer of the article. They all make up a part of the mass market as we know it, don't they? So it's OK to have outdated* software, isn't it? (I know, I know... it's not OK... but we are talking about the mass market here, so go with the flow!)

Second: Remember that the way software is installed/maintained in the GNU/Linux world is completely different from Windows'. In Windows, as the writer said, you grab the software from the internet (hopefully from a reliable location... but we know that's not always the case, is it?), click on it, maybe have to restart your computer... a couple of times (why the hell does installing Adobe Reader require you to reboot Windows? Is Adobe Reader the Windows equivalent of glibc or something?) and then you are finally done with the software. In GNU/Linux, at least in Ubuntu (and every other distro that prides itself on being such), you have to wait for the maintainers of Ubuntu to review the software to make it available. That's right... they do that job for you, the user. And it's not just Firefox that they maintain... they take care of thousands (literally) of pieces of software to make them fit together and not mess with each other when you install them on your beloved Ubuntu-powered box. And that not only sounds like a daunting task... it really is. And what would be the equivalent of that in the Windows world? It would be like waiting for Microsoft to review the software when it's made available by its developers (have you seen how long it takes Microsoft to work on their own bugs? How long would it take them if they had to review other people's software as well?) and make it available to you through the centralized software repository they provided Windows with, so that their beloved customers don't have to go leaping from site to site to grab the latest malware-infested piece of software... oh, but there's no such thing for Windows, is there? Such a shame, you know.

So, in other words, FF3.6 is not made available in the stable Ubuntu release because it's going to be major work to get it merged, but that doesn't mean there is no way to get it packaged so that our dear writer can use it. It didn't take me too long to find unstable/unsupported repos with FF3.6 (it's probably stable enough, I don't know for sure) for Ubuntu:


I'll personally wait for Ubuntu to make 3.6 available through their standard repositories... which I hope will happen for Jaunty... but maybe they won't and will make it available for Lucid only... we'll have to wait and see.

Just so that it's crystal clear: this article doesn't state that GNU/Linux is ready for the mass market. I'm just stating that the writer's difficulty installing FF3.6 on Ubuntu is not an excuse to dismiss GNU/Linux's readiness for the mass market. Also, I do think GNU/Linux is ready for the mass market, but that's quite another story.

That's it.

* FF 3.5 is not outdated, by the way. It will be maintained (at least security-wise) by the Mozilla Foundation for a while.

jueves, 21 de enero de 2010

How can people blame on GRUB if Windows doesn't like another bootloader?

I was reading this very interesting article by a guy who noticed that, when going from Vista SP1 to SP2, Windows would almost finish the process (taking quite a while, apparently) and then report an unknown error and roll back all the things it had done (wasting CPU and real time, by the way). After seeing the problem show up a couple of times, the person realized that GRUB was there in the MBR. He replaced the MBR with Windows', tried the update again and it was done. Great.

So... a very interesting read, I have to say. Then I hit the comments and what do we find? None other than people saying that it's GRUB's fault. Say what?

What's there for Windows to see that belongs to GRUB? Not much, really. The first-stage bootloader, located in the MBR; in other words, it's comprised within the first 512 bytes of the HD. The second-stage bootloader (which is called from the first-stage bootloader) is located somewhere within the GNU/Linux partitions set up on the box. So, the only thing from GRUB that Windows can actually see (unless Windows is capable of reading, out of the box, ext2, ext3, ext4 and the other gazillion FSs that we have available in GNU/Linux) is the first-stage bootloader.

In my opinion, it's something as simple as old Microsoft's motto in action: "It's the Microsoft way or the highway". The update process takes a look at the MBR and notices that it's not Windows' bootloader. "Who in their right mind would dare install something on the MBR that's not made by Microsoft?" I bet that's what they think there at Redmond. End of the game, let's stop the update process... _and_ (especially) not tell the user what's going on. It wouldn't be as insulting if at least they would suggest that the user replace the MBR using Microsoft's tools. You know, it can be put back with GRUB a couple of minutes after shutting Windows down once the upgrade process is done... but what do we expect from an OS that was made to resemble black magic, anyway?

As I have already said before:
Windows equals esotericism
GNU/Linux equals determinism

martes, 5 de enero de 2010

A little bug fix for ADOdb (php) for MySQL


I'm using ADOdb to connect to more than one MySQL DB at the same time, but then I noticed that queries were being run on the last DB connection that was established. After researching for a while, I ended up modifying the MySQL driver of ADOdb. It's a rather simple trick, so feel free to use it as needed:

On adodb-mysql.inc.php, go to the definition of _query($sql, $inputarr=false) and change it to:

@mysql_select_db($this->database, $this->_connectionID);
return mysql_query($sql,$this->_connectionID);

That should do it. By the way, I've only tested it connecting to a single MySQL server (with two different DBs, obviously) but I think it should work with multiple DB servers as well. Also, it works like a charm when doing ->Execute(), so it could be necessary to add that mysql_select_db in other places for it to work with other functions that deal with the DB, but Execute() is enough for what I need so...

I did this patch on flisys, and I'll upload it in the following days.