January 11, 2012
MIC Interpreter.
MIC-1 is a microarchitecture defined by Andrew S. Tanenbaum in his book Structured Computer Organization. Here at UNI we took a look at this architecture during some classes...
As a joke, I asked during class what we would get if we managed to write an emulator for it. Yes, an entire microarchitecture emulator. Luckily, it's extremely simple, so it isn't hard to write one; it does, however, require a bit more understanding of the microarchitecture itself than, for example, solving some simple tasks does. Anyhow, I got promised a 10 (or an A) for the practical part of this class.
Needless to say, I was more interested in the brainfood this task provided than in the actual grade. Still, the prize was nice, especially since this is a class where very few people get such good grades (usually between 0 and 3 per year).
Grabbed a beer, wrote an interpreter, and later tackled the bigger problem: writing a compiler for it. Since my compiler-writing skills are nonexistent at best, the code of the compiler pretty much reflects that. :-)
In any case, I'm putting this online... I don't really have a good reason, probably just because I can.
Click me gently, please!
Filled under: None
January 5, 2012
Searching Firefox browsing history like a boss, otherwise known as using SQL.
So, I needed a specific site I had found a while ago (parsing related). I could not, however, remember its URL, and the browsing history wasn't much help either, as I couldn't remember the site's name or the date when I last opened it.
All I remembered was that it was somebody's personal site and that the file I was looking for had a .c extension. And finding that site among ~82,000 entries in my browsing history is not exactly something I would be excited about.
Sooo, is there any better way to search Firefox history data than the user interface Firefox provides? Wooot, there is :)
Firefox stores most of its data in SQLite databases. In the case of history, that's pretty much this file: ~/.mozilla/firefox/blablablabla.default/places.sqlite
So, if we have the SQLite tools installed, this is all we have to do:
$ sqlite3 ~/.mozilla/firefox/blablabla.default/places.sqlite
And then we can execute almost any query we can imagine on that database, and fuck around with the history in any way we please. Like so:
n00b ~ >> sqlite3 .mozilla/firefox/x9fhowq6.default/places.sqlite
SQLite version 3.7.9 2011-11-01 00:52:41
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> SELECT COUNT(*) FROM moz_places;
82810
sqlite> SELECT url FROM moz_places WHERE url LIKE '%~%.c';
http://XXXXXX/~XXXXX/XXXXXXXXXXX.c
sqlite>
Ahh, finally found that piece of code.
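The same search can be scripted. What follows is only a sketch in Python: it builds a tiny in-memory stand-in for places.sqlite containing just the moz_places.url column used above, and the example URLs are made up. Against a real profile you would open the places.sqlite file instead (probably a copy of it, since a running Firefox may keep the database locked).

```python
import sqlite3


def find_urls(conn, pattern):
    """Return every URL in moz_places that matches an SQL LIKE pattern."""
    rows = conn.execute(
        "SELECT url FROM moz_places WHERE url LIKE ? ORDER BY url", (pattern,)
    )
    return [url for (url,) in rows]


# Stand-in for places.sqlite: only the one column this query needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT)")
conn.executemany(
    "INSERT INTO moz_places (url) VALUES (?)",
    [
        ("http://example.org/~alice/parser.c",),   # personal page, .c file
        ("http://example.org/index.html",),        # wrong extension
        ("http://example.net/docs/readme.c",),     # .c file, but no ~user part
    ],
)

# '%~%.c' = anything, a tilde, anything, then a literal ".c" at the very end.
matches = find_urls(conn, "%~%.c")
print(matches)
```

Only the first URL has both the ~user part and the .c ending, so it is the only match.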
Oh, PS: Happy 2012.
December 14, 2011
Google doesn't simply walk into Mordor!
This one made me truly laugh! Google devs, do you have too much time again?

For those who don't know what's so funny: http://knowyourmeme.com/memes/one-does-not-simply-walk-into-mordor
December 1, 2011
Streaming ALSA audio over the network.
So, the speaker output on my desktop computer is pretty much fried. Kaput. Dead. There is nothing wrong with the sound card itself; just the contacts on the output jack are fried.
And my desktop is the only computer that holds my vast amount of music, and I'm not too impressed with the idea of copying all that to the laptop or the mini (EEE PC). Not to mention the games. I want to play games as well.
So, what about transmitting audio data over the network? Bandwidth should not be a problem. Does Linux have the required infrastructure to do this EASILY? I don't want to install additional software, thank you. It turns out it does.
Many sound cards have a so-called loopback. This means that everything that is meant to go out to the speaker output also gets routed back into the recording input, as if it came from an external source. You just have to configure it properly.
On Linux this is a piece of cake. Just run the following commands; they set the Capture Source to the mixer (that is, they configure the loopback).
CONTROL="`amixer controls | grep 'Capture Source'`"
amixer cset "$CONTROL" "`amixer cget "$CONTROL" | grep "'Mix" | head -n 1 | sed -e 's/.*\#\([0-9]\).*/\1/'`"
Great, so the sound card is now configured. Now fire up a terminal on the destination computer (the computer you will stream sound to) and type in the following:
nc -nvvlup 1234 | aplay
We're using UDP as the transport protocol because it's more suitable than TCP for low latency (if we lose a few milliseconds of audio, or a few milliseconds of audio arrive out of order, I doubt you'll hear it).
On the source computer, punch this into a terminal, connect the speakers to the target computer, press enter and stand in awe of how easy it was.
arecord -t wav -f cd | nc -nvvu <target computer IP> 1234
This will transport audio over the network in CD quality. It requires around 170 KiB/s of bandwidth. On local networks this shouldn't be a problem. Also play with the arecord buffer options depending on what you want (nearly 0 ms latency or better quality).
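Where the "around 170 KiB/s" comes from: arecord's cd format is 16-bit samples, 44100 Hz, stereo. A quick back-of-the-envelope check in Python follows. The audio_ms helper and the 1764-byte payload are just illustrative assumptions, since how nc actually chops the stream into datagrams depends on its read size.

```python
SAMPLE_RATE_HZ = 44100   # "-f cd" sample rate
BYTES_PER_SAMPLE = 2     # 16-bit samples
CHANNELS = 2             # stereo

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
kib_per_second = bytes_per_second / 1024


def audio_ms(payload_bytes):
    """How many milliseconds of audio a payload of this size carries."""
    return payload_bytes / bytes_per_second * 1000


print(bytes_per_second, "bytes/s =", round(kib_per_second, 1), "KiB/s")
print("a lost 1764-byte datagram costs about", round(audio_ms(1764), 1), "ms of sound")
```

So one dropped datagram costs milliseconds of sound at most, which is why UDP is a tolerable choice here.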
It's probably not the cleanest solution, but it's easy to set up and it works.
Oh, I guess I don't have to mention this (I will anyway): do not feed the output back into itself, as you can burn your speakers.
November 24, 2011
Cheating on trackers?
This post is not meant to describe the workings of trackers in technical detail, nor how to actually cheat on them. There is enough material on that on Google. It's just a post to explain in general terms how trackers work and how they relate to the client. This knowledge is the key (besides imagination) to creating quite awesome and bullet-proof cheating systems. :-)
It's not like there are no local BitTorrent trackers in Slovenia. All of them are semi-closed and some of them actually monitor your ratio (amount uploaded vs. amount downloaded). As with many sites in this country, the security of some is a real fail, but still, breaking into their servers and modifying the database is not exactly stealthy, is it? Not to mention it's a call for bigger trouble than just a banned account. :-)
Some may argue: "Well, there are countless programs for creating a fake ratio on BitTorrent trackers, ratiomaster being one example". That is no fun. Besides, you actually have to keep them running to create fake results. Lame! And it doesn't even solve the hit'n'run problem[1].
What is the purpose of a BitTorrent tracker anyway?
In a modern sense - none. With DHT, PEX and similar technologies integrated into BitTorrent clients, they're pretty much a relic of history. But let's forget all that...
So, you download a torrent and open it in a BitTorrent client. What happens next? You know what P2P means, right (tip: if not, stop reading, this is not for you)? So where does the client get the list of peers sharing the same file from? From the tracker, of course.
Each torrent has some form of ID attached. It's a hash of the torrent's info section, which describes the content (called the infohash), and it differentiates the torrent from other torrents. So, to bootstrap, the BT client says to the tracker: "Hey, I would like to download the torrent with infohash <something>. Do you have a list of peers that are also sharing this torrent?", and the BT tracker says back: "Sure, here you have it. Have fun". From the tracker's side, its job is done. It supplied you with a list of clients that also share this infohash. It's up to the client to connect to the swarm and get the file from the others.
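For the curious: in the standard BitTorrent protocol the infohash is the SHA-1 of the bencoded "info" dictionary from the .torrent file. Here is a minimal sketch in Python, with a toy bencoder and a made-up info dictionary. A real client would hash the exact bytes found in the file rather than re-encode them.

```python
import hashlib


def bencode(value):
    """Encode ints, byte/text strings, lists and dicts in bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


# Made-up single-file torrent: one 5-byte file that fits into a single piece.
info = {
    "name": "ABC.txt",
    "length": 5,
    "piece length": 16384,
    "pieces": hashlib.sha1(b"hello").digest(),
}

info_hash = hashlib.sha1(bencode(info)).digest()
print(info_hash.hex())
```

Since the hash covers only the info dictionary, anything outside it (the tracker URL, for example) can differ between two copies of a torrent without changing the infohash.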
But where does the tracker get this list from?
From other clients who asked for the same list, of course. So the list it hands to the next client who asks will also include your IP. Just because you asked too.
It is not connected to the BitTorrent network in any way. It doesn't have to be.
But how does a closed tracker then know who downloaded/uploaded how much, and how does it do per-user accounting?
It doesn't. Well, it does know, but it does not know whether that information is true or not.
Wait, what? You're saying they're guessing?
No. See, let's say that a semi-closed tracker has a torrent named ABC.torrent. User 1 and user 2 both download this torrent. Do the checksums match when comparing the torrent file that user 1 downloaded with that of user 2?
No. Why?
For per-user accounting. The tracker modifies each torrent, adding some form of obfuscated user ID to the end of the tracker URL. This enables the tracker to track each user's requests for the peer list by looking at this user ID and connecting it with the real username in the database.
So, for instance, if user 2 sends you a torrent that he downloaded from a semi-closed tracker, you can then upload/download torrents from that tracker under his identity. You just have to extract his user-identification hash (tip: if he's not your friend, you can get him banned that way. Or just leech at someone else's expense. You just have to get one torrent he downloaded from that site. His torrent, that is).
You still didn't answer. How does it do per-user accounting?
By looking at what the client told it. See, the client doesn't just say: "Hey, I'm downloading a torrent with infohash blabla, give me the peer list". It says this: "Hey, I'm downloading a torrent with infohash blabla, I have so far downloaded 1234 bytes and uploaded 213 bytes. Give me the peer list, please?". And the client doesn't do this once. It does this every once in a while (usually every 30 minutes).
This is to enable public torrent trackers to give out a more efficient peer list to new clients and to clients that have already completed (for instance: a new peer gets a list of seeds, while seeds get a list of peers).
Of course, on semi-closed trackers the client also provides its identity, so the tracker can identify the user who made the request.
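In HTTP terms, that "Hey, I'm downloading..." message is a plain GET request with a handful of query parameters. A rough sketch of how such a URL is put together follows, in Python. The parameter names (info_hash, peer_id, port, uploaded, downloaded, left, event) are the ones from the standard tracker protocol; the tracker host, the passkey parameter and all the values are made up for illustration.

```python
from urllib.parse import urlencode, urlsplit


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Build the GET URL a client would request from the tracker."""
    params = {
        "info_hash": info_hash,    # 20 raw bytes, percent-encoded by urlencode
        "peer_id": peer_id,        # 20 bytes identifying this client instance
        "port": port,
        "uploaded": uploaded,      # self-reported, the tracker just believes it
        "downloaded": downloaded,  # same here
        "left": left,
    }
    if event:
        params["event"] = event    # "started", "completed" or "stopped"
    # A per-user tracker URL may already carry a query string (the user ID).
    separator = "&" if urlsplit(announce).query else "?"
    return announce + separator + urlencode(params)


url = build_announce_url(
    "http://tracker.example/announce?passkey=0123abcd",  # hypothetical per-user URL
    info_hash=bytes(range(20)),                          # placeholder infohash
    peer_id=b"-PY0001-000000000000",
    port=6881,
    uploaded=213,
    downloaded=1234,
    left=5000,
)
print(url)
```

Note that nothing in the request proves the uploaded/downloaded numbers; they are whatever the client chooses to send.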
So... that's how ratiomaster works?
Yes. You feed it the torrent, it gets the tracker URL and infohash and consistently lies to the tracker about how much it downloaded / uploaded. And the tracker has no good way of knowing whether it is lying or not.
Ratiomaster is not a state-of-the-art program. With some knowledge of the tracker protocol, it can be scratched together in 30 minutes or less while being horribly drunk (or high, depending on your preference).
No good way? So you're saying there is a way for the tracker to weed out liars?
Yes. There is. But it's more of a heuristic. Sometimes it misses, sometimes there is a false alarm.
For instance, from the timing of requests and the difference between the uploaded/downloaded values, you can calculate the average speed at which the user is downloading/uploading the torrent. If that speed changes drastically, it can be a sign of ratiomaster.
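A sketch of what such a speed check could look like on the tracker side, in Python. Everything here is an assumption made for illustration: the announce log format, the 10x threshold, and the choice to flag only sudden increases. Real trackers, if they do this at all, will have their own rules.

```python
def average_speed(prev, curr):
    """Bytes per second between two announces, each (timestamp_s, uploaded_bytes)."""
    elapsed = curr[0] - prev[0]
    if elapsed <= 0:
        raise ValueError("announces must be in increasing time order")
    return (curr[1] - prev[1]) / elapsed


def suspicious_jumps(announces, factor=10.0):
    """Return (speeds, indexes of speeds more than `factor` times the previous one)."""
    speeds = [average_speed(a, b) for a, b in zip(announces, announces[1:])]
    flagged = [
        i for i in range(1, len(speeds))
        if speeds[i - 1] > 0 and speeds[i] > factor * speeds[i - 1]
    ]
    return speeds, flagged


# One announce every 30 minutes; the last one claims a sudden 900 MB of upload.
announces = [(0, 0), (1800, 1_800_000), (3600, 3_600_000), (5400, 903_600_000)]
speeds, flagged = suspicious_jumps(announces)
print(speeds, flagged)
```

A liar who ramps the numbers up slowly sails right past a check like this, which is why it is only a heuristic.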
They can also compare the values reported by all other users to find out that someone is lying. The problem is, they can't figure out WHO is lying (since they don't know whom to trust; the only thing known is that the numbers mismatch). Of course this is not a problem if the torrent is not very active (i.e. almost 0 transfers happening); then it's easy to find who lies. ;-)
The only sure way of finding the liars would be to know what is actually happening inside the swarm; in other words, to participate in it. But this would eat up the tracker's resources and bandwidth, make the code 1000x more complex[2] and put the tracker in a legally questionable position.
None of these solutions are, of course, effective against a well-motivated / bored programmer.
Just ... How does the tracker detect hit'n'run?
With imagination: it can't.
By standard: Easy.
You see, when you shut down your client, it makes one final request to the tracker (even if it made one a second ago). It essentially states: "Hey, I have been sharing the torrent with infohash 1234 and downloaded X and uploaded Y bytes. But I'm stopping now, thank you".
This is to notify the tracker that the client's address is no longer valid and should not be handed out anymore.
Private trackers can compare the downloaded value and the uploaded value in the quit message, and if they're outside the specified limits, it can be registered as a hit'n'run.
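On the tracker side that check can be tiny. Here is a sketch in Python, assuming the quit message arrives as an announce with event=stopped (that is the value the standard protocol uses). The 0.5 minimum ratio is a made-up limit; every tracker picks its own.

```python
def is_hit_and_run(event, downloaded, uploaded, min_ratio=0.5):
    """Judge a final announce: did the user leave without giving enough back?"""
    if event != "stopped":
        return False           # only the quit message is judged
    if downloaded == 0:
        return False           # nothing taken, nothing owed
    return uploaded / downloaded < min_ratio


print(is_hit_and_run("stopped", downloaded=1000, uploaded=100))   # left after leeching
print(is_hit_and_run("stopped", downloaded=1000, uploaded=800))   # gave enough back
print(is_hit_and_run("started", downloaded=1000, uploaded=0))     # not a quit message
```

Like everything else in the announce, the numbers being compared are self-reported by the client.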
So... What can I do with this knowledge?
Create no-torrent-does-affect-my-ratio-ever-and-never-detect-hit-n-run-while-actually-downloading software. Out-of-the-box thinking and searching for the tracker protocol specification are left as an exercise for the reader. :-)
You mean torrent client?
Oh god, no, that would be horrible waste of time.
What then?
It involves two words: Proxy & caching. And rewriting requests. That's four words!
Two sounded more dramatic, didn't it? :-)
Does it violate standards?
Obviously?
Share?
As I said before, figuring out the solution is left as an exercise for the reader. Use some imagination, for Christ's sake. ;-)
1: Hit'n'run is when a user closes their client immediately after the download is finished.
2: A primitive tracker can be coded in less than 50 lines of PHP. Been there, done that. Writing a tracker that participates in the swarm is another story.
November 22, 2011
I can't help myself.
I should stop screwing around and actually do something else.
It's a program we had to write at UNI. It's something like FTP, except that the protocol (we had to write the server too) is extremely ugly. Ugly. No, seriously, ugly.
But I couldn't help myself and had to include at least one stupid... well... additional feature to annoy the unsuspecting user (the user being a professor during my defence of this assignment)...

Fortunately it didn't affect my grade. It was still 100%. Even though half of the program was written drunk and the other half with a huge hangover.
Oh, btw. Mono rocks. I don't have to run my VM to do assignments :-)
November 10, 2011
Just another reason why I love Linux.
Apart from the ability to fuck with it in any way I can imagine, it's the amazing sense of humor its developers retain even in situations where you should normally panic.
I actually have two installations. Both of them are Arch Linux. One is for regular use, the other is for fucking around and for checking whether GNOME 3 has become more useful since its first release.
So, I was updating the real installation from my testing setup (I'm writing this from GNOME 3). No problem, just mount the required filesystems and do a chroot. But I forgot to mount /boot.
This is what I got:
(25/38) upgrading linux* [######] 100%
WARNING: /boot appears to be a seperate partition but is not mounted.
You probably just broke your system. Congratulations.
I still can't stop laughing. Brilliant.
Oh, and I'm slowly starting to like GNOME 3. It still has some work to do since 3.0, but the progress in 3.2 is evident.
* linux = kernel.
October 30, 2011
Setting up MCP with Minecraft Forge on Linux.
It's more of a note to myself, but here it goes:
In case you fail to successfully install MinecraftForge onto MCP (Minecraft Coder Pack) under Linux...
1.) Install MCP, then cd into extracted folder.
2.) Get a Minecraft.jar with ModLoader and ModLoaderMP installed and place it in jars/
3.) If your default version of python is not 2.7, do this: sed -i 's/python/python2.7/g' *.sh
4.) Run decompile.sh
5.) unzip ../minecraftforge.zip
6.) cp -a forge/src/* src/
7.) cd src;
8.) find -name '*.java' -exec dos2unix \{} \;
9.) patch -p2 <../forge/minecraft.patch
You're done. Step 8 is important, otherwise patch may fail horribly. And so may the entire recompilation.
Oh, forgot to note: I'm staying on 1.7.3. Maybe I'll upgrade to 1.8.1 someday, not because of the features but because of the new lighting engine. But I am not going to play 1.9. At least not if the list of upcoming features doesn't change... or if they don't promise higher block IDs. Well, screw it, I'd rather backport the interesting mods.
September 5, 2011
When the cat is away...
... the mice shall play...
It's quite funny how this applies to Minecraft SMP (Survival MultiPlayer).
Well, griefing is quite a problem on Minecraft servers. For those unfamiliar: since Minecraft is a sandbox game, anyone can do anything. Including to other people's creations.
It doesn't have to be destruction. It can simply be entering their house without permission.
Of course, there are countless plugins for Bukkit to combat this, but I found them quite CPU/memory intensive. So I'm running a logging tool, HawkEye, which logs everything users do into a database. Everything. Its inspiration was another Bukkit plugin called BigBrother[1]. So now you get an idea of how closely players can be monitored.
Of course I had warned the players that they are being monitored. Not with big, red, screaming letters, but if they have read the rules, they are informed.
But it logs a SHITLOAD of stuff. And shoveling through it is impractical. So I took a five-minute break from my studies and scratched together a simple script that takes as input the center of the area where players are mostly located and then finds all anomalies in their movement.
I'm ashamed to show the code in this state[2][3]; however, the results are quite interesting >:-)
1: The one from 1984 probably.
2: It crashed the server once by pushing MySQL and itself too far.
3: unset(), free() and similar are your good friends!
August 17, 2011
Did rm -rf / and we're still working!
Let me just start with one quote:
Multitasking: The ability to screw up several things at once.
And since Linux is a multitasking operating system powering everything from laptops to supercomputers down to mobile phones, it also has a secret (pssh, I didn't tell you that!) ability to fuck up several things at the same time! It happened because of my input, but it still crashed pretty hard.
So, I have a nightly backup cycle with rdiff-backup. It creates incremental backups to save disk space. Every month, a new set of backups is created from scratch (the old ones are just compressed into a .squash filesystem and left lying on disk until I decide their fate[1]), just in case something somewhere fails.
So I was merging some old backups, and I had them lying on another machine simply because the operation demanded more space than I had available on the server. Some NFS + /mnt magic, no problem. The job is done, the files are copied back, now it's time to wipe the NFS mountpoint.
Since I was already doing several things at once, I cd'ed to /mnt (well, I thought I did) and typed:
nohup rm -rf * &
So, rm is remove, r is for recursive (enter subdirectories) and f is for force (don't ask, just wipe anything in your way). Since I managed to mistype cd /mnt, I was still stuck in the original directory. And that was root ( / ).
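The root of the problem is that the wipe depended on the current directory, and the cd had silently failed. In the shell, chaining the two (cd /mnt && rm -rf *) would have stopped right there. The same idea as a small sketch in Python: the directory to wipe is always passed explicitly, and the filesystem root or a mistyped path is refused. The function name and the demo paths are made up for illustration.

```python
import os
import shutil
import tempfile


def wipe_contents(target_dir):
    """Delete everything inside target_dir. The path is explicit, never the cwd."""
    target = os.path.realpath(target_dir)
    if target == os.path.realpath(os.sep):
        raise ValueError("refusing to wipe the filesystem root")
    if not os.path.isdir(target):
        raise FileNotFoundError(target)   # a mistyped path ends up here
    for name in os.listdir(target):
        path = os.path.join(target, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


# Demo inside a throwaway directory, so nothing real gets touched.
with tempfile.TemporaryDirectory() as scratch:
    mnt = os.path.join(scratch, "mnt")
    os.makedirs(os.path.join(mnt, "old-backups"))
    with open(os.path.join(mnt, "notes.txt"), "w") as fh:
        fh.write("stale")

    wipe_contents(mnt)
    leftover = os.listdir(mnt)

    try:
        wipe_contents(os.path.join(scratch, "mmt"))   # typo: no such directory
        typo_caught = False
    except FileNotFoundError:
        typo_caught = True

print(leftover, typo_caught)
```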
In other words, rm started to wipe EVERYTHING on the server itself. So, I had a few chats open, wrote responses to them and went back to run htop ... Wait, wut?
:: /# htop
htop: error while loading shared libraries: libm.so.6: cannot open shared object file: No such file or directory
Wait, wut? Well, I think I did some screwing around with this package (I'm trying to run a Minecraft server and this was one of the dependencies), well, let's reinstall it...
:: /# pacman -S libm
-bash: /usr/bin/pacman: No such file or directory
Wait, wut? What is this? Check $PATH. It's OK... Is it there? Wait, what?
:: /# cd /usr/bin/
:: /usr/bin# ls -l pacman
-bash: /bin/ls: No such file or directory
And then all of a sudden:
:: /usr/bin#
Connection to teh.zupa.cow.sez.mooo.com closed.
n00b ~ >> ssh -l root jbox
Connection closed by 93.103.205.91
At that point everything became clear. The system was being wiped. I had to think quick, so I jumped off my chair, threw myself across the room, hit the power switch on the server's PSU and turned it off. Got some cuts from the landing in the process :-)
OK, so we stopped rm. Now what?
Well, let's dismount the server, take the disk out (although I'm using a custom kernel, it's a generic build, so it will work on any x86 machine) and put it into my PC to see the extent of the damage[2]. It gets hooked up, booted, and I get greeted with a nice hang and only GRUB in the top left corner[3]. Great, so at least 512 bytes of the entire system are still intact.
I popped in an Ubuntu flash drive and booted. Now what? Well, since the root and boot partitions were probably wiped, there is no harm in running fsck on them. Just to be on the safe side.
How about the data partition, where the backups are held? Should we run fsck? Common sense tells us: yes. But what if the backup procedure was running on it at the same time the wipe happened? This means that fsck could also wipe our (probably) damaged local backup. I also have offsite backups, but they're for emergencies and getting to them is not an easy task[4]. Mounting a damaged filesystem could actually give us good info on what is damaged and what is not.
So I mounted it, despite mount.ext2 protesting that I really should run fsck first. Checked the root partition as well. Half wiped. /usr almost completely gone, no sign of /lib or /bin or even /sbin. /boot was empty; that would explain GRUB hanging.
So, how about the backups? They look intact. At a glance. They were damaged, but not beyond recovery. Here's the stuff that got screwed up in the backup and was wiped by rm. Of course I couldn't have known about these missing files if I had run fsck on this partition before mounting.
- Kernel modules: half of them missing, with the ext2 driver complaining when I tried to access them. Not a big thing; I haven't updated the kernel for a while, so I can get them from old backups.
- Locales for French, Romanian, Hungarian and some other languages. Won't be missing them.
- A few man pages. Not the end of the world, kind of annoying, will be fixed at the next update of the affected packages.
- Munin! Oh noes, my yearly server graphs are gone! At least that's what I thought. rm didn't have time to reach them, so a restore from backups wasn't even necessary.
- The pacman (package manager) database got a bit screwed up. Lucky me, I'm running Arch as the OS on the server. Pacman uses a very simple filesystem layout for its package database, so I just copied the corrupted files over from my laptop.
- A few unimportant logs I never bother to read.
- Rdiff-backup is missing files, and some diffs got wiped. This means that the entire backup for this month is in an inconsistent state. Better to start from scratch.
So I just copied all this stuff from the backup, wiped this month's worth of backups (it's inconsistent anyway), rebooted and... Well, my PC booted up the same way as my webserver.
Heard the BIOS beeps. OK... Heard the GRUB beeps; a few seconds later the hard-drive light started blinking. So far so good... And finally the long 5-second beep indicating that the server is booted up and running. Did some screwing around, works fine. So I just popped the disk back into the server, hooked the server back up to power and we're back in business.
Lessons learned:
- BACKUPS can save your ass! DO THEM.
- Offsite backups are even better. DO THEM BOTH, CLOUD FTW[5].
- When SSH starts to fail you, power down the server. SSH is (in my experience) one of the last things that will fail on a collapsing server. The only reason I didn't lose the backup partition was because I reacted quickly enough when SSH stopped working.
- And most important: Linux is good at multitasking. You are not. ;-)
And nothing of value was lost. Except for my uptime record. I still feel sorry for it; now I'm gonna have to wait another half a year...
Downtime: 60 minutes... Not bad for doing some other stuff while restoring backups, eh? ;-)
1: I haven't deleted any of them, just merged them to save more space. Yes, this means I still have entire system backups from, let's say, 23 April 2010.
2: The server is completely headless (yes, without any graphics card), so I have no way of knowing what's happening to it until we get past GRUB. I have a few spare cards lying around, but I was in no mood for digging through my drawers.
3: For any troubleshooters: if you get ONLY that, it means that GRUB was unable to load stage 1.5.
4: Encryption and stuffz. The last thing I need is a quest for private keys. I do have them, but they're not stored in a place that's easy to get to ;-)
5: Just don't forget to encrypt your backups before uploading them. No, ROT13 doesn't count as encryption. And don't store your private key in your Gmail.
July 17, 2011
IPTables MAC filtering and Rickrolling.
So, some posts back I described how to set up an unprotected ad-hoc Wi-Fi network for the purpose of simple connection sharing. I used the same setup at UNI, until someone started to use my Wi-Fi for downloading porn[1].
So, what should I do to stop him? Lock the Wi-Fi? But that's no fun! Run a sniffer and send him a screenshot of his Facebook chat history and profile settings? Already done; it's far more fun to watch when done in person. What else? Rickrolling! Yes! A bit chewed, abused and old, but it's the first thing that crossed my mind.[2]
Yes, let's do it! Since we've already set up our computer as the gateway to the internet, we can pretty much do anything with the traffic that passes through us. Drop, edit, etc... The same effect can be achieved even if we're not actually the gateway to the internets, via ... ARP spoofing, for example.
So we download a webserver (we're going to host a copy of the video ourselves, since we're going to block access to the outside network), let's say Apache.
OK, we download the video[3] from YouTube in FLV format, then we grab a copy of FlowPlayer or any other Flash-based player. If we're big fans of HTML5 we can just use the video HTML tag, but at the price of less support from browsers.
So the plan is the following:
If a computer with an allowed MAC address (my netbook) wants to access the internets
    its traffic is passed through
else
    if it's TCP and port 80 (HTTP)
        rewrite the destination address to our IP
    else if it's DNS traffic
        pass it through
    else
        drop it
So a request for http://www.facebook.com/troll/spam/eggs/ will be transparently routed to http://127.0.0.1/troll/spam/eggs/. A simple .htaccess file will help us fix problems with any 404s we might hit along the way.
ErrorDocument 404 /index.html
index.html is of course our mashup of Flash, HTML, FLV video and Rick Astley. It looks like this (the slashes before file paths are important: not flow... but /flow...):
<html><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<script type="text/javascript" src="/flowplayer-3.2.6.min.js"></script>
<title></title>
</head><body style="background: #adb2d8">
<center style="font-family: Verdana">
<br />
<a href="/rickroll.flv" style="display:block;width:648px;height:430px;border:1px solid #00f" id="player"></a>
<script>
flowplayer("player", "/flowplayer-3.2.7.swf",{
plugins: { controls: null },
play: true
});
</script>
<div style="color: #005; padding-top: 20px; width: 648px;">
<b>The internets:</b><br />
<span style="font-size: 1.3em; font-weight: bold;">Serious business.</span>
</div>
</center>
</body></html>
Then comes the IPTables part. First, we're going to let known "nice" MAC addresses through. Then, we're going to redirect TCP:80 (HTTP) to our router itself. It can also be any other site on the internets, but it must respond well to a different Host: than expected. This means that, no, shared hosting is not an option.
We're going to block any access to the outside network. Anything that isn't TCP:80 or UDP:53 (DNS) ain't coming through. DNS must still be kept functional, otherwise no client machine will be able to set up any connection anywhere. We can also set up a local DNS server and drop DNS altogether... But how do we drop IPv4 traffic in the PREROUTING chain of the nat table? -j DROP doesn't work. A redirect to a nonexistent IP address does, and it has the same effect.
It looks like this (replace wlan0 with the interface of the internal network, and the MAC address with the one you want to allow):
# Allow traffic from whitelisted MACs to pass through.
# One of these per allowed MAC.
iptables -t nat -A PREROUTING -i wlan0 -m mac --mac-source 00:25:A3:78:BC:3D -j ACCEPT
# Redirect TCP:80 to the router itself (REDIRECT rewrites the destination
# to the address of the incoming interface).
iptables -t nat -A PREROUTING -i wlan0 -p tcp --dport 80 -j REDIRECT
# Allowing DNS traffic to google public DNS ONLY.
iptables -t nat -A PREROUTING -i wlan0 -d 8.8.4.4 -p udp --dport 53 -j ACCEPT
# Redirect any other traffic to an invalid address, effectively dropping it.
# The only way an attacker could get unobstructed access to the internet would be
# to use IP over DNS (NSTX), or to take the MAC of a computer on the whitelist.
# Both choices bring their own set of new problems.
iptables -t nat -A PREROUTING -i wlan0 -j DNAT --to-dest 0.0.0.0
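The order of these rules matters: a packet is handled by the first rule that matches it. Here is a toy model of that decision in Python, purely to make the flow easier to follow. The classify function and the verdict names are made up; only the MAC address, the port numbers and the DNS server address come from the rules above.

```python
ALLOWED_MACS = {"00:25:A3:78:BC:3D"}   # whitelist, one entry per "nice" machine
DNS_SERVER = "8.8.4.4"                 # the only resolver clients may talk to


def classify(src_mac, proto, dst_ip, dst_port):
    """Return what happens to a packet; the first matching rule wins."""
    if src_mac.upper() in ALLOWED_MACS:
        return "accept"                # rule 1: whitelisted MAC passes untouched
    if proto == "tcp" and dst_port == 80:
        return "redirect-to-router"    # rule 2: HTTP ends up at our own web server
    if proto == "udp" and dst_port == 53 and dst_ip == DNS_SERVER:
        return "accept"                # rule 3: DNS to the one allowed resolver
    return "blackhole"                 # rule 4: everything else goes nowhere


print(classify("00:25:a3:78:bc:3d", "tcp", "203.0.113.7", 443))   # my netbook
print(classify("DE:AD:BE:EF:00:01", "tcp", "203.0.113.7", 80))    # freeloader, HTTP
print(classify("DE:AD:BE:EF:00:01", "udp", "8.8.4.4", 53))        # freeloader, DNS
print(classify("DE:AD:BE:EF:00:01", "tcp", "203.0.113.7", 443))   # freeloader, HTTPS
```

Swapping the first two rules would rickroll the whitelisted machines too, which may explain the disclaimer at the end of this post.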
And we're done. Just cat /var/log/httpd/access.log every once in a while and laugh at people's confusion. The end result looks something like this. The WWW pack can be downloaded here.
Just to get one thing straight. Throughout this post I used the word "router" a lot. Router here means any device that passes packets between networks. It's probably your laptop, if your laptop provides access to the internet.
Disclaimer: You may rickroll yourself during setup. The author did this to himself 2 times while writing this post.
1: Among other things.
2: And not yet executed on a grand scale. At least by me.
3: I seriously hope you didn't click on that link.
July 15, 2011
CleverBot vs CleverBot showdown.
So, for all of you who haven't yet heard of CleverBot, it is an A.I. chat bot. It claims to be pretty clever (as it's learning from real people), so I was wondering how it would end up if two CleverBots were talking to each other.
However, the site does everything it can to make automatic interaction with the bot as much of a pain in the ass as possible. But hey, I'm a geek, why should this stop me? So I fired up Firefox, Firebug and Kate and started to reverse-engineer their JavaScript-powered front-end.
The problem is that you can't just emulate a browser at the HTTP level and send everything as a POST, as you actually have to run some JavaScript that does some arithmetic magic and send the result back to the server. Nice.
So after trying to figure out what the fuck they expect from me, I gave up and started to fuzz the shit out of that code. I learned that I had wasted 2 hours trying to figure out what it does, since ... Well, its output was almost always the same. Hell, it could be emulated with 1 line of PHP code. Neato.
So, I got 50 lines of code for the communication class and... Well, about 10 lines for the part that actually makes two bots talk to each other. Great, now what? People have tried to troll CleverBot for years, pumping so much bullshit into it, so ... It would be awesome if CleverBot actually managed to troll itself.
So I created a thread[1] on /g/, published the code and almost fell off the chair as I saw how CleverBot can actually sometimes produce meaningful results, sometimes simply hilarious ones, or even naturally come to "singing" parts of a song when talking to itself. Oh, and it proved once and for all that a computer can have cybersex[2] with another computer (or itself). Note that this was exclusively bot-to-bot communication without any human interaction.
So, for all of you who actually missed the fun, I've set up a site where two CleverBots talk to each other. It's over here. As someone dug it out and posted it on 4chan, it got almost completely flooded, so it might run a bit slow. Nevertheless, if you ever wondered how it would look if an AI talked to itself, here is your chance. Or you can just laugh and browse/search the logs.
http://n00bz.pwnz.org/cbot2cbot/index.php
Well, if you want to, you can also download the source class... It's no big deal, but it'll save you some hours of reversing CleverBot's JavaScript. Click me softly.
But for the end, here is a quote from Wikipedia:
Cleverbot differs from traditional chatterbots in that the user is not holding a conversation with a bot that directly responds to entered text. Instead, when the user enters text, the algorithm selects previously entered phrases from its database of prior conversations. It has been claimed that "talking to Cleverbot is a little like talking with the collective community of the internet".
http://en.wikipedia.org/wiki/Cleverbot
So based on the results it was returning on /g/ and in the logs of my site ... Well, I guess that pretty much sums the internet up. For example this.[2][3]
EDIT: Wow, 500 downloads in two days?
EDIT2: Search over the logs online.
EDIT3: It took 48 hours for some mods at netgexupdate to rip it off. Oh, internet. I'm neither surprised nor mad, I just find it funny.
1: Someone actually thought it was funny to archive it.
2: Seriously, what sick fuck taught him that?
3: I have to repeat, see 2.
July 12, 2011
Automatically generating tag cloud from text using PHP.
This is a part of a post from the old blog. It was written as an introduction to "Rickrolling on open-wifi'n'stuffz". Should come soon.
I guess it's no big secret that I'm an inherently lazy person. And the content on this blog will have to get organised sooner or later. So, how to organise it with the least amount of effort? Categories? No! Plus, they provide no benefit for SEO. I'll still implement them, as it is a tested method. But I'm still searching for something better.
Tags? Yes, tags! They are the future! They're easy, you can add them to virtually anything you can think of, plus you can use them directly to help search engines crawl your site. Awesome, right?
It's just that they're heavier than categories, as you actually have to think about what the appropriate tags should be. As my laziness duty calls, I started to wonder if there is any way the computer could do this for me. It doesn't have to guess right in 99.999% of cases; just throwing some basic keywords out of a given text would be nice. I can add or remove a few keywords later on...
So, a simple algorithm counting how many times each word occurs would do the trick. Apart from filtering out a few conjunctions, for example. Or adverbs... Basically, words that don't contribute much to the content. This can be easily solved using a dictionary of words to ignore.
Well, let's count some words... This snippet should do the trick...
function gentags($text, $minlen = 2, $threshold = 2, $maxwords = 25)
{
    // First, some cleanup!
    $text = strtolower(strip_tags($text));
    $text = preg_replace(array("/[^a-zA-Z0-9\s]/", "/\s\s+/"), array("", " "), str_replace("\n", " ", $text));
    // Include the "forbidden" words, one per line. Newlines and empty lines
    // are dropped (an empty entry would match every word as a prefix).
    $forbidden = file("ignore.words", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    // Count zee words!
    $wres = array_count_values(str_word_count($text, 1));
    // And now weed out the rats.
    $words = array();
    foreach ($wres as $word => $occurrence)
    {
        // Skip words of $minlen characters or fewer.
        if (strlen($word) <= $minlen) continue;
        // Check against the forbidden words. Actually they should rather be
        // called ignored prefixes. But no matter.
        foreach ($forbidden as $fword)
        {
            if (strlen($fword) > strlen($word)) continue;
            if (substr($word, 0, strlen($fword)) == $fword) continue 2;
        }
        // Keep only words that occur more than $threshold times.
        if ($occurrence > $threshold) $words[$word] = $occurrence;
    }
    arsort($words);
    // Get the first $maxwords words (array_slice also copes with an empty result).
    // Returned array is sorted: key (word) => value (occurrence count).
    return array_slice($words, 0, $maxwords, true);
}
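For comparison, the same counting idea can be sketched with standard shell tools. This is only a rough sketch and not the code behind this blog: the function name count_words and the built-in stop-word list are made up for illustration, it matches whole stop words instead of prefixes, and it keeps words that occur at least a given number of times.

```shell
# Hypothetical helper: prints "count word" pairs, most frequent first.
# $1 = text, $2 = minimum count to keep (default 2),
# $3 = space-separated stop words (the default list is just an example)
count_words() {
    printf '%s\n' "$1" |
        tr '[:upper:]' '[:lower:]' |     # lowercase everything
        tr -cs 'a-z0-9' '\n' |           # one word per line, punctuation dropped
        awk -v stop="${3:-the and but for}" '
            BEGIN { n = split(stop, s, " "); for (i = 1; i <= n; i++) skip[s[i]] = 1 }
            length($0) > 2 && !($0 in skip)   # drop short words and stop words
        ' |
        sort | uniq -c |                 # count occurrences of each word
        awk -v min="${2:-2}" '$1 >= min { print $1, $2 }' |
        sort -k1,1nr -k2,2               # highest count first, then alphabetically
}

count_words "Tags are the future. Tags, tags and more tags for the blog, the blog!"
# prints:
# 4 tags
# 2 blog
```

Piping a whole post through something like this is a quick way to see which words deserve a place in ignore.words.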
OK, so let's see it in practice. For the previous post, the keywords returned are:
email (occurred: 8 times)
facebook (occurred: 6 times)
google (occurred: 5 times)
circle (occurred: 4 times)
update (occurred: 4 times)
send (occurred: 3 times)
make (occurred: 3 times)
post (occurred: 3 times)
allow (occurred: 3 times)
status (occurred: 3 times)
OK, it guessed pretty well. How about the one before that?
code (occurred: 4 times)
blog (occurred: 4 times)
some (occurred: 3 times)
still (occurred: 3 times)
new (occurred: 3 times)
not (occurred: 3 times)
codebase (occurred: 3 times)
complete (occurred: 3 times)
OK, should do the trick. And now that we have done most of the heavy lifting, we can also have a little bit of fun and draw a tag cloud. Why not? :-)
function gen_tag_cloud($tags, $min_size = 10, $max_size = 30)
{
    // Font sizes are in px. The original snippet never defined $min_size and
    // $max_size, so the 10 and 30 defaults here are placeholders.
    if (empty($tags)) return "";
    // Get the highest and the lowest occurrence count.
    $valmax = max($tags);
    $valmin = min($tags);
    // Some mumbo-jumbo, just so that tags don't become too big or too small.
    $spread = $valmax - $valmin;
    if ($spread == 0) $spread = 1; // http://goo.gl/yfeVG
    $step = ($max_size - $min_size) / $spread;
    // Shuffle the words so that the cloud isn't sorted by size.
    $keys = array_keys($tags);
    shuffle($keys);
    $src = "";
    foreach ($keys as $word)
    {
        $fsize = round($min_size + (($tags[$word] - $valmin) * $step));
        $src .= '<span style="font-size: '.$fsize.'px;">'.$word.'</span> ';
    }
    return $src;
}
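To make the scaling in gen_tag_cloud() a bit more concrete, here is the same linear interpolation as a small shell sketch. The function name font_size and the 10px and 30px bounds are assumptions picked for illustration; they are not values from the code above.

```shell
# Hypothetical helper: maps an occurrence count to a font size in px.
# $1 = count, $2 = lowest count, $3 = highest count,
# $4 / $5 = smallest / largest font size (assumed defaults: 10 and 30)
font_size() {
    awk -v v="$1" -v lo="$2" -v hi="$3" -v smin="${4:-10}" -v smax="${5:-30}" 'BEGIN {
        spread = hi - lo
        if (spread == 0) spread = 1                    # avoid division by zero
        step = (smax - smin) / spread                  # px per extra occurrence
        printf "%d\n", smin + (v - lo) * step + 0.5    # +0.5 rounds to nearest
    }'
}

# Counts taken from the first keyword list: 8 (email) down to 3 (status)
for count in 8 5 3; do
    echo "$count -> $(font_size "$count" 3 8)px"
done
# prints:
# 8 -> 30px
# 5 -> 18px
# 3 -> 10px
```

So the most frequent word always gets the largest size, the least frequent the smallest, and everything else lands in between in equal steps.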
The snippets and a demo are available here. It works well only in English. Try some Slavic languages (for example Slovene or Croatian) and it'll fail miserably.
Edit: updated the code to use str_word_count as suggested by Bergi.
July 11, 2011
Posting from Google+ (plus) directly to Facebook wall
So, let's say I simply couldn't resist all the buzz around Google+, so I got myself an invite and checked it out. I'm actually quite surprised (in a positive way) by how they pulled it off; if it takes off, I'm giving FB the boot. I really like the circles stuff (which most of the media buzz has been about anyway), as I have a few people on Facebook who really shouldn't see every post I make. So I usually just don't make such posts.
But I came across an interesting feature that allows me to make status updates from Google+ that are instantly visible on Facebook as well.
Facebook allows users to update their status by sending an e-mail to a user-specific e-mail address. Yup, and you've guessed it, Google+ allows you to add people to circles who aren't actually present on the site. So, Google will just send them an e-mail.
So, here's how to set it up:
- Log in to Facebook, and then go here.
- Scroll down a bit, you should see "Upload via e-mail". There is the e-mail address to which you can send, well, e-mails that update your wall. It's in the form of dialogXXXXXX@m.facebook.com
- Create a new circle in Google+ and add only that e-mail address as a contact. Just paste the address into it.
- Everything you now share with this circle is also posted on Facebook.
Although it's a nice feature, it does have some drawbacks. Currently a status update must be 50 characters or less. Anything more will be truncated by Facebook. Which kinda sucks, as you have even less space than on Twitter, but, hey, it's better than nothing, isn't it? :-)
Oh, almost forgot. Anyone need an invite?
July 9, 2011
Knuplez is back and this blog with him!
Knuplez iz back!
First thing, I'm no longer writing in Slovene. Let's try something different, shall we? I've been spamming the internet for three years in Slovene and it ended rather horribly. An almost-working codebase, completely inconsistent content that looked like 4chan had been on a visit, etc... So I consider myself a bit more grown-up [1] than when I first started this blog as a 15-year-old kid. This of course means I will not make a new post every 5 minutes, as I'm not a good-content factory. :-)
Oh, did you notice it's a completely new design? Yes, it is! Minimalistic, as it should be. Well, at least I always wanted a minimalistic design that didn't look like complete crap. I'll be honest though, it's not entirely mine, I just took the basic setup and attached a shitload of CSS.
I can haz new blag engine too. nBlog2, and as said in my TO-DO, its codebase is reusable. I actually went as far as writing my own tiny URL routing "framework" (a bit similar to the way CodeIgniter handles things, but I don't want to carry the CI baggage with it [2]; the framework itself is only 50 lines of code) and built the blog on top of it. So I got an extensible codebase, plus a lot of code that can be reused in other projects. Surprisingly, the code is even smaller than the original nBlog, yet it offers almost the same features. Who knows, maybe even a CMS will come out of this.
The blog is not complete yet. No categories, for instance. I'll crunch that code together when I get some time, I still have some exams at UNI to complete. It should be just a few tens of lines, but still... And I've gotta buy myself a real domain. :)
Over the course of the next few weeks I'll probably translate some old blog posts into English and reblog them here. With the correct timestamps, that is.
1: If nothing else, I've at least learned that Dire Straits is better than any other rock band I've heard so far. That counts for something, doesn't it?
2: Not that CI is bloated. It's just not minimalistic enough for my taste.
June 16, 2011
Setting up ad-hoc wireless internet sharing on linux
So, here in my dorm room we enjoy a fast 100 Mbit internet connection (per student, not shared). Great, but you can't actually pump that much data through wireless, so they simply didn't build the infrastructure for it.
As I usually have a laptop and a netbook with me here at UNI and use both, I somehow have to connect both to the internet. I'm too lazy to actually carry a switch with me, so I just connected the netbook to the wired connection, created an unprotected ad-hoc wireless network and used my laptop's Wi-Fi to connect through it. Why ad-hoc? It's universal, all chipsets support it.
I won't dig into the technical differences between ad-hoc and infrastructure wireless networks now. For simple uses such as internet sharing, copying some files or a small-scale LAN party it doesn't matter.
Setting up internet connection sharing over an ad-hoc network is actually pretty simple if you're not scared of the terminal. Although I believe it can also be done easily in Ubuntu, I was never much of a fan of NetworkManager [1] and prefer to do it my own way. We need to install dnsmasq first. We can install it in the following ways (for other distributions, take a look at the manual):
Ubuntu:
sudo apt-get install dnsmasq
Arch Linux:
pacman -S dnsmasq
So, now we have dnsmasq. Great, let's set it up. Open up a text editor (as root) and edit the file /etc/dnsmasq.conf. You don't need a lot, just the following lines:
interface=wlan0
dhcp-range=10.13.37.50,10.13.37.150,255.255.255.0,12h
dhcp-option=6,8.8.4.4,8.8.8.8
This will set up the DHCP/DNS server to hand out IP addresses in the range 10.13.37.50-150, with a /24 netmask (255.255.255.0) and a 12-hour lease time. For DNS we'll use Google Public DNS. Now comes setting up the wireless network. First we have to bring the interface down, configure it, and bring it back up. As root, of course.
# Bring the wireless interface down
ifconfig wlan0 down
# Set it as an ad-hoc with SSID of "FooCorp"
# on channel 11 (warning, channel number matters!)
iwconfig wlan0 mode ad-hoc
iwconfig wlan0 essid "FooCorp"
iwconfig wlan0 channel 11
# Bring it back up with a /24 private network address
ifconfig wlan0 10.13.37.1 netmask 255.255.255.0 up
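A side note on the addressing used in the config and commands above, in case the numbers look like magic: /24 and 255.255.255.0 are the same thing written in two ways, and the DHCP range leaves room for about a hundred clients. A minimal sketch of the arithmetic; the helper names prefix_to_netmask and lease_count are made up for this example.

```shell
# Hypothetical helper: converts a prefix length (e.g. 24) to a dotted netmask.
prefix_to_netmask() {
    # Set the top $1 bits of a 32-bit value, clear the rest.
    bits=$(( (0xFFFFFFFF << (32 - $1)) & 0xFFFFFFFF ))
    echo "$(( (bits >> 24) & 255 )).$(( (bits >> 16) & 255 )).$(( (bits >> 8) & 255 )).$(( bits & 255 ))"
}

# Hypothetical helper: counts the addresses in a DHCP range whose ends
# differ only in the last octet, like 10.13.37.50 and 10.13.37.150.
lease_count() {
    first=${1##*.}   # last octet of the first address
    last=${2##*.}    # last octet of the last address
    echo $(( last - first + 1 ))
}

prefix_to_netmask 24                     # prints 255.255.255.0
lease_count 10.13.37.50 10.13.37.150     # prints 101
```

More than enough for a dorm room or a LAN party on a train.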
Wait a few seconds for the interface to come up. After that you can run the iwconfig command and see if it worked. The output should look something like this:
wlan0 IEEE 802.11bgn ESSID:"FooCorp"
Mode:Ad-Hoc Frequency:2.462 GHz Cell: 1E:97:D2:22:FA:92
Tx-Power=14 dBm
Retry long limit:7 RTS thr:off Fragment thr:off
Encryption key:off
Power Management:on
Pay attention to the ESSID and Cell values. The general rule of thumb is that ESSID must show the ESSID you supplied and the first value in Cell must not be 00. If that's not the case, NetworkManager might be getting in the way. If you're using Ubuntu, you're using it. Try to disable it with a simple command executed as root:
stop network-manager
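The ESSID/Cell rule of thumb can also be checked by a script instead of by eyeballing. A rough sketch, assuming the iwconfig output looks like the sample above; the function name check_adhoc and its messages are made up for this example.

```shell
# Hypothetical helper: reads iwconfig-style output on stdin and applies the
# rule of thumb: ESSID must match $1 and Cell must not start with 00.
check_adhoc() {
    awk -v want="$1" '
        # Pull the text between the quotes after ESSID:
        match($0, /ESSID:"[^"]*"/)       { essid = substr($0, RSTART + 7, RLENGTH - 8) }
        # Pull the cell address; "Not-Associated" does not match and leaves it empty
        match($0, /Cell: [0-9A-Fa-f:]+/) { cell = substr($0, RSTART + 6, RLENGTH - 6) }
        END {
            if (essid != want) { print "ESSID mismatch"; exit 1 }
            if (cell == "" || substr(cell, 1, 2) == "00") { print "no cell yet"; exit 1 }
            print "ok " cell
        }'
}

# Fed with the sample output; in real use: iwconfig wlan0 | check_adhoc "FooCorp"
check_adhoc "FooCorp" <<'EOF'
wlan0 IEEE 802.11bgn ESSID:"FooCorp"
Mode:Ad-Hoc Frequency:2.462 GHz Cell: 1E:97:D2:22:FA:92
EOF
# prints: ok 1E:97:D2:22:FA:92
```

The exit status is non-zero when the check fails, so it can be used in a loop that waits until the network is really up.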
Now we must start up dnsmasq and configure iptables to enable network forwarding. Easy stuff.
Ubuntu:
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
echo 1 >/proc/sys/net/ipv4/ip_forward
/etc/init.d/dnsmasq restart
Arch Linux:
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
echo 1 >/proc/sys/net/ipv4/ip_forward
/etc/rc.d/dnsmasq start
That covers the router part. Now we must connect to it. Windows should be able to connect using the standard setup (at least it was tested on 7). On Linux it might also work, depending on the mood of NetworkManager. If it doesn't, there is a way to connect the computer to the network manually.
First we must shut down NetworkManager, as already shown. Then we repeat the same steps we used when configuring the ad-hoc network on our router (same commands; the static address can be left out, since DHCP will assign one). We must verify that it actually connected, so we just fire up iwconfig. If the Cell values match, well, then we're in business. Finally we fire up a DHCP client and get ourselves a shiny new private IP address using the command:
Ubuntu:
dhclient wlan0
Arch Linux:
dhcpcd -K wlan0
Note that these steps can also be used when we want to create a wireless network on the go just for sharing some files or playing games. With some colleagues from UNI we've used this method to play FlatOut 2 [2] during our daily commute to UNI and back.
1: The underlying system that handles network configuration in many distributions. It always leaves a bitter taste. Ironically, I'm using it because KDE has a sexy front-end.
2: Yes, it runs flawlessly under Wine.